From fyang at openjdk.org Mon Apr 1 00:22:31 2024 From: fyang at openjdk.org (Fei Yang) Date: Mon, 1 Apr 2024 00:22:31 GMT Subject: RFR: 8329332: Remove CompiledMethod and CodeBlobLayout classes In-Reply-To: References: Message-ID: On Fri, 29 Mar 2024 19:35:45 GMT, Vladimir Kozlov wrote: > Revert [JDK-8152664](https://bugs.openjdk.org/browse/JDK-8152664) RFE [changes](https://github.com/openjdk/jdk/commit/b853eb7f5ca24eeeda18acbb14287f706499c365) which was used for AOT [JEP 295](https://openjdk.org/jeps/295) implementation in JDK 9. The code was left in HotSpot assuming it will help in a future. But during work on Leyden we decided to not use it. In Leyden cached compiled code will be restored in CodeCache as normal nmethods: no need to change VM's runtime and GC code to process them. > > I may work on optimizing `CodeBlob` and `nmethod` fields layout to reduce header size in separate changes. In these changes I did simple fields reordering to keep small (1 byte) fields together. > > I do not see (and not expected) performance difference with these changes. > > Tested tier1-5, xcomp, stress. Running performance testing. > > I need help with testing on platforms which Oracle does not support. Hi, I also performed some tests (tier1-3 and hotspot:tier4) on linux-riscv64 platform. Result looks good. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18554#issuecomment-2028963874 From sjayagond at openjdk.org Mon Apr 1 07:35:39 2024 From: sjayagond at openjdk.org (Sidraya Jayagond) Date: Mon, 1 Apr 2024 07:35:39 GMT Subject: RFR: 8327652: S390x: Implements SLP support [v7] In-Reply-To: References: <1SGOMkL6TvnkQDt1WkH3FbPVrbCUOD_cA3e23QK5-jg=.b9b066f9-d50a-4710-a8a5-76c2d9b83236@github.com> Message-ID: On Wed, 27 Mar 2024 13:24:39 GMT, Martin Doerr wrote: >>> I think we shouldn't allow `MacroAssembler::string_compress(...)` and `MacroAssembler::string_expand(...)` to use vector registers without specifying this effect. 
That can be solved by adding a KILL effect for all vector registers which are killed. Alternatively, we could revert to the old implementation before [d5adf1d](https://github.com/openjdk/jdk/commit/d5adf1df921e5ecb8ff4c7e4349a12660069ed28) which doesn't use vector registers. The benefit was not huge if I remember correctly. >> >> Agreed. My proposed circumvention is a too dirty hack. >> >> I would prefer to add KILL effects to the match rules. I believe the vector implementation had a substantial performance effect. Unfortunately, I can't find any records of performance results from back then. >> >> Reverting the commit @TheRealMDoerr mentioned is not possible. It contains many additions that may have been used by unrelated code. The vector code is well encapsulated and could be removed by deleting the >> >> if (VM_Version::has_VectorFacility()) { >> } >> >> block.
I would not like that, though. > > I didn't mean to back out the whole commit. Only the implementation of string_compress and string_expand. The benefit of the vector version certainly depends on what kind of strings are used. (Effect may also be negative in some cases.) I think that classical benchmarks didn't show a significant performance impact, but I don't remember exactly, either. I'll leave the s390 maintainers free to decide if they want to adapt the vector version or go for the short and simple implementation. @TheRealMDoerr and @RealLucy Just for my understanding why GPR and FPR doesn't get affected in intrinsic code as they are also allocated outside of register allocator? why only vector registers usage in intrinsic code get affected or am I missing anything here? ------------- PR Comment: https://git.openjdk.org/jdk/pull/18162#issuecomment-2029313646 From shade at openjdk.org Mon Apr 1 07:36:00 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 1 Apr 2024 07:36:00 GMT Subject: RFR: 8329134: Reconsider TLAB zapping [v3] In-Reply-To: References: Message-ID: > We zap the entire TLAB on initial allocation (`MemAllocator::mem_allocate_inside_tlab_slow`), and then also rezap the object contents when object is allocated from the TLAB (`ThreadLocalAllocBuffer::allocate`). The second part seems excessive, given the TLAB is already fully zapped. > > There is also no way to disable this zapping, like you would in other places with the relevant Zap* flags. > > Fixing both these issues allows to improve fastdebug tests performance, e.g. in jcstress. > > It also allows to remove the related Zero kludge. > > Additional testing: > - [x] Linux AArch64 server fastdebug, `all` tests > - [x] MacOS AArch64 Zero fastdebug, `bootcycle-images` pass Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains five additional commits since the last revision: - Merge branch 'master' into JDK-8329134-tlab-zapping - Review comments - Touchups - Also remove Zero kludge - Fix ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18500/files - new: https://git.openjdk.org/jdk/pull/18500/files/72bf1e8a..5bf36d05 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18500&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18500&range=01-02 Stats: 6085 lines in 208 files changed: 2917 ins; 2291 del; 877 mod Patch: https://git.openjdk.org/jdk/pull/18500.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18500/head:pull/18500 PR: https://git.openjdk.org/jdk/pull/18500 From mdoerr at openjdk.org Mon Apr 1 09:04:36 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 1 Apr 2024 09:04:36 GMT Subject: RFR: 8327652: S390x: Implements SLP support [v7] In-Reply-To: References: <1SGOMkL6TvnkQDt1WkH3FbPVrbCUOD_cA3e23QK5-jg=.b9b066f9-d50a-4710-a8a5-76c2d9b83236@github.com> Message-ID: On Wed, 27 Mar 2024 13:24:39 GMT, Martin Doerr wrote: >>> I think we shouldn't allow `MacroAssembler::string_compress(...)` and `MacroAssembler::string_expand(...)` to use vector registers without specifying this effect. That can be solved by adding a KILL effect for all vector registers which are killed. Alternatively, we could revert to the old implementation before [d5adf1d](https://github.com/openjdk/jdk/commit/d5adf1df921e5ecb8ff4c7e4349a12660069ed28) which doesn't use vector registers. The benefit was not huge if I remember correctly. >> >> Agreed. My proposed circumvention is a too dirty hack. >> >> I would prefer to add KILL effects to the match rules. I believe the vector implementation had a substantial performance effect. Unfortunately, I can't find any records of performance results from back then. >> >> Reverting the commit @TheRealMDoerr mentioned is not possible. It contains many additions that may have been used by unrelated code. 
The vector code is well encapsulated and could be removed by deleting the >> >> if (VM_Version::has_VectorFacility()) { >> } >> >> block. I would not like that, though. > > I didn't mean to back out the whole commit. Only the implementation of string_compress and string_expand. The benefit of the vector version certainly depends on what kind of strings are used. (Effect may also be negative in some cases.) I think that classical benchmarks didn't show a significant performance impact, but I don't remember exactly, either. I'll leave the s390 maintainers free to decide if they want to adapt the vector version or go for the short and simple implementation. > @TheRealMDoerr and @RealLucy Just for my understanding why GPR and FPR doesn't get affected in intrinsic code as they are also allocated outside of register allocator?
why only vector registers usage in intrinsic code get affected or am I missing anything here? GPRs and FPRs already have an `effect` specified in the match rules. (If a GPR or FPR is used by a `MachNode` without proper specification, it is a critical bug.) Before this PR, it was legal to use VRs without an `effect` because they were not used by register allocation. This is exploited in `MacroAssembler::string_compress(...)` and `MacroAssembler::string_expand(...)`. With your change, these 2 intrinsics may overwrite live values! Unfortunately, they use many VRs. So, specifying a `KILL` effect for all of them may cause the register allocator to insert a lot of spill code, which may impact performance and code size. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18162#issuecomment-2029440755 From amitkumar at openjdk.org Mon Apr 1 12:07:31 2024 From: amitkumar at openjdk.org (Amit Kumar) Date: Mon, 1 Apr 2024 12:07:31 GMT Subject: RFR: 8329332: Remove CompiledMethod and CodeBlobLayout classes In-Reply-To: References: Message-ID: On Fri, 29 Mar 2024 19:35:45 GMT, Vladimir Kozlov wrote: > Revert [JDK-8152664](https://bugs.openjdk.org/browse/JDK-8152664) RFE [changes](https://github.com/openjdk/jdk/commit/b853eb7f5ca24eeeda18acbb14287f706499c365) which was used for AOT [JEP 295](https://openjdk.org/jeps/295) implementation in JDK 9. The code was left in HotSpot assuming it will help in a future. But during work on Leyden we decided to not use it. In Leyden cached compiled code will be restored in CodeCache as normal nmethods: no need to change VM's runtime and GC code to process them. > > I may work on optimizing `CodeBlob` and `nmethod` fields layout to reduce header size in separate changes. In these changes I did simple fields reordering to keep small (1 byte) fields together. > > I do not see (and not expected) performance difference with these changes. > > Tested tier1-5, xcomp, stress. Running performance testing.
> > I need help with testing on platforms which Oracle does not support. I performed the build + testing `{fastdebug, release, slowdebug} X {tier1}` on `s390x` and result looks fine. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18554#issuecomment-2029655163 From coleenp at openjdk.org Mon Apr 1 12:15:57 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 1 Apr 2024 12:15:57 GMT Subject: RFR: 8313332: Simplify lazy jmethodID cache in InstanceKlass Message-ID: This change simplifies the code that grows the jmethodID cache in InstanceKlass. Instead of lazily, when there's a rare request for a jmethodID for an obsolete method, the jmethodID cache is grown during the RedefineClasses safepoint. The InstanceKlass's jmethodID cache is lazily allocated when there's a jmethodID allocated, so not every InstanceKlass has a cache, but the growth now only happens in a safepoint. This code will become racy with the potential change for deallocating jmethodIDs. Tested with tier1-4, vmTestbase/nsk/jvmti java/lang/instrument tests (in case they're not in tier1-4). ------------- Commit messages: - Add Safepoint assert, don't use order accessors because they're not needed. 
- 8313332: Simplify lazy jmethodID cache in InstanceKlass Changes: https://git.openjdk.org/jdk/pull/18549/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18549&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8313332 Stats: 222 lines in 7 files changed: 41 ins; 144 del; 37 mod Patch: https://git.openjdk.org/jdk/pull/18549.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18549/head:pull/18549 PR: https://git.openjdk.org/jdk/pull/18549 From shade at openjdk.org Mon Apr 1 17:29:45 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 1 Apr 2024 17:29:45 GMT Subject: RFR: 8329134: Reconsider TLAB zapping [v3] In-Reply-To: References: Message-ID: On Mon, 1 Apr 2024 07:36:00 GMT, Aleksey Shipilev wrote: >> We zap the entire TLAB on initial allocation (`MemAllocator::mem_allocate_inside_tlab_slow`), and then also rezap the object contents when object is allocated from the TLAB (`ThreadLocalAllocBuffer::allocate`). The second part seems excessive, given the TLAB is already fully zapped. >> >> There is also no way to disable this zapping, like you would in other places with the relevant Zap* flags. >> >> Fixing both these issues allows to improve fastdebug tests performance, e.g. in jcstress. >> >> It also allows to remove the related Zero kludge. >> >> Additional testing: >> - [x] Linux AArch64 server fastdebug, `all` tests >> - [x] MacOS AArch64 Zero fastdebug, `bootcycle-images` pass > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: > > - Merge branch 'master' into JDK-8329134-tlab-zapping > - Review comments > - Touchups > - Also remove Zero kludge > - Fix Thanks! 
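To make the redundancy described in the quoted summary concrete, here is a minimal toy sketch of the two zapping points. It is not HotSpot code: `ToyTlab`, `make_tlab`, `allocate`, `kZapPattern` and `kAllocFailed` are hypothetical names, the zap pattern is arbitrary, and the `zap_enabled` / `rezap_on_allocate` parameters only stand in for whatever flag the actual patch wires up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for illustration; none of these are HotSpot APIs.
constexpr std::uint32_t kZapPattern = 0xBAADF00Du;  // arbitrary debug fill value
constexpr std::size_t kAllocFailed = static_cast<std::size_t>(-1);

struct ToyTlab {
  std::vector<std::uint32_t> words;  // backing storage of the buffer
  std::size_t top = 0;               // index of the next free word
  std::size_t words_zapped = 0;      // bookkeeping only: total zap writes
};

// First zapping point: fill the whole buffer once when it is created.
ToyTlab make_tlab(std::size_t size_in_words, bool zap_enabled) {
  ToyTlab tlab;
  tlab.words.assign(size_in_words, 0);
  if (zap_enabled) {
    for (auto& w : tlab.words) w = kZapPattern;
    tlab.words_zapped += size_in_words;
  }
  return tlab;
}

// Bump-pointer allocation. With rezap_on_allocate == true the object body is
// filled a second time (the second zapping point), although make_tlab() has
// already written the same pattern there.
// Returns the start index of the object, or kAllocFailed if it does not fit.
std::size_t allocate(ToyTlab& tlab, std::size_t size_in_words,
                     bool rezap_on_allocate) {
  assert(size_in_words > 0);
  if (tlab.words.size() - tlab.top < size_in_words) return kAllocFailed;
  const std::size_t start = tlab.top;
  if (rezap_on_allocate) {
    for (std::size_t i = start; i < start + size_in_words; ++i) {
      tlab.words[i] = kZapPattern;
    }
    tlab.words_zapped += size_in_words;
  }
  tlab.top += size_in_words;
  return start;
}
```

In this model, allocating from a buffer that was zapped at creation leaves the object body holding the pattern whether or not `rezap_on_allocate` is set; the only difference is the extra writes counted in `words_zapped`, which is the overhead the fastdebug runs would be paying.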
------------- PR Comment: https://git.openjdk.org/jdk/pull/18500#issuecomment-2030202957 From shade at openjdk.org Mon Apr 1 17:29:45 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 1 Apr 2024 17:29:45 GMT Subject: Integrated: 8329134: Reconsider TLAB zapping In-Reply-To: References: Message-ID: On Tue, 26 Mar 2024 21:08:16 GMT, Aleksey Shipilev wrote: > We zap the entire TLAB on initial allocation (`MemAllocator::mem_allocate_inside_tlab_slow`), and then also rezap the object contents when object is allocated from the TLAB (`ThreadLocalAllocBuffer::allocate`). The second part seems excessive, given the TLAB is already fully zapped. > > There is also no way to disable this zapping, like you would in other places with the relevant Zap* flags. > > Fixing both these issues allows to improve fastdebug tests performance, e.g. in jcstress. > > It also allows to remove the related Zero kludge. > > Additional testing: > - [x] Linux AArch64 server fastdebug, `all` tests > - [x] MacOS AArch64 Zero fastdebug, `bootcycle-images` pass This pull request has now been integrated. Changeset: 5698f7ad Author: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/5698f7ad29c939b7e52882ace575dd7113bf41de Stats: 30 lines in 5 files changed: 5 ins; 16 del; 9 mod 8329134: Reconsider TLAB zapping Reviewed-by: stefank, rkennke ------------- PR: https://git.openjdk.org/jdk/pull/18500 From dlong at openjdk.org Mon Apr 1 18:18:00 2024 From: dlong at openjdk.org (Dean Long) Date: Mon, 1 Apr 2024 18:18:00 GMT Subject: RFR: 8329332: Remove CompiledMethod and CodeBlobLayout classes In-Reply-To: References: Message-ID: On Fri, 29 Mar 2024 19:35:45 GMT, Vladimir Kozlov wrote: > Revert [JDK-8152664](https://bugs.openjdk.org/browse/JDK-8152664) RFE [changes](https://github.com/openjdk/jdk/commit/b853eb7f5ca24eeeda18acbb14287f706499c365) which was used for AOT [JEP 295](https://openjdk.org/jeps/295) implementation in JDK 9. 
The code was left in HotSpot assuming it will help in a future. But during work on Leyden we decided to not use it. In Leyden cached compiled code will be restored in CodeCache as normal nmethods: no need to change VM's runtime and GC code to process them. > > I may work on optimizing `CodeBlob` and `nmethod` fields layout to reduce header size in separate changes. In these changes I did simple fields reordering to keep small (1 byte) fields together. > > I do not see (and not expected) performance difference with these changes. > > Tested tier1-5, xcomp, stress. Running performance testing. > > I need help with testing on platforms which Oracle does not support. The `not_used` state was introduced for AOT. It can go away now. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18554#issuecomment-2030282409 From simonis at openjdk.org Mon Apr 1 19:26:17 2024 From: simonis at openjdk.org (Volker Simonis) Date: Mon, 1 Apr 2024 19:26:17 GMT Subject: RFR: 8329421: Native methods can not be selectively printed Message-ID: Native methods (i.e. "native wrappers") can not be selectively printed with `-XX:CompileCommand=print,class::method`. Currently the only way to print native methods is to use the global `-XX:+PrintAssembly` option. But this prints *all* compiled methods which can be too much if we're just interested in a specific native wrapper. There's no reason to not apply `-XX:CompileCommand` options correctly to native methods as well. 
------------- Commit messages: - 8329421: Native methods can not be selectively printed Changes: https://git.openjdk.org/jdk/pull/18567/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18567&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8329421 Stats: 101 lines in 2 files changed: 78 ins; 13 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/18567.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18567/head:pull/18567 PR: https://git.openjdk.org/jdk/pull/18567 From kvn at openjdk.org Mon Apr 1 19:38:59 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 1 Apr 2024 19:38:59 GMT Subject: RFR: 8329332: Remove CompiledMethod and CodeBlobLayout classes In-Reply-To: References: Message-ID: On Mon, 1 Apr 2024 00:19:32 GMT, Fei Yang wrote: >> Revert [JDK-8152664](https://bugs.openjdk.org/browse/JDK-8152664) RFE [changes](https://github.com/openjdk/jdk/commit/b853eb7f5ca24eeeda18acbb14287f706499c365) which was used for AOT [JEP 295](https://openjdk.org/jeps/295) implementation in JDK 9. The code was left in HotSpot assuming it will help in a future. But during work on Leyden we decided to not use it. In Leyden cached compiled code will be restored in CodeCache as normal nmethods: no need to change VM's runtime and GC code to process them. >> >> I may work on optimizing `CodeBlob` and `nmethod` fields layout to reduce header size in separate changes. In these changes I did simple fields reordering to keep small (1 byte) fields together. >> >> I do not see (and not expected) performance difference with these changes. >> >> Tested tier1-5, xcomp, stress. Running performance testing. >> >> I need help with testing on platforms which Oracle does not support. > > Hi, I also performed some tests (tier1-3 and hotspot:tier4) on linux-riscv64 platform. Result looks good. @RealFYang and @offamitkumar thank you for testing. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18554#issuecomment-2030425253 From kvn at openjdk.org Mon Apr 1 19:52:59 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 1 Apr 2024 19:52:59 GMT Subject: RFR: 8329332: Remove CompiledMethod and CodeBlobLayout classes In-Reply-To: References: Message-ID: On Mon, 1 Apr 2024 18:15:43 GMT, Dean Long wrote: > The `not_used` state was introduced for AOT. It can go away now. Good catch, Dean. I want to keep `nmethod::make_not_used()` method because we use it in Leyden to keep AOT code (outside of CodeCache): [nmethod.hpp#L476](https://github.com/openjdk/leyden/blob/premain/src/hotspot/share/code/nmethod.hpp#L476) It does not use this flag value. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18554#issuecomment-2030448462 From kvn at openjdk.org Mon Apr 1 20:48:07 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 1 Apr 2024 20:48:07 GMT Subject: RFR: 8329421: Native methods can not be selectively printed In-Reply-To: References: Message-ID: <-9Z5z2g9zmDgf87_aasOqwdXMxGQG6_QOVA1e0P03Kc=.5b90de09-a6c4-4382-937f-9b8de96d0673@github.com> On Mon, 1 Apr 2024 19:20:53 GMT, Volker Simonis wrote: > Currently the only way to print native methods is to use the global -XX:+PrintAssembly option You did not list `-XX:+PrintNativeNMethods` which I assume will print native wrappers. All of them. So your changes are valid. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18567#issuecomment-2030525460 From kvn at openjdk.org Mon Apr 1 20:51:00 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 1 Apr 2024 20:51:00 GMT Subject: RFR: 8329421: Native methods can not be selectively printed In-Reply-To: References: Message-ID: On Mon, 1 Apr 2024 19:20:53 GMT, Volker Simonis wrote: > Native methods (i.e. "native wrappers") can not be selectively printed with `-XX:CompileCommand=print,class::method`. 
Currently the only way to print native methods is to use the global `-XX:+PrintAssembly` option. But this prints *all* compiled methods which can be too much if we're just interested in a specific native wrapper. There's no reason to not apply `-XX:CompileCommand` options correctly to native methods as well. src/hotspot/share/runtime/sharedRuntime.cpp line 2790: > 2788: } > 2789: > 2790: DirectiveSet* directive = DirectivesStack::getMatchingDirective(method, CompileBroker::compiler(CompLevel_simple)); Will it work for when `-XX:+PrintAssembly` used and not `-XX:CompileCommand=print,`? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18567#discussion_r1546829699 From kvn at openjdk.org Mon Apr 1 21:07:31 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 1 Apr 2024 21:07:31 GMT Subject: RFR: 8329332: Remove CompiledMethod and CodeBlobLayout classes [v2] In-Reply-To: References: Message-ID: > Revert [JDK-8152664](https://bugs.openjdk.org/browse/JDK-8152664) RFE [changes](https://github.com/openjdk/jdk/commit/b853eb7f5ca24eeeda18acbb14287f706499c365) which was used for AOT [JEP 295](https://openjdk.org/jeps/295) implementation in JDK 9. The code was left in HotSpot assuming it will help in a future. But during work on Leyden we decided to not use it. In Leyden cached compiled code will be restored in CodeCache as normal nmethods: no need to change VM's runtime and GC code to process them. > > I may work on optimizing `CodeBlob` and `nmethod` fields layout to reduce header size in separate changes. In these changes I did simple fields reordering to keep small (1 byte) fields together. > > I do not see (and not expected) performance difference with these changes. > > Tested tier1-5, xcomp, stress. Running performance testing. > > I need help with testing on platforms which Oracle does not support. 
Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: Removed not_used state of nmethod ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18554/files - new: https://git.openjdk.org/jdk/pull/18554/files/7635b333..246ff68a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18554&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18554&range=00-01 Stats: 5 lines in 2 files changed: 0 ins; 3 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/18554.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18554/head:pull/18554 PR: https://git.openjdk.org/jdk/pull/18554 From kvn at openjdk.org Mon Apr 1 21:12:00 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 1 Apr 2024 21:12:00 GMT Subject: RFR: 8329332: Remove CompiledMethod and CodeBlobLayout classes [v2] In-Reply-To: References: Message-ID: On Mon, 1 Apr 2024 21:07:31 GMT, Vladimir Kozlov wrote: >> Revert [JDK-8152664](https://bugs.openjdk.org/browse/JDK-8152664) RFE [changes](https://github.com/openjdk/jdk/commit/b853eb7f5ca24eeeda18acbb14287f706499c365) which was used for AOT [JEP 295](https://openjdk.org/jeps/295) implementation in JDK 9. The code was left in HotSpot assuming it will help in a future. But during work on Leyden we decided to not use it. In Leyden cached compiled code will be restored in CodeCache as normal nmethods: no need to change VM's runtime and GC code to process them. >> >> I may work on optimizing `CodeBlob` and `nmethod` fields layout to reduce header size in separate changes. In these changes I did simple fields reordering to keep small (1 byte) fields together. >> >> I do not see (and not expected) performance difference with these changes. >> >> Tested tier1-5, xcomp, stress. Running performance testing. >> >> I need help with testing on platforms which Oracle does not support. 
> > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Removed not_used state of nmethod I did not change the `src/hotspot/share/code/codeHeapState.cpp` code which counts nmethods with `not_used` state by checking `!nm->is_not_entrant()` after `nm->is_in_use()`. Removing `not_used` does not affect it. The code is complicated and needs a separate RFE if we decide to clean it up. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18554#issuecomment-2030560338 From sgibbons at openjdk.org Mon Apr 1 21:30:19 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Mon, 1 Apr 2024 21:30:19 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v2] In-Reply-To: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: > This code makes an intrinsic stub for `Unsafe::setMemory`. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. > > Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. > > Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`).
> > [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Use non-sse fill (old left in) ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18555/files - new: https://git.openjdk.org/jdk/pull/18555/files/401a2a96..c5cb30cc Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=00-01 Stats: 795 lines in 2 files changed: 765 ins; 12 del; 18 mod Patch: https://git.openjdk.org/jdk/pull/18555.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18555/head:pull/18555 PR: https://git.openjdk.org/jdk/pull/18555 From amenkov at openjdk.org Mon Apr 1 23:30:59 2024 From: amenkov at openjdk.org (Alex Menkov) Date: Mon, 1 Apr 2024 23:30:59 GMT Subject: RFR: 8313332: Simplify lazy jmethodID cache in InstanceKlass In-Reply-To: References: Message-ID: On Fri, 29 Mar 2024 15:25:48 GMT, Coleen Phillimore wrote: > This change simplifies the code that grows the jmethodID cache in InstanceKlass. Instead of lazily, when there's a rare request for a jmethodID for an obsolete method, the jmethodID cache is grown during the RedefineClasses safepoint. The InstanceKlass's jmethodID cache is lazily allocated when there's a jmethodID allocated, so not every InstanceKlass has a cache, but the growth now only happens in a safepoint. This code will become racy with the potential change for deallocating jmethodIDs. > > Tested with tier1-4, vmTestbase/nsk/jvmti java/lang/instrument tests (in case they're not in tier1-4). Looks like good simplification ------------- Marked as reviewed by amenkov (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18549#pullrequestreview-1972336247 From sspitsyn at openjdk.org Tue Apr 2 00:29:18 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 2 Apr 2024 00:29:18 GMT Subject: RFR: 8329432: PopFrame and ForceEarlyReturn functions should use JvmtiHandshake Message-ID: <5tcPHZX0nNTHbQqZfHRl2riTpJglQyGJ2hRJXyIMZPY=.4de7ac6d-dd84-4943-bab1-5dba67bf5cf0@github.com> The internal JVM TI `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. This enhancement is to refactor JVM TI functions `PopFrame` and `ForceEarlyReturn` on the base of `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes. Testing: Ran mach5 tiers 1-6 ------------- Commit messages: - 8329432: PopFrame and ForceEarlyReturn functions should use JvmtiHandshake Changes: https://git.openjdk.org/jdk/pull/18570/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18570&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8329432 Stats: 41 lines in 3 files changed: 13 ins; 20 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/18570.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18570/head:pull/18570 PR: https://git.openjdk.org/jdk/pull/18570 From dholmes at openjdk.org Tue Apr 2 02:18:58 2024 From: dholmes at openjdk.org (David Holmes) Date: Tue, 2 Apr 2024 02:18:58 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v2] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: On Mon, 1 Apr 2024 21:30:19 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory`. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. >> >> Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). 
I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). >> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Use non-sse fill (old left in) This looks like it is still a Draft/work-in-progress. There is only code for x64 and it doesn't appear it will build on other platforms. Also there are still a bunch of `if 0` in the code that should not be there. ------------- PR Review: https://git.openjdk.org/jdk/pull/18555#pullrequestreview-1972492070 From duke at openjdk.org Tue Apr 2 02:33:58 2024 From: duke at openjdk.org (kuaiwei) Date: Tue, 2 Apr 2024 02:33:58 GMT Subject: RFR: 8325821: [REDO] use "dmb.ishst+dmb.ishld" for release barrier In-Reply-To: References: Message-ID: On Mon, 25 Mar 2024 06:54:01 GMT, kuaiwei wrote: > The origin patch for https://bugs.openjdk.org/browse/JDK-8324186 has 2 issues: > 1 It show regression in some platform, like Apple silicon in mac os > 2 Can not handle instruction sequence like "dmb.ishld; dmb.ishst; dmb.ishld; dmb.ishld" > > It can be fixed by: > 1 Enable AlwaysMergeDMB by default, only disable it in architecture we can see performance improvement (N1 or N2) > 2 Check the special pattern and merge the subsequent dmb. > > It also fix a bug when code buffer is expanding, st/ld/dmb can not be merged. I added unit tests for these. > > This patch still has a unhandled case. Insts like "dmb.ishld; dmb.ishst; dmb.ish", it will merge the last 2 instructions and can not merge all three. Because when emitting dmb.ish, if merge all previous dmbs, the code buffer will shrink the size. I think it may break some resumption and think it's not a common pattern. update: I added finite state machine for merging instruction. The patch can be found in https://github.com/openjdk/jdk/commit/1b18e8298b1ef8778b494fb7ed9e4467e0a9a6b8 . 
Because instructions are pending, I need to modify Assembler::pc() and offset() to count the pending instruction size. This may affect relocation. I have fixed some failures, but some test failures remain to be figured out. I'm working on them. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18467#issuecomment-2030964309 From epeter at openjdk.org Tue Apr 2 06:43:59 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 2 Apr 2024 06:43:59 GMT Subject: RFR: 8328997: Remove unnecessary template parameter lists in GrowableArray In-Reply-To: References: Message-ID: On Mon, 25 Mar 2024 23:55:43 GMT, Kim Barrett wrote: > Please review this change to the GrowableArray code to remove unnecessary > template parameter lists. They aren't needed, and some may become > syntactically invalid in the future. > > Testing: mach5 tier1. Looks reasonable :) ------------- Marked as reviewed by epeter (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18480#pullrequestreview-1972744465 From kbarrett at openjdk.org Tue Apr 2 07:00:04 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 2 Apr 2024 07:00:04 GMT Subject: RFR: 8328997: Remove unnecessary template parameter lists in GrowableArray In-Reply-To: References: Message-ID: On Tue, 26 Mar 2024 11:38:30 GMT, Ivan Walulya wrote: >> Please review this change to the GrowableArray code to remove unnecessary >> template parameter lists. They aren't needed, and some may become >> syntactically invalid in the future. >> >> Testing: mach5 tier1. > > Marked as reviewed by iwalulya (Reviewer). Thanks for reviews @walulyai and @eme64 .
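A minimal sketch of what an "unnecessary template parameter list" looks like, using a hypothetical `SimpleArray` rather than HotSpot's `GrowableArray`. Inside a class template the bare class name already refers to the current specialization, so repeating `<E>` adds nothing. One plausible reading of "may become syntactically invalid" (an assumption, the thread does not say) is that newer C++ standards reject a template-id as the name of a constructor or destructor declaration.

```cpp
#include <cstddef>

template <typename E>
class SimpleArray {   // hypothetical stand-in, not HotSpot's GrowableArray
  E* _data;
  std::size_t _len;

 public:
  // Redundant spelling of the kind being removed, shown only as comments:
  //   SimpleArray<E>(std::size_t n);
  //   ~SimpleArray<E>();
  //   SimpleArray<E>& operator=(const SimpleArray<E>&);

  // Preferred spelling: the injected class name, without <E>.
  explicit SimpleArray(std::size_t n) : _data(new E[n]()), _len(n) {}
  ~SimpleArray() { delete[] _data; }

  // Copying is disabled to keep ownership of _data unambiguous.
  SimpleArray(const SimpleArray&) = delete;
  SimpleArray& operator=(const SimpleArray&) = delete;

  std::size_t length() const { return _len; }
  E& at(std::size_t i) { return _data[i]; }
};
```

Both spellings name the same type inside the class body, which is why a cleanup like this changes no behavior.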
------------- PR Comment: https://git.openjdk.org/jdk/pull/18480#issuecomment-2031212853 From kbarrett at openjdk.org Tue Apr 2 07:00:05 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 2 Apr 2024 07:00:05 GMT Subject: Integrated: 8328997: Remove unnecessary template parameter lists in GrowableArray In-Reply-To: References: Message-ID: On Mon, 25 Mar 2024 23:55:43 GMT, Kim Barrett wrote: > Please review this change to the GrowableArray code to remove unnecessary > template parameter lists. They aren't needed, and some may become > syntactically invalid in the future. > > Testing: mach5 tier1. This pull request has now been integrated. Changeset: 3d228380 Author: Kim Barrett URL: https://git.openjdk.org/jdk/commit/3d2283800acee58dbf046c8b401a5a144ab65ed1 Stats: 16 lines in 1 file changed: 0 ins; 0 del; 16 mod 8328997: Remove unnecessary template parameter lists in GrowableArray Reviewed-by: iwalulya, epeter ------------- PR: https://git.openjdk.org/jdk/pull/18480 From simonis at openjdk.org Tue Apr 2 07:23:25 2024 From: simonis at openjdk.org (Volker Simonis) Date: Tue, 2 Apr 2024 07:23:25 GMT Subject: RFR: 8329421: Native methods can not be selectively printed [v2] In-Reply-To: References: Message-ID: > Native methods (i.e. "native wrappers") can not be selectively printed with `-XX:CompileCommand=print,class::method`. Currently the only way to print native methods is to use the global `-XX:+PrintAssembly` option. But this prints *all* compiled methods which can be too much if we're just interested in a specific native wrapper. There's no reason to not apply `-XX:CompileCommand` options correctly to native methods as well. 
Volker Simonis has updated the pull request incrementally with one additional commit since the last revision: Add test for -XX:+PrintNativeNMethods ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18567/files - new: https://git.openjdk.org/jdk/pull/18567/files/49ae8d51..0f9f0ac6 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18567&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18567&range=00-01 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18567.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18567/head:pull/18567 PR: https://git.openjdk.org/jdk/pull/18567 From simonis at openjdk.org Tue Apr 2 07:23:25 2024 From: simonis at openjdk.org (Volker Simonis) Date: Tue, 2 Apr 2024 07:23:25 GMT Subject: RFR: 8329421: Native methods can not be selectively printed [v2] In-Reply-To: References: Message-ID: <3aNkveDwk2zHGfba8iCKMdQUds0VjZbbz_hNUwWGx3s=.5189f846-25fc-40d0-b725-0bdda8e7cedf@github.com> On Mon, 1 Apr 2024 20:48:10 GMT, Vladimir Kozlov wrote: >> Volker Simonis has updated the pull request incrementally with one additional commit since the last revision: >> >> Add test for -XX:+PrintNativeNMethods > > src/hotspot/share/runtime/sharedRuntime.cpp line 2790: > >> 2788: } >> 2789: >> 2790: DirectiveSet* directive = DirectivesStack::getMatchingDirective(method, CompileBroker::compiler(CompLevel_simple)); > > Will it work for when `-XX:+PrintAssembly` used and not `-XX:CompileCommand=print,`? Yes, it does, because `-XX:+PrintAssembly` sets `PrintAssembly` to true in the default compiler directives so `DirectivesStack::getMatchingDirective()` will return true for every method. 
There's also a test for this case in `NativeCalls.java`: new Variant(List.of("-XX:-TieredCompilation", "-XX:+UnlockDiagnosticVMOptions", "-XX:+PrintAssembly"), "true", "true"), I've now also added another test for `-XX:+PrintNativeNMethods` which as you've correctly observed, also prints native methods, but all of them: new Variant(List.of("-XX:-TieredCompilation", "-XX:+UnlockDiagnosticVMOptions", "-XX:+PrintNativeNMethods"), "true", "true"), ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18567#discussion_r1547279619 From jzhu at openjdk.org Tue Apr 2 08:03:00 2024 From: jzhu at openjdk.org (Joshua Zhu) Date: Tue, 2 Apr 2024 08:03:00 GMT Subject: RFR: 8326541: [AArch64] ZGC C2 load barrier stub considers the length of live registers when spilling registers In-Reply-To: References: Message-ID: On Tue, 5 Mar 2024 16:52:02 GMT, Stuart Monteith wrote: >> Currently ZGC C2 load barrier stub saves the whole live register regardless of what size of register is live on aarch64. >> Considering the size of SVE register is an implementation-defined multiple of 128 bits, up to 2048 bits, >> even the use of a floating point may cause the maximum 2048 bits stack occupied. >> Hence I would like to introduce this change on aarch64: take the length of live registers into consideration in ZGC C2 load barrier stub. >> >> In a floating point case on 2048 bits SVE machine, the following ZLoadBarrierStubC2 >> >> >> ...... 
>> 0x0000ffff684cfad8: stp x15, x18, [sp, #80] >> 0x0000ffff684cfadc: sub sp, sp, #0x100 >> 0x0000ffff684cfae0: str z16, [sp] >> 0x0000ffff684cfae4: add x1, x13, #0x10 >> 0x0000ffff684cfae8: mov x0, x16 >> ;; 0xFFFF803F5414 >> 0x0000ffff684cfaec: mov x8, #0x5414 // #21524 >> 0x0000ffff684cfaf0: movk x8, #0x803f, lsl #16 >> 0x0000ffff684cfaf4: movk x8, #0xffff, lsl #32 >> 0x0000ffff684cfaf8: blr x8 >> 0x0000ffff684cfafc: mov x16, x0 >> 0x0000ffff684cfb00: ldr z16, [sp] >> 0x0000ffff684cfb04: add sp, sp, #0x100 >> 0x0000ffff684cfb08: ptrue p7.b >> 0x0000ffff684cfb0c: ldp x4, x5, [sp, #16] >> ...... >> >> >> could be optimized into: >> >> >> ...... >> 0x0000ffff684cfa50: stp x15, x18, [sp, #80] >> 0x0000ffff684cfa54: str d16, [sp, #-16]! // extra 8 bytes to align 16 bytes in push_fp() >> 0x0000ffff684cfa58: add x1, x13, #0x10 >> 0x0000ffff684cfa5c: mov x0, x16 >> ;; 0xFFFF7FA942A8 >> 0x0000ffff684cfa60: mov x8, #0x42a8 // #17064 >> 0x0000ffff684cfa64: movk x8, #0x7fa9, lsl #16 >> 0x0000ffff684cfa68: movk x8, #0xffff, lsl #32 >> 0x0000ffff684cfa6c: blr x8 >> 0x0000ffff684cfa70: mov x16, x0 >> 0x0000ffff684cfa74: ldr d16, [sp], #16 >> 0x0000ffff684cfa78: ptrue p7.b >> 0x0000ffff684cfa7c: ldp x4, x5, [sp, #16] >> ...... >> >> >> Besides the above benefit, when we know what size of register is live, >> we could remove the unnecessary caller save in ZGC C2 load barrier stub when we meet C-ABI SOE fp registers. >> >> Passed jtreg with option "-XX:+UseZGC -XX:+ZGenerational" with no failures introduced. > > Thanks, that helps - I can see you're saving/restoring the correct register lengths. Would it be possible to generate a testcase to test that registers are being saved/restored correctly? 
> > The following is a testcase that is an example of where this testing is done, although in this PR's case it isn't subroutines, but load/store barriers: > > https://github.com/openjdk/jdk/commit/4cd318756d4a8de64d25fb6512ecba9a008edfa1#diff-949a4a2f889be36be47e9b02b6d6cd1247768953b95a024f649878bac721fa04 @stooart-mon Thanks for your review. Please let me know if you have any other comments. @fisk I would appreciate it if you could share your comments on this change since it follows your previous work done for x86. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17977#issuecomment-2031330044 From dnsimon at openjdk.org Tue Apr 2 08:12:02 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Tue, 2 Apr 2024 08:12:02 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v2] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: <4ECS4yQ0YXQVSt352CQhkQ4dax4VBYv6ZXzK9eBIio0=.baabfef1-2b9e-4b2f-ba0f-e358f6a83af1@github.com> On Mon, 1 Apr 2024 21:30:19 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory`. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. >> >> Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). >> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Use non-sse fill (old left in) Wouldn't it be better to do this intrinsification directly in the JIT without calling out to a stub? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18555#issuecomment-2031348841 From sspitsyn at openjdk.org Tue Apr 2 08:19:03 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 2 Apr 2024 08:19:03 GMT Subject: RFR: 8329491: GetThreadListStackTraces function should use JvmtiHandshake Message-ID: <56L6f8XFyrB_cUSPTLWNIVhO0PU4w3PjRnpA5U7y_aI=.906bf099-af40-4192-a205-f84120e99ec8@github.com> The internal JVM TI `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. This enhancement is to refactor the JVM TI function `GetThreadListStackTraces` on the base of `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes. Testing: - Ran mach5 tiers 1-6 ------------- Commit messages: - 8329491: GetThreadListStackTraces function should use JvmtiHandshake Changes: https://git.openjdk.org/jdk/pull/18574/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18574&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8329491 Stats: 42 lines in 3 files changed: 16 ins; 19 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/18574.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18574/head:pull/18574 PR: https://git.openjdk.org/jdk/pull/18574 From jkern at openjdk.org Tue Apr 2 09:02:01 2024 From: jkern at openjdk.org (Joachim Kern) Date: Tue, 2 Apr 2024 09:02:01 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc In-Reply-To: References: Message-ID: On Fri, 29 Mar 2024 05:23:57 GMT, Julian Waters wrote: > > The rest of the changes are needed because of using utilities/compilerWarnings_xlc.hpp the compiler is much more nagging about ill formatted printf > > Did you mean compilerWarnings_gcc.hpp? Yes, you're right. I fixed it. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18536#issuecomment-2031447588 From jkern at openjdk.org Tue Apr 2 09:16:59 2024 From: jkern at openjdk.org (Joachim Kern) Date: Tue, 2 Apr 2024 09:16:59 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc In-Reply-To: References: Message-ID: On Fri, 29 Mar 2024 07:59:05 GMT, Thomas Stuefe wrote: >> While looking at this, I noticed that my question in https://github.com/openjdk/jdk/pull/14146#discussion_r1207078176 and followups had never been answered. Do you know the answers now? >> >> Quoting myself: >> >>> So, we do this only for malloc? Not for calloc, posix_memalign, realloc etc? What about free? >>> Removing that define and hard-coding it here assumes ... pointers it returns work with the unchanged free() and realloc() the system provides, and will always do so. >>> I am basically worried that undefining malloc, even if it seems harmless now, exposes us to difficult-to-investigate problems down the road, since it depends on how the libc devs will reform those macros in the future. > > Other than that, and kind of depending on your answer: How important is it that we catch every use of the original malloc? Can be safely mix the original malloc with vec_malloc if logging is not involved? > > I am asking, because from that it depends whether this hunk needs to appear right behind `#include ` or whether we can move it into the middle of the file together with the other AIX stuff. > > Because, if we move it into the middle of the file, we may miss any uses of malloc that may happen in system headers (would be unusual for that to happen but with IBM one never knows). Hi Thomas, I would like to get totally rid of this, because as I mentioned IBM already modified the `stdlib.h` header not using `#define malloc vec_malloc` any more (and all the other vec_... defines). 
We have to ask the Adoptium colleagues at IBM whether they have already raised their build environment by the 2 SP levels needed. In principle we would have had to apply the same workaround to `calloc, free,...` too, but they didn't show up as errors in the logging files. These lines were never meant to stay for long. They were only there so that we could compile until IBM fixed the issue, which has now happened. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1547465986 From duke at openjdk.org Tue Apr 2 09:18:00 2024 From: duke at openjdk.org (ExE Boss) Date: Tue, 2 Apr 2024 09:18:00 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v2] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: On Mon, 1 Apr 2024 21:30:19 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory`. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. >> >> Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). >> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Use non-sse fill (old left in) src/hotspot/share/opto/library_call.hpp line 235: > 233: bool inline_unsafe_copyMemory(); > 234: > 235: bool inline_unsafe_setMemory(); Maybe remove the empty line between these `inline_unsafe_*Memory` methods?
Suggestion: bool inline_unsafe_copyMemory(); bool inline_unsafe_setMemory(); src/hotspot/share/prims/unsafe.cpp line 391: > 389: size_t sz = (size_t)size; > 390: > 391: Suggestion: ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1546092398 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1546093297 From jkern at openjdk.org Tue Apr 2 09:21:59 2024 From: jkern at openjdk.org (Joachim Kern) Date: Tue, 2 Apr 2024 09:21:59 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc In-Reply-To: References: Message-ID: On Tue, 2 Apr 2024 09:14:10 GMT, Joachim Kern wrote: >> Other than that, and kind of depending on your answer: How important is it that we catch every use of the original malloc? Can be safely mix the original malloc with vec_malloc if logging is not involved? >> >> I am asking, because from that it depends whether this hunk needs to appear right behind `#include ` or whether we can move it into the middle of the file together with the other AIX stuff. >> >> Because, if we move it into the middle of the file, we may miss any uses of malloc that may happen in system headers (would be unusual for that to happen but with IBM one never knows). > > Hi Thomas, > I would like to get totally rid of this, because as I mentioned IBM already modified the `stdlib.h` header not using `#define malloc vec_malloc` any more (and all the other vec_... defines). We have to ask the adoptium colleagues at IBM if they already have raised their build environment by the 2 SP levels needed. > In principle we had to do the same workaround for `calloc, free,...` too, but they didn't show up as errors in the logging files. > These lines where never meant to stay for long. Just to be able to compile until IBM fixes the issue, which is done now. @suchismith1993 Hi Suchi, can you please tell me when you will raise your build environment from AIX 7.2 TL5 SP5 to SP7? 
I am asking because I want to get rid of this nasty workaround. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1547473723 From jkern at openjdk.org Tue Apr 2 10:28:59 2024 From: jkern at openjdk.org (Joachim Kern) Date: Tue, 2 Apr 2024 10:28:59 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc In-Reply-To: References: Message-ID: On Fri, 29 Mar 2024 07:21:43 GMT, Thomas Stuefe wrote: >> As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. Thus the old xlc toolchain was removed by [JDK-8327701](https://bugs.openjdk.org/browse/JDK-8327701). >> Now we also switch the HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc, removing the last xlc rudiment. >> This means merging the AIX specific content of utilities/globalDefinitions_xlc.hpp and utilities/compilerWarnings_xlc.hpp into the corresponding gcc files on the on side and removing the defined(TARGET_COMPILER_xlc) blocks in the code, because the defined(TARGET_COMPILER_gcc) blocks work out of the box for the new AIX compiler. >> The rest of the changes are needed because of using utilities/compilerWarnings_gcc.hpp the compiler is much more nagging about ill formatted printf > > src/hotspot/os/aix/os_aix.cpp line 314: > >> 312: ErrnoPreserver ep; >> 313: log_trace(os, map)("disclaim failed: " RANGEFMT " errno=(%s)", >> 314: RANGEFMTARGS(p, (long)maxDisclaimSize), > > Wait, why are these casts needed? maxDisclaimSize is size_t, RANGEFMT uses SIZE_FORMAT. That should work without cast.
Hi Thomas, `maxDisclaimSize` is of type `unsigned int`; therefore I get the following warning: os/aix/os_aix.cpp:314:42: error: format specifies type 'unsigned long' but the argument has type 'unsigned int' [-Werror,-Wformat] RANGEFMTARGS(p, maxDisclaimSize), ^~~~~~~~~~~~~~~ Should I keep the casts, or change the type of `maxDisclaimSize, numFullDisclaimsNeeded, lastDisclaimSize` to `const unsigned long`? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1547578012 From jkern at openjdk.org Tue Apr 2 10:49:01 2024 From: jkern at openjdk.org (Joachim Kern) Date: Tue, 2 Apr 2024 10:49:01 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc In-Reply-To: <8yq0NeIit-6Q3aB1IF3MqNZnT4B4hWd7LnKYlsX6NPk=.38fd4342-e553-4a12-8e69-ca95b3cccb09@github.com> References: <8yq0NeIit-6Q3aB1IF3MqNZnT4B4hWd7LnKYlsX6NPk=.38fd4342-e553-4a12-8e69-ca95b3cccb09@github.com> Message-ID: On Fri, 29 Mar 2024 07:19:33 GMT, Thomas Stuefe wrote: >> src/hotspot/os/aix/loadlib_aix.cpp line 120: >> >>> 118: (lm->is_in_vm ? '*' : ' '), >>> 119: (uintptr_t)lm->text, (uintptr_t)lm->text + lm->text_len, >>> 120: (uintptr_t)lm->data, (uintptr_t)lm->data + lm->data_len, >> >> Please don't cast, use `p2i()`. > > Check copyrights in this file and all others. Adapt SAP and Oracle copyrights. Done + will adopt copyrights >> src/hotspot/os/aix/os_aix.cpp line 651: >> >>> 649: lt.print("Thread is alive (tid: " UINTX_FORMAT ", kernel thread id: " UINTX_FORMAT >>> 650: ", stack [" PTR_FORMAT " - " PTR_FORMAT " (" SIZE_FORMAT "k using %luk pages)).", >>> 651: os::current_thread_id(), (uintx) kernel_thread_id, (uintptr_t)low_address, (uintptr_t)high_address, >> >> Use p2i, not cast > > Here, and in other places too where you cast a pointer to fit into PTR_FORMAT or INTPTR_FORMAT Done. 
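A minimal sketch of the convention behind the "use `p2i()`, not a cast" remarks above: route every pointer through one small helper so that the argument type always matches the pointer format specifier, instead of scattering ad-hoc casts across call sites. The names `ptr_to_int` and `format_range` are hypothetical, and the format string is only an assumption about what a `PTR_FORMAT`-style specifier looks like on a 64-bit target; this is not HotSpot's actual definition.

```cpp
#include <cinttypes>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper in the spirit of p2i(): one place that converts any
// pointer to an integer type that matches the PRIxPTR format macro.
inline std::uintptr_t ptr_to_int(const volatile void* p) {
  return reinterpret_cast<std::uintptr_t>(p);
}

// Formats the address range [low, low + len) with matching specifiers.
std::string format_range(const void* low, std::size_t len) {
  const char* base = static_cast<const char*>(low);
  char buf[64];
  std::snprintf(buf, sizeof(buf),
                "[0x%016" PRIxPTR " - 0x%016" PRIxPTR ")",
                ptr_to_int(base), ptr_to_int(base + len));
  return buf;
}
```

Because the conversion lives in one function, a `-Wformat` check sees the same argument type at every call site, which is the kind of mismatch the stricter warnings in this change are reporting.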
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1547607793 PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1547606610 From mli at openjdk.org Tue Apr 2 10:49:04 2024 From: mli at openjdk.org (Hamlin Li) Date: Tue, 2 Apr 2024 10:49:04 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4] In-Reply-To: References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> Message-ID: <2kzCAM7nbD5d4nnO5NN2ioUaGfPZh2minA0XDYJm-U8=.b42cf8cb-65ca-4075-8ca4-68eefe126cf8@github.com> On Thu, 28 Mar 2024 18:41:03 GMT, Paul Sandoz wrote: >> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: >> >> fix jni includes > > Hamlin, thank you for working on this. I think integrating a sub-set of SLEEF is valuable (not all of it makes sense e.g., DFT part). My recommendation would be to focus on a PR that integrates the required source, rather than taking steps towards that. > > AFAICT from browsing prior comments "integrate the source" appears to be the generally preferred solution, but there is some understandable hesitancy about legal aspects. IIUC from what you say this is a technically feasible and maintainable solution. As said here: > >> We (Oracle Java Platform Group) can handle the required "paperwork > https://github.com/openjdk/jdk/pull/16234#issuecomment-1823335443 Thanks @PaulSandoz for your comment and suggestion. I will work on the solution which integrates the source of sleef into jdk. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-2031671597 From jkern at openjdk.org Tue Apr 2 10:49:02 2024 From: jkern at openjdk.org (Joachim Kern) Date: Tue, 2 Apr 2024 10:49:02 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc In-Reply-To: References: Message-ID: On Fri, 29 Mar 2024 07:25:30 GMT, Thomas Stuefe wrote: >> As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. Thus the old xlc toolchain was removed by [JDK-8327701](https://bugs.openjdk.org/browse/JDK-8327701). >> Now we also switch the HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc, removing the last xlc rudiment. >> This means merging the AIX specific content of utilities/globalDefinitions_xlc.hpp and utilities/compilerWarnings_xlc.hpp into the corresponding gcc files on the on side and removing the defined(TARGET_COMPILER_xlc) blocks in the code, because the defined(TARGET_COMPILER_gcc) blocks work out of the box for the new AIX compiler. >> The rest of the changes are needed because of using utilities/compilerWarnings_gcc.hpp the compiler is much more nagging about ill formatted printf > > src/hotspot/os/aix/os_aix.cpp line 1212: > >> 1210: st->print_cr("physical free : " SIZE_FORMAT, (unsigned long)mi.real_free); >> 1211: st->print_cr("swap total : " SIZE_FORMAT, (unsigned long)mi.pgsp_total); >> 1212: st->print_cr("swap free : " SIZE_FORMAT, (unsigned long)mi.pgsp_free); > > A better way to do this would be to change AIX::meminfo to use size_t. We should have done this when introducing that API. Done. 
modified `os::Aix::meminfo_t` to use `size_t` instead of `long long` > src/hotspot/os/aix/os_aix.cpp line 1399: > >> 1397: os->print("[" PTR_FORMAT " - " PTR_FORMAT "] (" UINTX_FORMAT >> 1398: " bytes, %ld %s pages), %s", >> 1399: (uintptr_t)addr, (uintptr_t)addr + size - 1, size, size / pagesize, describe_pagesize(pagesize), > > p2i Done. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1547603744 PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1547606275 From jkern at openjdk.org Tue Apr 2 11:26:00 2024 From: jkern at openjdk.org (Joachim Kern) Date: Tue, 2 Apr 2024 11:26:00 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc In-Reply-To: References: Message-ID: On Fri, 29 Mar 2024 07:39:06 GMT, Thomas Stuefe wrote: >> As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. Thus the old xlc toolchain was removed by [JDK-8327701](https://bugs.openjdk.org/browse/JDK-8327701). >> Now we also switch the HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc, removing the last xlc rudiment. >> This means merging the AIX specific content of utilities/globalDefinitions_xlc.hpp and utilities/compilerWarnings_xlc.hpp into the corresponding gcc files on the on side and removing the defined(TARGET_COMPILER_xlc) blocks in the code, because the defined(TARGET_COMPILER_gcc) blocks work out of the box for the new AIX compiler. >> The rest of the changes are needed because of using utilities/compilerWarnings_gcc.hpp the compiler is much more nagging about ill formatted printf > > src/hotspot/share/utilities/globalDefinitions_gcc.hpp line 62: > >> 60: #include >> 61: >> 62: #if defined(LINUX) || defined(_ALLBSD_SOURCE) || defined(_AIX) > > What else is left? Could we just remove this line altogether now? I cannot answer this question. 
If this line is now obsolete it was also obsolete before including AIX, because AIX didn't use this file beforehand. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1547667349 From jkern at openjdk.org Tue Apr 2 11:26:02 2024 From: jkern at openjdk.org (Joachim Kern) Date: Tue, 2 Apr 2024 11:26:02 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc In-Reply-To: References: Message-ID: On Thu, 28 Mar 2024 17:33:29 GMT, Martin Doerr wrote: >> src/hotspot/share/utilities/globalDefinitions_gcc.hpp line 83: >> >>> 81: #error "xlc version not supported, macro __open_xl_version__ not found" >>> 82: #endif >>> 83: #endif // AIX >> >> This `#ifdef _AIX` might be obsolete, because configure will throw a warning if the compiler has a lower version, but it's only a warning. > > I'd prefer having less AIX specific parts in this file. Can this be moved somewhere else? Or maybe combine it with the AIX code above? My question is, do we need this block, because now already configure warns about an outdated compiler, or is a warning to weak and we want to force this error here? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1547672502 From jkern at openjdk.org Tue Apr 2 11:31:01 2024 From: jkern at openjdk.org (Joachim Kern) Date: Tue, 2 Apr 2024 11:31:01 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc In-Reply-To: References: Message-ID: On Fri, 29 Mar 2024 08:06:01 GMT, Thomas Stuefe wrote: >> As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. Thus the old xlc toolchain was removed by [JDK-8327701](https://bugs.openjdk.org/browse/JDK-8327701). >> Now we also switch the HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc, removing the last xlc rudiment. 
>> This means merging the AIX specific content of utilities/globalDefinitions_xlc.hpp and utilities/compilerWarnings_xlc.hpp into the corresponding gcc files on the on side and removing the defined(TARGET_COMPILER_xlc) blocks in the code, because the defined(TARGET_COMPILER_gcc) blocks work out of the box for the new AIX compiler. >> The rest of the changes are needed because of using utilities/compilerWarnings_gcc.hpp the compiler is much more nagging about ill formatted printf > > src/hotspot/share/utilities/globalDefinitions_gcc.hpp line 83: > >> 81: #error "xlc version not supported, macro __open_xl_version__ not found" >> 82: #endif >> 83: #endif // AIX > > Can probably be shortened like this: > > Suggestion: > > #ifdef _AIX > #if !defined(__open_xl_version__) || (__open_xl_version__ < 17) > #error "this xlc version is not supported" > #endif > #endif // AIX followed your proposal. > src/hotspot/share/utilities/globalDefinitions_gcc.hpp line 103: > >> 101: #endif >> 102: >> 103: #if !defined(LINUX) && !defined(_ALLBSD_SOURCE) && !defined(_AIX) > > I believe this whole section can be removed now. > > At least I have no idea who this is for. What gcc versions does OpenJDK still support, then, beside these platforms. Also, any gcc platform not on linux or bsd would have hit the #error below at line 132. linux macos and now Aix use this file. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1547677545 PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1547681162 From jkern at openjdk.org Tue Apr 2 11:37:59 2024 From: jkern at openjdk.org (Joachim Kern) Date: Tue, 2 Apr 2024 11:37:59 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc In-Reply-To: References: Message-ID: On Tue, 2 Apr 2024 11:28:30 GMT, Joachim Kern wrote: >> src/hotspot/share/utilities/globalDefinitions_gcc.hpp line 103: >> >>> 101: #endif >>> 102: >>> 103: #if !defined(LINUX) && !defined(_ALLBSD_SOURCE) && !defined(_AIX) >> >> I believe this whole section can be removed now. >> >> At least I have no idea who this is for. What gcc versions does OpenJDK still support, then, beside these platforms. Also, any gcc platform not on linux or bsd would have hit the #error below at line 132. > > linux macos and now Aix use this file. Who is able to explain if `#if defined(LINUX) || defined(_ALLBSD_SOURCE) || defined(_AIX)` in this file is equivalent to `#if 1` ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1547692144 From coleenp at openjdk.org Tue Apr 2 12:18:59 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 2 Apr 2024 12:18:59 GMT Subject: RFR: 8313332: Simplify lazy jmethodID cache in InstanceKlass In-Reply-To: References: Message-ID: On Fri, 29 Mar 2024 15:25:48 GMT, Coleen Phillimore wrote: > This change simplifies the code that grows the jmethodID cache in InstanceKlass. Instead of lazily, when there's a rare request for a jmethodID for an obsolete method, the jmethodID cache is grown during the RedefineClasses safepoint. The InstanceKlass's jmethodID cache is lazily allocated when there's a jmethodID allocated, so not every InstanceKlass has a cache, but the growth now only happens in a safepoint. This code will become racy with the potential change for deallocating jmethodIDs. 
> > Tested with tier1-4, vmTestbase/nsk/jvmti java/lang/instrument tests (in case they're not in tier1-4). Thank you Alex! ------------- PR Comment: https://git.openjdk.org/jdk/pull/18549#issuecomment-2031887624 From coleenp at openjdk.org Tue Apr 2 13:23:16 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 2 Apr 2024 13:23:16 GMT Subject: RFR: 8236736: Change notproduct JVM flags to develop flags Message-ID: <25c1XDQrzxvG0AuxlRjQyznnTdLzD1-J4kebuXzj-Zc=.0f5d28e1-7672-40a9-97fe-04a77fda65d9@github.com> Remove the notproduct distinction for command line options, rather than trying to wrestle the macros to fix the bug that they've been treated as develop options for some time now. This simplifies the command line option macros. Tested with tier1-4, tier1 on Oracle platforms. Also built shenandoah. ------------- Commit messages: - Missed one. - 8236736: Change notproduct JVM flags to develop flags Changes: https://git.openjdk.org/jdk/pull/18541/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18541&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8236736 Stats: 239 lines in 39 files changed: 1 ins; 89 del; 149 mod Patch: https://git.openjdk.org/jdk/pull/18541.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18541/head:pull/18541 PR: https://git.openjdk.org/jdk/pull/18541 From jkern at openjdk.org Tue Apr 2 13:38:20 2024 From: jkern at openjdk.org (Joachim Kern) Date: Tue, 2 Apr 2024 13:38:20 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v2] In-Reply-To: References: Message-ID: > As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. Thus the old xlc toolchain was removed by [JDK-8327701](https://bugs.openjdk.org/browse/JDK-8327701). > Now we also switch the HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc, removing the last xlc rudiment. 
> This means merging the AIX specific content of utilities/globalDefinitions_xlc.hpp and utilities/compilerWarnings_xlc.hpp into the corresponding gcc files on the one side and removing the defined(TARGET_COMPILER_xlc) blocks in the code, because the defined(TARGET_COMPILER_gcc) blocks work out of the box for the new AIX compiler. > The rest of the changes are needed because, with utilities/compilerWarnings_gcc.hpp in use, the compiler nags much more about ill-formatted printf calls. Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: Followed the proposals ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18536/files - new: https://git.openjdk.org/jdk/pull/18536/files/61fd0ff2..689b353d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18536&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18536&range=00-01 Stats: 35 lines in 9 files changed: 0 ins; 4 del; 31 mod Patch: https://git.openjdk.org/jdk/pull/18536.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18536/head:pull/18536 PR: https://git.openjdk.org/jdk/pull/18536 From jsjolen at openjdk.org Tue Apr 2 13:46:03 2024 From: jsjolen at openjdk.org (Johan Sjölen) Date: Tue, 2 Apr 2024 13:46:03 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v17] In-Reply-To: References: Message-ID: On Tue, 26 Mar 2024 19:40:37 GMT, Johan Sjölen wrote: >> Hi, >> >> This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix.
>>
>> ## `MemoryFileTracker`
>>
>> The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is:
>>
>> ```c++
>> static MemoryFile* make_device(const char* descriptive_name);
>> static void free_device(MemoryFile* device);
>>
>> static void allocate_memory(MemoryFile* device, size_t offset, size_t size,
>>                             MEMFLAGS flag, const NativeCallStack& stack);
>> static void free_memory(MemoryFile* device, size_t offset, size_t size);
>> ```
>>
>> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does:
>>
>> ```c++
>> void ZNMT::reserve(zaddress_unsafe start, size_t size) {
>>   MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap);
>> }
>> void ZNMT::commit(zoffset offset, size_t size) {
>>   MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC);
>> }
>> void ZNMT::uncommit(zoffset offset, size_t size) {
>>   MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size);
>> }
>>
>> void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) {
>>   // NMT doesn't track mappings at the moment.
>> }
>> void ZNMT::unmap(zaddress_unsafe addr, size_t size) {
>>   // NMT doesn't track mappings at the moment.
>> }
>> ```
>>
>> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory.
>>
>> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are:
>>
>> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it.
Our Treap-based approach in this patch gives a performance bo... > > Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision: > > - Fixes > - Experiment src/hotspot/share/gc/z/zInitialize.cpp line 50: > 48: > 49: // Early initialization > 50: ZNMT::init(); Change to initialize. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1547918052 From mdoerr at openjdk.org Tue Apr 2 14:51:11 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 2 Apr 2024 14:51:11 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v2] In-Reply-To: References: Message-ID: On Tue, 2 Apr 2024 11:22:54 GMT, Joachim Kern wrote: >> I'd prefer having less AIX specific parts in this file. Can this be moved somewhere else? Or maybe combine it with the AIX code above? > > My question is, do we need this block, because now already configure warns about an outdated compiler, or is a warning too weak and we want to force this error here? I think that building with xlc 16 is no longer possible because the old build pipeline is no longer supported and that is already caught by configure. So, can we even reach here with older xlc compilers? If not, this code can get removed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1548043503 From sgibbons at openjdk.org Tue Apr 2 15:14:34 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Tue, 2 Apr 2024 15:14:34 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v3] In-Reply-To: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: > This code makes an intrinsic stub for `Unsafe::setMemory`. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change.
> > Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. > > Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). > > [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) Scott Gibbons has updated the pull request incrementally with two additional commits since the last revision: - Addressing review comments. - Remove dead code ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18555/files - new: https://git.openjdk.org/jdk/pull/18555/files/c5cb30cc..3aa60a48 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=01-02 Stats: 605 lines in 5 files changed: 16 ins; 567 del; 22 mod Patch: https://git.openjdk.org/jdk/pull/18555.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18555/head:pull/18555 PR: https://git.openjdk.org/jdk/pull/18555 From jkern at openjdk.org Tue Apr 2 15:46:11 2024 From: jkern at openjdk.org (Joachim Kern) Date: Tue, 2 Apr 2024 15:46:11 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v2] In-Reply-To: References: Message-ID: On Tue, 2 Apr 2024 14:48:49 GMT, Martin Doerr wrote: >> My question is, do we need this block, because now already configure warns about an outdated compiler, or is a warning too weak and we want to force this error here? > > I think that building with xlc 16 is no longer possible because the old build pipeline is no longer supported and that is already caught by configure. So, can we even reach here with older xlc compilers? > If not, this code can get removed. Yes, of course you are right. All the compile statements will fail with xlc 16 or older. I will remove it.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1548134431 From iklam at openjdk.org Tue Apr 2 16:01:12 2024 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 2 Apr 2024 16:01:12 GMT Subject: RFR: 8236736: Change notproduct JVM flags to develop flags In-Reply-To: <25c1XDQrzxvG0AuxlRjQyznnTdLzD1-J4kebuXzj-Zc=.0f5d28e1-7672-40a9-97fe-04a77fda65d9@github.com> References: <25c1XDQrzxvG0AuxlRjQyznnTdLzD1-J4kebuXzj-Zc=.0f5d28e1-7672-40a9-97fe-04a77fda65d9@github.com> Message-ID: On Thu, 28 Mar 2024 22:53:22 GMT, Coleen Phillimore wrote: > Remove the notproduct distinction for command line options, rather than trying to wrestle the macros to fix the bug that they've been treated as develop options for some time now. This simplifies the command line option macros. > > Tested with tier1-4, tier1 on Oracle platforms. Also built shenandoah. LGTM. For the past 15 years, "notproduct" flags haven't been working as globals.hpp claims they do. That doesn't seem to have bothered anyone. This definitely looks like a design that no one needs and should be removed for simplicity. There are some references to "notproduct" in test/hotspot/jtreg/runtime/CommandLine that need to be removed. ------------- Changes requested by iklam (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18541#pullrequestreview-1974267126 From duke at openjdk.org Tue Apr 2 16:10:55 2024 From: duke at openjdk.org (Volodymyr Paprotski) Date: Tue, 2 Apr 2024 16:10:55 GMT Subject: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic Message-ID:

Performance. Before:

Benchmark                    (algorithm)      (dataSize)  (keyLength)  (provider)   Mode  Cnt      Score     Error  Units
SignatureBench.ECDSA.sign    SHA256withECDSA        1024          256              thrpt    3   6443.934 ±   6.491  ops/s
SignatureBench.ECDSA.sign    SHA256withECDSA       16384          256              thrpt    3   6152.979 ±   4.954  ops/s
SignatureBench.ECDSA.verify  SHA256withECDSA        1024          256              thrpt    3   1895.410 ±  36.979  ops/s
SignatureBench.ECDSA.verify  SHA256withECDSA       16384          256              thrpt    3   1878.955 ±  45.487  ops/s

Benchmark                                            (algorithm)  (keyLength)  (kpgAlgorithm)  (provider)   Mode  Cnt      Score     Error  Units
o.o.b.j.c.full.KeyAgreementBench.EC.generateSecret   ECDH                 256  EC                          thrpt    3   1357.810 ±  26.584  ops/s
o.o.b.j.c.small.KeyAgreementBench.EC.generateSecret  ECDH                 256  EC                          thrpt    3   1352.119 ±  23.547  ops/s

Benchmark                          (isMontBench)   Mode  Cnt      Score     Error  Units
PolynomialP256Bench.benchMultiply  false          thrpt    3   1746.126 ±  10.970  ops/s

Performance, no intrinsic:

Benchmark                    (algorithm)      (dataSize)  (keyLength)  (provider)   Mode  Cnt      Score     Error  Units
SignatureBench.ECDSA.sign    SHA256withECDSA        1024          256              thrpt    3   6529.839 ±  42.420  ops/s
SignatureBench.ECDSA.sign    SHA256withECDSA       16384          256              thrpt    3   6199.747 ± 133.566  ops/s
SignatureBench.ECDSA.verify  SHA256withECDSA        1024          256              thrpt    3   1973.676 ±  54.071  ops/s
SignatureBench.ECDSA.verify  SHA256withECDSA       16384          256              thrpt    3   1932.127 ±  35.920  ops/s

Benchmark                                            (algorithm)  (keyLength)  (kpgAlgorithm)  (provider)   Mode  Cnt      Score     Error  Units
o.o.b.j.c.full.KeyAgreementBench.EC.generateSecret   ECDH                 256  EC                          thrpt    3   1355.788 ±  29.858  ops/s
o.o.b.j.c.small.KeyAgreementBench.EC.generateSecret  ECDH                 256  EC                          thrpt    3   1346.523 ±  28.722  ops/s

Benchmark                          (isMontBench)   Mode  Cnt      Score     Error  Units
PolynomialP256Bench.benchMultiply  true           thrpt    3   1919.574 ±  10.591  ops/s

Performance, **with intrinsics**:

Benchmark                    (algorithm)      (dataSize)  (keyLength)  (provider)   Mode  Cnt      Score     Error  Units
SignatureBench.ECDSA.sign    SHA256withECDSA        1024          256              thrpt    3  10384.591 ±  65.274  ops/s
SignatureBench.ECDSA.sign    SHA256withECDSA       16384          256              thrpt    3   9592.912 ± 236.411  ops/s
SignatureBench.ECDSA.verify  SHA256withECDSA        1024          256              thrpt    3   3479.494 ±  44.578  ops/s
SignatureBench.ECDSA.verify  SHA256withECDSA       16384          256              thrpt    3   3402.147 ±  26.772  ops/s

Benchmark                                            (algorithm)  (keyLength)  (kpgAlgorithm)  (provider)   Mode  Cnt      Score     Error  Units
o.o.b.j.c.full.KeyAgreementBench.EC.generateSecret   ECDH                 256  EC                          thrpt    3   2527.678 ±  64.791  ops/s
o.o.b.j.c.small.KeyAgreementBench.EC.generateSecret  ECDH                 256  EC                          thrpt    3   2541.258 ±  66.634  ops/s

Benchmark                          (isMontBench)   Mode  Cnt      Score     Error  Units
PolynomialP256Bench.benchMultiply  true           thrpt    3   3021.139 ±  98.289  ops/s

Summary on design (see code for 'ASCII art', references and details on math):
- Added a new `IntegerPolynomial` field (`MontgomeryIntegerPolynomialP256`) with 52-bit limbs
- `getElement(*)/fromMontgomery()` to convert numbers into/out of the field
- `ECOperations` is the primary use of the new field
- flattened some extra deep nested class hierarchy (also in prep for further other field optimizations)
- `forParameters()/multiply()/setSum()` generates numbers in the new field
- `ProjectivePoint/Montgomery{Imm|M}utable.asAffine()` to convert out of the new field
- Added Fuzz Testing and KAT verified with OpenSSL

------------- Commit messages: - remove trailing whitespace - Remeasure performance - Fix rebase typo - Address comments from Anas and thorough cleanup - conditionalAssign intrinsic - rebase Changes: https://git.openjdk.org/jdk/pull/18583/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18583&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8329538 Stats: 2335 lines in 34 files changed: 2037 ins; 162 del; 136 mod Patch: https://git.openjdk.org/jdk/pull/18583.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18583/head:pull/18583 PR: https://git.openjdk.org/jdk/pull/18583 From jkern at openjdk.org Tue Apr 2 16:14:12 2024 From: jkern at openjdk.org (Joachim Kern) Date: Tue, 2 Apr 2024 16:14:12 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: References: Message-ID: > As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build.
Thus the old xlc toolchain was removed by [JDK-8327701](https://bugs.openjdk.org/browse/JDK-8327701). > Now we also switch the HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc, removing the last xlc rudiment. > This means merging the AIX specific content of utilities/globalDefinitions_xlc.hpp and utilities/compilerWarnings_xlc.hpp into the corresponding gcc files on the one side and removing the defined(TARGET_COMPILER_xlc) blocks in the code, because the defined(TARGET_COMPILER_gcc) blocks work out of the box for the new AIX compiler. > The rest of the changes are needed because, with utilities/compilerWarnings_gcc.hpp in use, the compiler nags much more about ill-formatted printf calls. Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: version check not needed anymore ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18536/files - new: https://git.openjdk.org/jdk/pull/18536/files/689b353d..ac1335e5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18536&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18536&range=01-02 Stats: 6 lines in 1 file changed: 0 ins; 6 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18536.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18536/head:pull/18536 PR: https://git.openjdk.org/jdk/pull/18536 From coleenp at openjdk.org Tue Apr 2 16:24:19 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 2 Apr 2024 16:24:19 GMT Subject: RFR: 8236736: Change notproduct JVM flags to develop flags [v2] In-Reply-To: <25c1XDQrzxvG0AuxlRjQyznnTdLzD1-J4kebuXzj-Zc=.0f5d28e1-7672-40a9-97fe-04a77fda65d9@github.com> References: <25c1XDQrzxvG0AuxlRjQyznnTdLzD1-J4kebuXzj-Zc=.0f5d28e1-7672-40a9-97fe-04a77fda65d9@github.com> Message-ID: > Remove the notproduct distinction for command line options, rather than trying to wrestle the macros to fix the bug that they've been treated as develop options for some time now. This simplifies the command line option macros.
> > Tested with tier1-4, tier1 on Oracle platforms. Also built shenandoah. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Clean up notproduct from tests. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18541/files - new: https://git.openjdk.org/jdk/pull/18541/files/c3d9a1c8..19b8f6b6 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18541&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18541&range=00-01 Stats: 37 lines in 4 files changed: 0 ins; 8 del; 29 mod Patch: https://git.openjdk.org/jdk/pull/18541.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18541/head:pull/18541 PR: https://git.openjdk.org/jdk/pull/18541 From iklam at openjdk.org Tue Apr 2 16:24:19 2024 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 2 Apr 2024 16:24:19 GMT Subject: RFR: 8236736: Change notproduct JVM flags to develop flags [v2] In-Reply-To: References: <25c1XDQrzxvG0AuxlRjQyznnTdLzD1-J4kebuXzj-Zc=.0f5d28e1-7672-40a9-97fe-04a77fda65d9@github.com> Message-ID: On Tue, 2 Apr 2024 16:21:15 GMT, Coleen Phillimore wrote: >> Remove the notproduct distinction for command line options, rather than trying to wrestle the macros to fix the bug that they've been treated as develop options for some time now. This simplifies the command line option macros. >> >> Tested with tier1-4, tier1 on Oracle platforms. Also built shenandoah. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Clean up notproduct from tests. LGTM ------------- Marked as reviewed by iklam (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18541#pullrequestreview-1974322553 From coleenp at openjdk.org Tue Apr 2 16:24:20 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 2 Apr 2024 16:24:20 GMT Subject: RFR: 8236736: Change notproduct JVM flags to develop flags In-Reply-To: <25c1XDQrzxvG0AuxlRjQyznnTdLzD1-J4kebuXzj-Zc=.0f5d28e1-7672-40a9-97fe-04a77fda65d9@github.com> References: <25c1XDQrzxvG0AuxlRjQyznnTdLzD1-J4kebuXzj-Zc=.0f5d28e1-7672-40a9-97fe-04a77fda65d9@github.com> Message-ID: On Thu, 28 Mar 2024 22:53:22 GMT, Coleen Phillimore wrote: > Remove the notproduct distinction for command line options, rather than trying to wrestle the macros to fix the bug that they've been treated as develop options for some time now. This simplifies the command line option macros. > > Tested with tier1-4, tier1 on Oracle platforms. Also built shenandoah. Thank you for pointing out that I missed cleaning up the tests. One failed but I didn't see it. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18541#issuecomment-2032498785 From duke at openjdk.org Tue Apr 2 16:28:02 2024 From: duke at openjdk.org (Volodymyr Paprotski) Date: Tue, 2 Apr 2024 16:28:02 GMT Subject: RFR: 8320794: Emulate rest of vblendvp[sd] on ECore [v2] In-Reply-To: References: <8ajDeYtrlyZUXnTl29xwLr1rwGIYzjj5wThm9yjrBVY=.c75c1992-c836-4969-aea4-e3cbf428dfad@github.com> Message-ID: <4MeJ7wcsVmkdPoVjTyDbBr_LBxrxBZbTXq3CZwv7_lU=.4d41d88c-5e82-434f-9f34-33d14b3b2cef@github.com> On Thu, 28 Mar 2024 00:04:49 GMT, Sandhya Viswanathan wrote: >> Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix double pasted test > > src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4871: > >> 4869: vpxor(xtmp3, xtmp2, xtmp4, vec_enc); >> 4870: >> 4871: vblendvps(dst, dst, xtmp1, xtmp3, vec_enc, true, xtmp4); > > The vblendvps at line 4861 could also be emulated: > From: > vpxor(xtmp4, xtmp4, xtmp4, vec_enc); > vcmpps(xtmp3, src, 
src, Assembler::UNORD_Q, vec_enc);
> vblendvps(dst, dst, xtmp4, xtmp3, vec_enc);
>
> To:
> vpxor(xtmp4, xtmp4, xtmp4, vec_enc);
> vcmpps(xtmp3, src, src, Assembler::UNORD_Q, vec_enc);
> vblendvps(dst, dst, xtmp4, xtmp3, vec_enc, false, xtmp4);

Cannot use `xtmp4` here: scratch isn't available (`scratch != src2`):

bool scratch_available = scratch != xnoreg && scratch != src1 && scratch != src2 && scratch != dst;

(Originally that's exactly what I had, but got caught when I intentionally made the fall-back/default case `assert`)

> src/hotspot/cpu/x86/macroAssembler_x86.cpp line 3524:
>
>> 3522: bool blend_emulation = EnableX86ECoreOpts && UseAVX > 1;
>> 3523: bool scratch_available = scratch != xnoreg && scratch != src1 && scratch != src2 && scratch != dst;
>> 3524: bool dst_available = (dst != mask || compute_mask) && (dst != src1 || dst != src2);
>
> There are two paths here:
> Path 1: When compute_mask == true
> scratch_available = (scratch != xnoreg) && (scratch != src1) && (scratch != src2) && (scratch != dst);
> dst_available = (dst != mask) && (dst != src1 || dst != src2);
> Path 2: When compute_mask == false
> scratch_available = (scratch != xnoreg) && (scratch != dst);
> dst_available = (dst != mask) && (dst != src1 || dst != src2);

I had thought of using `scratch_available` instead of `compute_mask` (i.e. `...dst_available = (dst != mask || scratch_available)...`) but I thought it would make more sense to use `compute_mask`. i.e. dst is available to modify on 3531 because of the branch on `compute_mask` on 3526

Also I prefer the shape of condition on 3525; three "simpler" independent boolean conditions instead of two longer ones. (i.e. Making `dst_available` depend directly on `scratch_available` would make it two "complex" conditions instead of three "simpler" conditions.) Though simplicity of the condition is debatable, I made it as readable as I could.
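
The register-aliasing argument above may be easier to follow with a small standalone C++ sketch of the two availability checks quoted from `macroAssembler_x86.cpp`. This is not HotSpot code: the `Reg` enum, the `BlendArgs` struct and the helper names are hypothetical stand-ins, and the argument order (dst, src1, src2, mask, compute_mask, scratch) is only assumed from the `vblendvps` calls quoted in this thread.

```cpp
#include <cassert>

// Hypothetical stand-ins for XMM registers; xnoreg means "no register given".
enum Reg { xnoreg = -1, dst_reg, src_reg, xtmp1, xtmp2, xtmp3, xtmp4 };

// Hypothetical bundle of the register arguments of one vblendvps call
// (vec_enc is omitted because it plays no role in the two checks).
struct BlendArgs {
  Reg dst;
  Reg src1;
  Reg src2;
  Reg mask;
  bool compute_mask;
  Reg scratch;
};

// Same expression as the quoted line 3523.
constexpr bool scratch_available(const BlendArgs& a) {
  return a.scratch != xnoreg && a.scratch != a.src1 &&
         a.scratch != a.src2 && a.scratch != a.dst;
}

// Same expression as the quoted line 3524.
constexpr bool dst_available(const BlendArgs& a) {
  return (a.dst != a.mask || a.compute_mask) &&
         (a.dst != a.src1 || a.dst != a.src2);
}

// Reviewer's suggestion: vblendvps(dst, dst, xtmp4, xtmp3, vec_enc, false, xtmp4)
constexpr BlendArgs kSuggestedCall{dst_reg, dst_reg, xtmp4, xtmp3, false, xtmp4};

// Call at line 4871: vblendvps(dst, dst, xtmp1, xtmp3, vec_enc, true, xtmp4)
constexpr BlendArgs kExistingCall{dst_reg, dst_reg, xtmp1, xtmp3, true, xtmp4};

inline void check_quoted_calls() {
  // xtmp4 is both src2 and scratch, so the scratch register is rejected.
  assert(!scratch_available(kSuggestedCall));
  // With xtmp1 as src2, xtmp4 is free to serve as scratch.
  assert(scratch_available(kExistingCall));
  assert(dst_available(kExistingCall));
}
```

Under these assumptions the suggested call fails only the scratch check, because `xtmp4` is passed both as `src2` and as the scratch register, which matches the explanation given in the reply.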
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18310#discussion_r1548177012 PR Review Comment: https://git.openjdk.org/jdk/pull/18310#discussion_r1548194113 From alanb at openjdk.org Tue Apr 2 16:32:09 2024 From: alanb at openjdk.org (Alan Bateman) Date: Tue, 2 Apr 2024 16:32:09 GMT Subject: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic In-Reply-To: References: Message-ID: On Tue, 2 Apr 2024 15:42:05 GMT, Volodymyr Paprotski wrote:

> Performance. Before:
>
> Benchmark                    (algorithm)      (dataSize)  (keyLength)  (provider)   Mode  Cnt      Score     Error  Units
> SignatureBench.ECDSA.sign    SHA256withECDSA        1024          256              thrpt    3   6443.934 ±   6.491  ops/s
> SignatureBench.ECDSA.sign    SHA256withECDSA       16384          256              thrpt    3   6152.979 ±   4.954  ops/s
> SignatureBench.ECDSA.verify  SHA256withECDSA        1024          256              thrpt    3   1895.410 ±  36.979  ops/s
> SignatureBench.ECDSA.verify  SHA256withECDSA       16384          256              thrpt    3   1878.955 ±  45.487  ops/s
> Benchmark                                            (algorithm)  (keyLength)  (kpgAlgorithm)  (provider)   Mode  Cnt      Score     Error  Units
> o.o.b.j.c.full.KeyAgreementBench.EC.generateSecret   ECDH                 256  EC                          thrpt    3   1357.810 ±  26.584  ops/s
> o.o.b.j.c.small.KeyAgreementBench.EC.generateSecret  ECDH                 256  EC                          thrpt    3   1352.119 ±  23.547  ops/s
> Benchmark                          (isMontBench)   Mode  Cnt      Score     Error  Units
> PolynomialP256Bench.benchMultiply  false          thrpt    3   1746.126 ±  10.970  ops/s
>
> Performance, no intrinsic:
>
> Benchmark                    (algorithm)      (dataSize)  (keyLength)  (provider)   Mode  Cnt      Score     Error  Units
> SignatureBench.ECDSA.sign    SHA256withECDSA        1024          256              thrpt    3   6529.839 ±  42.420  ops/s
> SignatureBench.ECDSA.sign    SHA256withECDSA       16384          256              thrpt    3   6199.747 ± 133.566  ops/s
> SignatureBench.ECDSA.verify  SHA256withECDSA        1024          256              thrpt    3   1973.676 ±  54.071  ops/s
> SignatureBench.ECDSA.verify  SHA256withECDSA       16384          256              thrpt    3   1932.127 ±  35.920  ops/s
> Benchmark                                            (algorithm)  (keyLength)  (kpgAlgorithm)  (provider)   Mode  Cnt      Score     Error  Units
> o.o.b.j.c.full.KeyAgreementBench.EC.generateSecret   ECDH                 256  EC                          thrpt    3   1355.788 ±  29.858  ops/s
> o.o.b.j.c.small.KeyAgreementBench.EC.generateSecret  ECDH                 256  EC                          thrpt    3   1346.523 ±  28.722  ops/s
> Benchmark                          (isMontBench)   Mode  Cnt      Score     Error  Units
> PolynomialP256Bench.benchMultiply  true           thrpt    3   1919.574 ±  10.591  ops/s
>
> Performance, **with intrinsics*...

src/java.base/share/classes/module-info.java line 265: > 263: jdk.jfr, > 264: jdk.unsupported, > 265: jdk.crypto.ec; jdk.crypto.ec has been hollowed out since JDK 22, the sun.security.ec are in java.base. So I don't think you need this qualified export. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18583#discussion_r1548199507 From kbarrett at openjdk.org Tue Apr 2 16:43:59 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 2 Apr 2024 16:43:59 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: References: Message-ID: On Tue, 2 Apr 2024 11:20:49 GMT, Joachim Kern wrote: >> src/hotspot/share/utilities/globalDefinitions_gcc.hpp line 62: >> >>> 60: #include >>> 61: >>> 62: #if defined(LINUX) || defined(_ALLBSD_SOURCE) || defined(_AIX) >> >> What else is left? Could we just remove this line altogether now? > > I cannot answer this question. > If this line is now obsolete it was also obsolete before including AIX, because AIX didn't use this file beforehand. There was at one time an attempt at a gcc/Solaris port, but I think it was never completed, and most vestiges removed. More recently, @TheShermanTanker has been doing stuff to permit clang/Windows, and clang-based builds use this file. I'm kind of surprised he hasn't encountered problems and done some cleanup here. (and ) and 64bit integer types are standard C++ now, so no longer need all this conditionalization. I suggest cleaning that up as a separate precursor. That would eliminate the two !defined blocks entirely. I wish the other conditional includes in this block were "where needed" rather than in globalDefinitions_gcc.hpp, but that's a different mess.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1548225495 From kbarrett at openjdk.org Tue Apr 2 16:55:01 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 2 Apr 2024 16:55:01 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: References: Message-ID: On Tue, 2 Apr 2024 16:41:40 GMT, Kim Barrett wrote: >> I cannot answer this question. >> If this line is now obsolete it was also obsolete before including AIX, because AIX didn't use this file beforehand. > > There was at one time an attempt at a gcc/Solaris port, but I think it was > never completed, and most vestiges removed. More recently, @TheShermanTanker > has been doing stuff to permit clang/Windows, and clang-based builds use this file. > I'm kind of surprised he hasn't encountered problems and done some cleanup here. > > (and ) and 64bit integer types are standard C++ now, > so no longer need all this conditionalization. I suggest cleaning that up as a > separate precursor. That would eliminate the two !defined blocks entirely. I > wish the other conditional includes in this block were "where needed" rather > than in globalDefinitions_gcc.hpp, but that's a different mess. https://bugs.openjdk.org/browse/JDK-8329546 - I can take this if nobody else grabs it soon. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1548239737 From stefank at openjdk.org Tue Apr 2 16:59:13 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 2 Apr 2024 16:59:13 GMT Subject: RFR: 8236736: Change notproduct JVM flags to develop flags [v2] In-Reply-To: References: <25c1XDQrzxvG0AuxlRjQyznnTdLzD1-J4kebuXzj-Zc=.0f5d28e1-7672-40a9-97fe-04a77fda65d9@github.com> Message-ID: On Tue, 2 Apr 2024 16:24:19 GMT, Coleen Phillimore wrote: >> Remove the notproduct distinction for command line options, rather than trying to wrestle the macros to fix the bug that they've been treated as develop options for some time now. This simplifies the command line option macros. >> >> Tested with tier1-4, tier1 on Oracle platforms. Also built shenandoah. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Clean up notproduct from tests. src/hotspot/share/runtime/arguments.cpp line 3420: > 3418: static void apply_debugger_ergo() { > 3419: #ifndef PRODUCT > 3420: // UseDebuggerErgo is notproduct Now that the flag has been changed to a develop flag, it seems wrong that these are guarded by "#ifndef PRODUCT". Shouldn't this be changed to check for ASSERT instead? src/hotspot/share/runtime/flags/jvmFlag.hpp line 118: > 116: EXPERIMENTAL_FLAG_BUT_LOCKED, > 117: DEVELOPER_FLAG_BUT_PRODUCT_BUILD, > 118: NOTPRODUCT_FLAG_BUT_PRODUCT_BUILD Should the ',' on the previous line be removed? 
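
The `#ifndef PRODUCT` versus `ASSERT` question raised above can be illustrated with a small standalone C++ model of which build flavors each guard would select. This is only a sketch under an assumption that is not stated in this thread: product builds define `PRODUCT`, debug builds define `ASSERT`, and "optimized" builds define neither. The `BuildFlavor` struct and all helper names are hypothetical.

```cpp
#include <cassert>

// Hypothetical model of which of the two macros a given build defines.
// Assumption: product defines PRODUCT, debug defines ASSERT,
// "optimized" defines neither.
struct BuildFlavor {
  const char* name;
  bool defines_PRODUCT;
  bool defines_ASSERT;
};

constexpr BuildFlavor kProduct{"product", true, false};
constexpr BuildFlavor kOptimized{"optimized", false, false};
constexpr BuildFlavor kDebug{"debug", false, true};

// True if code inside `#ifndef PRODUCT ... #endif` would be compiled.
constexpr bool ifndef_PRODUCT_active(const BuildFlavor& b) {
  return !b.defines_PRODUCT;
}

// True if code inside `#ifdef ASSERT ... #endif` would be compiled.
constexpr bool ifdef_ASSERT_active(const BuildFlavor& b) {
  return b.defines_ASSERT;
}

// The two guards disagree when exactly one of them is active.
constexpr bool guards_differ(const BuildFlavor& b) {
  return ifndef_PRODUCT_active(b) != ifdef_ASSERT_active(b);
}

inline void check_build_flavors() {
  assert(!guards_differ(kProduct));   // both guards exclude the code
  assert(!guards_differ(kDebug));     // both guards include the code
  assert(guards_differ(kOptimized));  // only `#ifndef PRODUCT` includes it
}
```

If that assumption holds, switching the guard only changes behavior for a build that defines neither macro, so the choice mostly matters for such builds.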
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18541#discussion_r1548236362 PR Review Comment: https://git.openjdk.org/jdk/pull/18541#discussion_r1548239130 From kbarrett at openjdk.org Tue Apr 2 17:04:11 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 2 Apr 2024 17:04:11 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: References: Message-ID: On Tue, 2 Apr 2024 16:52:04 GMT, Kim Barrett wrote: >> There was at one time an attempt at a gcc/Solaris port, but I think it was >> never completed, and most vestiges removed. More recently, @TheShermanTanker >> has been doing stuff to permit clang/Windows, and clang-based builds use this file. >> I'm kind of surprised he hasn't encountered problems and done some cleanup here. >> >> (and ) and 64bit integer types are standard C++ now, >> so no longer need all this conditionalization. I suggest cleaning that up as a >> separate precursor. That would eliminate the two !defined blocks entirely. I >> wish the other conditional includes in this block were "where needed" rather >> than in globalDefinitions_gcc.hpp, but that's a different mess. > > https://bugs.openjdk.org/browse/JDK-8329546 - I can take this if nobody else grabs it soon. I'm waiting for a bunch of tests to complete, so decided to just take that issue. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1548252193 From coleenp at openjdk.org Tue Apr 2 17:16:01 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 2 Apr 2024 17:16:01 GMT Subject: RFR: 8236736: Change notproduct JVM flags to develop flags [v2] In-Reply-To: References: <25c1XDQrzxvG0AuxlRjQyznnTdLzD1-J4kebuXzj-Zc=.0f5d28e1-7672-40a9-97fe-04a77fda65d9@github.com> Message-ID: <1SplyimCOCzBkP_A15DW-Q_NcUg8d7qmrcNfBU3GJSk=.11aa3fa0-3e3d-4587-922c-ac89d984478d@github.com> On Tue, 2 Apr 2024 16:24:19 GMT, Coleen Phillimore wrote: >> Remove the notproduct distinction for command line options, rather than trying to wrestle the macros to fix the bug that they've been treated as develop options for some time now. This simplifies the command line option macros. >> >> Tested with tier1-4, tier1 on Oracle platforms. Also built shenandoah. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Clean up notproduct from tests. Thanks for looking through the changes, Stefan. ------------- PR Review: https://git.openjdk.org/jdk/pull/18541#pullrequestreview-1974442423 From coleenp at openjdk.org Tue Apr 2 17:16:02 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 2 Apr 2024 17:16:02 GMT Subject: RFR: 8236736: Change notproduct JVM flags to develop flags [v2] In-Reply-To: References: <25c1XDQrzxvG0AuxlRjQyznnTdLzD1-J4kebuXzj-Zc=.0f5d28e1-7672-40a9-97fe-04a77fda65d9@github.com> Message-ID: <2XgKLmehivQ4frz5mofTSXn9LDFShIwprD6J2GUS_Is=.ecc3c80d-ea3f-447e-a951-9fbbb5c24a59@github.com> On Tue, 2 Apr 2024 16:49:19 GMT, Stefan Karlsson wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Clean up notproduct from tests. 
> > src/hotspot/share/runtime/arguments.cpp line 3420: > >> 3418: static void apply_debugger_ergo() { >> 3419: #ifndef PRODUCT >> 3420: // UseDebuggerErgo is notproduct > > Now that the flag has been changed to a develop flag, it seems wrong that these are guarded by "#ifndef PRODUCT". Shouldn't this be changed to check for ASSERT instead? Yes, ifdef ASSERT is more appropriate. > src/hotspot/share/runtime/flags/jvmFlag.hpp line 118: > >> 116: EXPERIMENTAL_FLAG_BUT_LOCKED, >> 117: DEVELOPER_FLAG_BUT_PRODUCT_BUILD, >> 118: NOTPRODUCT_FLAG_BUT_PRODUCT_BUILD > > Should the ',' on the previous line be removed? Yes, I guess our compilers don't complain about that anymore. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18541#discussion_r1548261310 PR Review Comment: https://git.openjdk.org/jdk/pull/18541#discussion_r1548269993 From coleenp at openjdk.org Tue Apr 2 17:25:12 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 2 Apr 2024 17:25:12 GMT Subject: RFR: 8236736: Change notproduct JVM flags to develop flags [v3] In-Reply-To: <25c1XDQrzxvG0AuxlRjQyznnTdLzD1-J4kebuXzj-Zc=.0f5d28e1-7672-40a9-97fe-04a77fda65d9@github.com> References: <25c1XDQrzxvG0AuxlRjQyznnTdLzD1-J4kebuXzj-Zc=.0f5d28e1-7672-40a9-97fe-04a77fda65d9@github.com> Message-ID: > Remove the notproduct distinction for command line options, rather than trying to wrestle the macros to fix the bug that they've been treated as develop options for some time now. This simplifies the command line option macros. > > Tested with tier1-4, tier1 on Oracle platforms. Also built shenandoah. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Fix a couple issues pointed out by Stefank. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/18541/files - new: https://git.openjdk.org/jdk/pull/18541/files/19b8f6b6..00a241d3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18541&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18541&range=01-02 Stats: 3 lines in 2 files changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/18541.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18541/head:pull/18541 PR: https://git.openjdk.org/jdk/pull/18541 From kvn at openjdk.org Tue Apr 2 17:37:10 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 2 Apr 2024 17:37:10 GMT Subject: RFR: 8236736: Change notproduct JVM flags to develop flags [v3] In-Reply-To: References: <25c1XDQrzxvG0AuxlRjQyznnTdLzD1-J4kebuXzj-Zc=.0f5d28e1-7672-40a9-97fe-04a77fda65d9@github.com> Message-ID: On Tue, 2 Apr 2024 17:25:12 GMT, Coleen Phillimore wrote: >> Remove the notproduct distinction for command line options, rather than trying to wrestle the macros to fix the bug that they've been treated as develop options for some time now. This simplifies the command line option macros. >> >> Tested with tier1-4, tier1 on Oracle platforms. Also built shenandoah. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix a couple issues pointed out by Stefank. So you finally decided to take on [JDK-8183288](https://bugs.openjdk.org/browse/JDK-8183288) Essentially you are removing "optimized" VM build with these changes. In this case you need to change make files. All Statistics flags should be product (which will increase product VM size) - it is important. May be need build's variable `--enable-jvm-feature-statistcs` to include statistics code on demand. ------------- Changes requested by kvn (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18541#pullrequestreview-1974500625 From kvn at openjdk.org Tue Apr 2 17:51:00 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 2 Apr 2024 17:51:00 GMT Subject: RFR: 8329421: Native methods can not be selectively printed [v2] In-Reply-To: References: Message-ID: On Tue, 2 Apr 2024 07:23:25 GMT, Volker Simonis wrote: >> Native methods (i.e. "native wrappers") can not be selectively printed with `-XX:CompileCommand=print,class::method`. Currently the only way to print native methods is to use the global `-XX:+PrintAssembly` option. But this prints *all* compiled methods which can be too much if we're just interested in a specific native wrapper. There's no reason to not apply `-XX:CompileCommand` options correctly to native methods as well. > > Volker Simonis has updated the pull request incrementally with one additional commit since the last revision: > > Add test for -XX:+PrintNativeNMethods Good. ------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18567#pullrequestreview-1974527380 From coleenp at openjdk.org Tue Apr 2 18:01:10 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 2 Apr 2024 18:01:10 GMT Subject: RFR: 8236736: Change notproduct JVM flags to develop flags [v3] In-Reply-To: References: <25c1XDQrzxvG0AuxlRjQyznnTdLzD1-J4kebuXzj-Zc=.0f5d28e1-7672-40a9-97fe-04a77fda65d9@github.com> Message-ID: On Tue, 2 Apr 2024 17:25:12 GMT, Coleen Phillimore wrote: >> Remove the notproduct distinction for command line options, rather than trying to wrestle the macros to fix the bug that they've been treated as develop options for some time now. This simplifies the command line option macros. >> >> Tested with tier1-4, tier1 on Oracle platforms. Also built shenandoah. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix a couple issues pointed out by Stefank. 
The optimized build still works as before (actually surprised it still builds). Since for a long time the notproduct options acted like develop options, they still do just the same for the optimized build. For optimized, all the develop and notproduct options are materialized. Now just develop, not distinguishing notproduct from that. The same code enabled in PRODUCT is still enabled. I haven't looked at that removing optimized bug in a while. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18541#issuecomment-2032704606 From kvn at openjdk.org Tue Apr 2 18:42:02 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 2 Apr 2024 18:42:02 GMT Subject: RFR: 8236736: Change notproduct JVM flags to develop flags [v3] In-Reply-To: References: <25c1XDQrzxvG0AuxlRjQyznnTdLzD1-J4kebuXzj-Zc=.0f5d28e1-7672-40a9-97fe-04a77fda65d9@github.com> Message-ID: On Tue, 2 Apr 2024 17:25:12 GMT, Coleen Phillimore wrote: >> Remove the notproduct distinction for command line options, rather than trying to wrestle the macros to fix the bug that they've been treated as develop options for some time now. This simplifies the command line option macros. >> >> Tested with tier1-4, tier1 on Oracle platforms. Also built shenandoah. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix a couple issues pointed out by Stefank. Good. ------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18541#pullrequestreview-1974635400 From kvn at openjdk.org Tue Apr 2 18:42:02 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 2 Apr 2024 18:42:02 GMT Subject: RFR: 8236736: Change notproduct JVM flags to develop flags [v3] In-Reply-To: References: <25c1XDQrzxvG0AuxlRjQyznnTdLzD1-J4kebuXzj-Zc=.0f5d28e1-7672-40a9-97fe-04a77fda65d9@github.com> Message-ID: On Tue, 2 Apr 2024 17:58:47 GMT, Coleen Phillimore wrote: > For optimized, all the develop and notproduct options are materialized. 
Okay, I see what you did here. You touched only the flag declarations and did not touch the `#ifndef PRODUCT` guards, which cover statistics code, for example. An optimized VM build will get that code but will not include DEBUG_ONLY and `#ifdef ASSERT` guarded code. So we still need to be careful when we use `#ifndef PRODUCT` and `#ifdef ASSERT`.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18541#issuecomment-2032789810

From coleenp at openjdk.org Tue Apr 2 18:45:01 2024
From: coleenp at openjdk.org (Coleen Phillimore)
Date: Tue, 2 Apr 2024 18:45:01 GMT
Subject: RFR: 8236736: Change notproduct JVM flags to develop flags [v3]
In-Reply-To:
References: <25c1XDQrzxvG0AuxlRjQyznnTdLzD1-J4kebuXzj-Zc=.0f5d28e1-7672-40a9-97fe-04a77fda65d9@github.com>
Message-ID: <8j655T3fzB8EZT8JdQZMXPgArVeDOvkuY2n0Uwxz1Gk=.e5a2bf28-6ce2-49f2-95cd-f19f92b271df@github.com>

On Tue, 2 Apr 2024 17:25:12 GMT, Coleen Phillimore wrote:

>> Remove the notproduct distinction for command line options, rather than trying to wrestle the macros to fix the bug that they've been treated as develop options for some time now. This simplifies the command line option macros.
>>
>> Tested with tier1-4, tier1 on Oracle platforms. Also built shenandoah.
>
> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision:
>
> Fix a couple issues pointed out by Stefank.

Yes, I had to remember what optimized did. It gets all the options, but builds with optimization and doesn't turn on asserts. I only removed the notproduct flag distinction since it hasn't been distinct for years and it's confusing what we actually wanted it to do.
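
As a rough illustration of the build-flavor distinction discussed in this exchange, the following minimal sketch models which guarded code each build flavor would include. It assumes, as described in the thread, that product builds define PRODUCT, debug builds define ASSERT, and optimized builds define neither. The `Build` enum and all helper names are hypothetical and exist only for this example; none of this is HotSpot code.

```cpp
// Hypothetical model of the build flavors discussed in this thread.
// Not HotSpot code; all names here are made up for illustration.
enum class Build { Product, Optimized, Debug };

// Assumption from the thread: PRODUCT is defined only in product builds.
constexpr bool defines_PRODUCT(Build b) { return b == Build::Product; }

// Assumption from the thread: ASSERT is defined only in debug builds.
constexpr bool defines_ASSERT(Build b) { return b == Build::Debug; }

// Code guarded by "#ifndef PRODUCT" (for example, statistics code).
constexpr bool includes_ifndef_PRODUCT_code(Build b) {
  return !defines_PRODUCT(b);
}

// Code guarded by "#ifdef ASSERT" or wrapped in DEBUG_ONLY(...).
constexpr bool includes_ifdef_ASSERT_code(Build b) {
  return defines_ASSERT(b);
}

// The optimized build is the case where the two guards disagree.
static_assert(includes_ifndef_PRODUCT_code(Build::Optimized),
              "optimized builds keep #ifndef PRODUCT code");
static_assert(!includes_ifdef_ASSERT_code(Build::Optimized),
              "optimized builds drop #ifdef ASSERT code");
```

Under these assumptions the two guards are interchangeable in product and debug builds and differ only in the optimized build, which is why the choice between them still matters after the flag change.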
------------- PR Comment: https://git.openjdk.org/jdk/pull/18541#issuecomment-2032794955 From kvn at openjdk.org Tue Apr 2 19:02:10 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 2 Apr 2024 19:02:10 GMT Subject: RFR: 8236736: Change notproduct JVM flags to develop flags [v3] In-Reply-To: References: <25c1XDQrzxvG0AuxlRjQyznnTdLzD1-J4kebuXzj-Zc=.0f5d28e1-7672-40a9-97fe-04a77fda65d9@github.com> Message-ID: On Tue, 2 Apr 2024 17:25:12 GMT, Coleen Phillimore wrote: >> Remove the notproduct distinction for command line options, rather than trying to wrestle the macros to fix the bug that they've been treated as develop options for some time now. This simplifies the command line option macros. >> >> Tested with tier1-4, tier1 on Oracle platforms. Also built shenandoah. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix a couple issues pointed out by Stefank. Long ago, before [JDK-8024545](https://bugs.openjdk.org/browse/JDK-8024545) we did not have not_product flags declared in product build. Only debug flags were declared as constant. We relied on that change since then. That is why you may see the issue with not materialized flags in product build. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18541#issuecomment-2032835463 From kbarrett at openjdk.org Tue Apr 2 19:08:01 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 2 Apr 2024 19:08:01 GMT Subject: RFR: 8236736: Change notproduct JVM flags to develop flags [v3] In-Reply-To: References: <25c1XDQrzxvG0AuxlRjQyznnTdLzD1-J4kebuXzj-Zc=.0f5d28e1-7672-40a9-97fe-04a77fda65d9@github.com> Message-ID: <-Di8CpUXEjVtG3uu2r_djrEuUDpnQF833TWGkDHmZM4=.176b6bd0-e84d-40f1-999e-4fe41c750d8a@github.com> On Tue, 2 Apr 2024 17:25:12 GMT, Coleen Phillimore wrote: >> Remove the notproduct distinction for command line options, rather than trying to wrestle the macros to fix the bug that they've been treated as develop options for some time now. This simplifies the command line option macros. >> >> Tested with tier1-4, tier1 on Oracle platforms. Also built shenandoah. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix a couple issues pointed out by Stefank. Looks good. One minor nit. Also, it seems that develop and nonproduct (before this change) flags have associated JVMFlag objects even in product builds. (The function JVMFlag::is_constant_in_binary is evidence for this. I didn't dig through all the code to verify it.) Probably if one were going to retain nonproduct options and make them work "properly", they would only have JVMFlag objects in non-product builds. But it's not obvious to me why we would want such objects for either category in product builds. I think any change along that line should be a separate followup. 
test/hotspot/jtreg/runtime/CommandLine/VMOptionWarning.java line 64: > 62: output = new OutputAnalyzer(pb.start()); > 63: output.shouldNotHaveExitValue(0); > 64: output.shouldContain("Error: VM option 'CheckCompressedOops' is develop and is available only in debug version of VM."); Seems like we don't need this test of the develop option CheckCompressedOops at all, since we have the immediately preceding test of the develop option VerifyStack. ------------- Marked as reviewed by kbarrett (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18541#pullrequestreview-1974549362 PR Review Comment: https://git.openjdk.org/jdk/pull/18541#discussion_r1548328730 From coleenp at openjdk.org Tue Apr 2 19:16:00 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 2 Apr 2024 19:16:00 GMT Subject: RFR: JDK-8328137: PreserveAllAnnotations can cause failure of class retransformation In-Reply-To: References: Message-ID: On Thu, 28 Mar 2024 22:12:49 GMT, Alex Menkov wrote: > PreserveAllAnnotations option causes class file parser to preserve RuntimeInvisibleAnnotations so VM considers them as RuntimeVisibleAnnotations. > For class retransformation JvmtiClassFileReconstituter restores all annotations as RuntimeVisibleAnnotations attributes. > This can cause problem is the class contains only RuntimeInvisibleAnnotations, so corresponding RuntimeVisibleAnnotations attribute name is not present in the class constant pool. > > Correct solution would be to store additional information about RuntimeInvisibleAnnotations and restore them exactly as they were in the original class (this should be done for all annotations: RuntimeInvisibleAnnotations/RuntimeInvisibleTypeAnnotations for class, fields and records, RuntimeInvisibleAnnotations/RuntimeInvisibleTypeAnnotations/RuntimeInvisibleParameterAnnotations for methods; need to ensure the information is correctly updated during class redefinition & retransformation). 
>
> I think it doesn't make sense to add all the complexity for almost no value (I doubt anyone uses PreserveAllAnnotations, the flag looks like experimental, we don't have any tests for it).
>
> The suggested fix adds workaround for this corner case - if "visible" attribute name is not in the CP, the annotations are restored with "invisible" attribute name.
>
> Testing:
> - tier1,tier2,hs-tier5-svc
> - all java/lang/instrument tests;
> - all RedefineClasses/RetransformClasses tests

At one point long ago, I was trying to understand why we have PreserveAllAnnotations and couldn't come up with a reason. For a class file reconstitutor, restoring the invisible annotations to the classfile and then feeding it back to the JVM should have no effect, since the VM doesn't do anything with these annotations. I see now why you get the original assert. I think this looks like a reasonable workaround for this problem. I wonder if we can deprecate PreserveAllAnnotations. Wonder what it's for? It would need a CSR because it's a product flag.

-------------

Marked as reviewed by coleenp (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/18540#pullrequestreview-1974754738

From duke at openjdk.org Tue Apr 2 19:19:59 2024
From: duke at openjdk.org (Volodymyr Paprotski)
Date: Tue, 2 Apr 2024 19:19:59 GMT
Subject: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v2]
In-Reply-To:
References:
Message-ID:

> Performance. Before:
>
> Benchmark                    (algorithm)      (dataSize)  (keyLength)  (provider)  Mode   Cnt  Score     Error      Units
> SignatureBench.ECDSA.sign    SHA256withECDSA  1024        256                      thrpt  3    6443.934  ± 6.491    ops/s
> SignatureBench.ECDSA.sign    SHA256withECDSA  16384       256                      thrpt  3    6152.979  ± 4.954    ops/s
> SignatureBench.ECDSA.verify  SHA256withECDSA  1024        256                      thrpt  3    1895.410  ± 36.979   ops/s
> SignatureBench.ECDSA.verify  SHA256withECDSA  16384       256                      thrpt  3    1878.955  ± 45.487   ops/s
>
> Benchmark                                            (algorithm)  (keyLength)  (kpgAlgorithm)  (provider)  Mode   Cnt  Score     Error     Units
> o.o.b.j.c.full.KeyAgreementBench.EC.generateSecret   ECDH         256          EC                          thrpt  3    1357.810  ± 26.584  ops/s
> o.o.b.j.c.small.KeyAgreementBench.EC.generateSecret  ECDH         256          EC                          thrpt  3    1352.119  ± 23.547  ops/s
>
> Benchmark                          (isMontBench)  Mode   Cnt  Score     Error     Units
> PolynomialP256Bench.benchMultiply  false          thrpt  3    1746.126  ± 10.970  ops/s
>
> Performance, no intrinsic:
>
> Benchmark                    (algorithm)      (dataSize)  (keyLength)  (provider)  Mode   Cnt  Score     Error      Units
> SignatureBench.ECDSA.sign    SHA256withECDSA  1024        256                      thrpt  3    6529.839  ± 42.420   ops/s
> SignatureBench.ECDSA.sign    SHA256withECDSA  16384       256                      thrpt  3    6199.747  ± 133.566  ops/s
> SignatureBench.ECDSA.verify  SHA256withECDSA  1024        256                      thrpt  3    1973.676  ± 54.071   ops/s
> SignatureBench.ECDSA.verify  SHA256withECDSA  16384       256                      thrpt  3    1932.127  ± 35.920   ops/s
>
> Benchmark                                            (algorithm)  (keyLength)  (kpgAlgorithm)  (provider)  Mode   Cnt  Score     Error     Units
> o.o.b.j.c.full.KeyAgreementBench.EC.generateSecret   ECDH         256          EC                          thrpt  3    1355.788  ± 29.858  ops/s
> o.o.b.j.c.small.KeyAgreementBench.EC.generateSecret  ECDH         256          EC                          thrpt  3    1346.523  ± 28.722  ops/s
>
> Benchmark                          (isMontBench)  Mode   Cnt  Score     Error     Units
> PolynomialP256Bench.benchMultiply  true           thrpt  3    1919.574  ± 10.591  ops/s
>
> Performance, **with intrinsics*...
Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision: remove use of jdk.crypto.ec ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18583/files - new: https://git.openjdk.org/jdk/pull/18583/files/dbe6cd3b..82b6dae7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18583&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18583&range=00-01 Stats: 6 lines in 2 files changed: 0 ins; 1 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/18583.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18583/head:pull/18583 PR: https://git.openjdk.org/jdk/pull/18583 From duke at openjdk.org Tue Apr 2 19:20:00 2024 From: duke at openjdk.org (Volodymyr Paprotski) Date: Tue, 2 Apr 2024 19:20:00 GMT Subject: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v2] In-Reply-To: References: Message-ID: On Tue, 2 Apr 2024 16:29:07 GMT, Alan Bateman wrote: >> Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision: >> >> remove use of jdk.crypto.ec > > src/java.base/share/classes/module-info.java line 265: > >> 263: jdk.jfr, >> 264: jdk.unsupported, >> 265: jdk.crypto.ec; > > jdk.crypto.ec has been hollowed out since JDK 22, the sun.security.ec are in java.base. So I don't think you need this qualified export. Thanks, fixed. (Started this when `jdk.crypto.ec` was still in use.. 
missed a few spots during rebase I guess) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18583#discussion_r1548460157 From coleenp at openjdk.org Tue Apr 2 19:29:09 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 2 Apr 2024 19:29:09 GMT Subject: RFR: 8236736: Change notproduct JVM flags to develop flags [v3] In-Reply-To: <-Di8CpUXEjVtG3uu2r_djrEuUDpnQF833TWGkDHmZM4=.176b6bd0-e84d-40f1-999e-4fe41c750d8a@github.com> References: <25c1XDQrzxvG0AuxlRjQyznnTdLzD1-J4kebuXzj-Zc=.0f5d28e1-7672-40a9-97fe-04a77fda65d9@github.com> <-Di8CpUXEjVtG3uu2r_djrEuUDpnQF833TWGkDHmZM4=.176b6bd0-e84d-40f1-999e-4fe41c750d8a@github.com> Message-ID: On Tue, 2 Apr 2024 17:58:16 GMT, Kim Barrett wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix a couple issues pointed out by Stefank. > > test/hotspot/jtreg/runtime/CommandLine/VMOptionWarning.java line 64: > >> 62: output = new OutputAnalyzer(pb.start()); >> 63: output.shouldNotHaveExitValue(0); >> 64: output.shouldContain("Error: VM option 'CheckCompressedOops' is develop and is available only in debug version of VM."); > > Seems like we don't need this test of the develop option CheckCompressedOops at all, since we have > the immediately preceding test of the develop option VerifyStack. You're right, we've already tested this. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18541#discussion_r1548488160 From coleenp at openjdk.org Tue Apr 2 19:47:23 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 2 Apr 2024 19:47:23 GMT Subject: RFR: 8236736: Change notproduct JVM flags to develop flags [v4] In-Reply-To: <25c1XDQrzxvG0AuxlRjQyznnTdLzD1-J4kebuXzj-Zc=.0f5d28e1-7672-40a9-97fe-04a77fda65d9@github.com> References: <25c1XDQrzxvG0AuxlRjQyznnTdLzD1-J4kebuXzj-Zc=.0f5d28e1-7672-40a9-97fe-04a77fda65d9@github.com> Message-ID: > Remove the notproduct distinction for command line options, rather than trying to wrestle the macros to fix the bug that they've been treated as develop options for some time now. This simplifies the command line option macros. > > Tested with tier1-4, tier1 on Oracle platforms. Also built shenandoah. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Remove redundant test case. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18541/files - new: https://git.openjdk.org/jdk/pull/18541/files/00a241d3..3b5002d8 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18541&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18541&range=02-03 Stats: 5 lines in 1 file changed: 0 ins; 5 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18541.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18541/head:pull/18541 PR: https://git.openjdk.org/jdk/pull/18541 From coleenp at openjdk.org Tue Apr 2 19:54:01 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 2 Apr 2024 19:54:01 GMT Subject: RFR: 8236736: Change notproduct JVM flags to develop flags [v4] In-Reply-To: References: <25c1XDQrzxvG0AuxlRjQyznnTdLzD1-J4kebuXzj-Zc=.0f5d28e1-7672-40a9-97fe-04a77fda65d9@github.com> Message-ID: <8dFwjhziEAnv-PG3YM6FuPLbB6HlyO4BApqmuLwc3xo=.cf4ae09a-cf37-4f30-bcad-1f0582de82d1@github.com> On Tue, 2 Apr 2024 19:47:23 GMT, Coleen Phillimore wrote: >> Remove the 
notproduct distinction for command line options, rather than trying to wrestle the macros to fix the bug that they've been treated as develop options for some time now. This simplifies the command line option macros. >> >> Tested with tier1-4, tier1 on Oracle platforms. Also built shenandoah. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Remove redundant test case. Thanks for reviewing, Kim. Is your suggestion to not have a JVMFlag object for develop flags in PRODUCT builds? Presumably to save some footprint? I'm not sure we would win fighting the macros to accomplish this. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18541#issuecomment-2032981431 From simonis at openjdk.org Tue Apr 2 19:55:03 2024 From: simonis at openjdk.org (Volker Simonis) Date: Tue, 2 Apr 2024 19:55:03 GMT Subject: Integrated: 8329421: Native methods can not be selectively printed In-Reply-To: References: Message-ID: On Mon, 1 Apr 2024 19:20:53 GMT, Volker Simonis wrote: > Native methods (i.e. "native wrappers") can not be selectively printed with `-XX:CompileCommand=print,class::method`. Currently the only way to print native methods is to use the global `-XX:+PrintAssembly` option. But this prints *all* compiled methods which can be too much if we're just interested in a specific native wrapper. There's no reason to not apply `-XX:CompileCommand` options correctly to native methods as well. This pull request has now been integrated. 
Changeset: 3057dded Author: Volker Simonis URL: https://git.openjdk.org/jdk/commit/3057dded4878b0110bc2c09b52019570a0a31c9f Stats: 103 lines in 2 files changed: 80 ins; 13 del; 10 mod 8329421: Native methods can not be selectively printed Reviewed-by: kvn ------------- PR: https://git.openjdk.org/jdk/pull/18567 From kbarrett at openjdk.org Tue Apr 2 20:14:10 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 2 Apr 2024 20:14:10 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: References: Message-ID: On Tue, 2 Apr 2024 17:01:07 GMT, Kim Barrett wrote: >> https://bugs.openjdk.org/browse/JDK-8329546 - I can take this if nobody else grabs it soon. > > I'm waiting for a bunch of tests to complete, so decided to just take that issue. https://github.com/openjdk/jdk/pull/18586 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1548549705 From kbarrett at openjdk.org Tue Apr 2 20:14:10 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 2 Apr 2024 20:14:10 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: References: Message-ID: On Tue, 2 Apr 2024 11:35:44 GMT, Joachim Kern wrote: >> linux macos and now Aix use this file. > > Who is able to explain if > `#if defined(LINUX) || defined(_ALLBSD_SOURCE) || defined(_AIX)` > in this file is equivalent to > `#if 1` See my other comments and https://bugs.openjdk.org/browse/JDK-8329546 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1548550923 From kbarrett at openjdk.org Tue Apr 2 20:14:22 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 2 Apr 2024 20:14:22 GMT Subject: RFR: 8329546: Assume sized integral types are available Message-ID: Please review this change that cleans up the inclusion of and when using gcc/clang as the compiler. 
Testing: mach5 tier1 ------------- Commit messages: - include std headers unconditionally and cleanup Changes: https://git.openjdk.org/jdk/pull/18586/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18586&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8329546 Stats: 26 lines in 1 file changed: 2 ins; 23 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18586.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18586/head:pull/18586 PR: https://git.openjdk.org/jdk/pull/18586 From kbarrett at openjdk.org Tue Apr 2 20:21:11 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 2 Apr 2024 20:21:11 GMT Subject: RFR: 8236736: Change notproduct JVM flags to develop flags [v4] In-Reply-To: <8dFwjhziEAnv-PG3YM6FuPLbB6HlyO4BApqmuLwc3xo=.cf4ae09a-cf37-4f30-bcad-1f0582de82d1@github.com> References: <25c1XDQrzxvG0AuxlRjQyznnTdLzD1-J4kebuXzj-Zc=.0f5d28e1-7672-40a9-97fe-04a77fda65d9@github.com> <8dFwjhziEAnv-PG3YM6FuPLbB6HlyO4BApqmuLwc3xo=.cf4ae09a-cf37-4f30-bcad-1f0582de82d1@github.com> Message-ID: On Tue, 2 Apr 2024 19:51:03 GMT, Coleen Phillimore wrote: > Thanks for reviewing, Kim. Is your suggestion to not have a JVMFlag object for develop flags in PRODUCT builds? Presumably to save some footprint? I'm not sure we would win fighting the macros to accomplish this. Yes, that's the suggestion and the rationale for it. It should also remove the need for is_constant_in_binary. I don't know how hard it would actually be to accomplish this. I agree it might not be worth the effort, but we won't know until someone looks, which I haven't done. It might even be easy. 
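
The idea floated in this exchange (no flag object for develop flags in product builds) could look roughly like the following minimal sketch. It is only an illustration under stated assumptions: the X-macro list `SKETCH_FLAGS`, the `SketchFlag` struct, the two flag names, and `sketch_has_flag` are all hypothetical and are not the actual HotSpot flag macros or the JVMFlag class.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical flag list in X-macro style. Names and defaults are made up.
#define SKETCH_FLAGS(develop, product)      \
  develop(bool, SketchVerifyStack, false)   \
  product(bool, SketchUseFeature, true)

// Flag storage: product flags are always variables. In this sketch,
// develop flags become compile-time constants when PRODUCT is defined.
#define SKETCH_DECLARE_PRODUCT(type, name, value) type name = value;
#ifdef PRODUCT
#define SKETCH_DECLARE_DEVELOP(type, name, value) const type name = value;
#else
#define SKETCH_DECLARE_DEVELOP(type, name, value) type name = value;
#endif
SKETCH_FLAGS(SKETCH_DECLARE_DEVELOP, SKETCH_DECLARE_PRODUCT)

// Flag table: in this sketch a develop flag gets a table entry only
// when PRODUCT is not defined.
struct SketchFlag { const char* name; };

#define SKETCH_ENTRY_PRODUCT(type, name, value) { #name },
#ifdef PRODUCT
#define SKETCH_ENTRY_DEVELOP(type, name, value)
#else
#define SKETCH_ENTRY_DEVELOP(type, name, value) { #name },
#endif

static const SketchFlag sketch_flag_table[] = {
  SKETCH_FLAGS(SKETCH_ENTRY_DEVELOP, SKETCH_ENTRY_PRODUCT)
};

constexpr std::size_t sketch_flag_count =
    sizeof(sketch_flag_table) / sizeof(sketch_flag_table[0]);

// Returns true if the table has an entry with the given name.
bool sketch_has_flag(const char* name) {
  for (std::size_t i = 0; i < sketch_flag_count; i++) {
    if (std::strcmp(sketch_flag_table[i].name, name) == 0) {
      return true;
    }
  }
  return false;
}
```

Compiled without PRODUCT, the table in this sketch holds both flags; compiled with `-DPRODUCT`, only the product flag has an entry. A real change would still have to decide how a product VM reports a develop flag named on the command line, since a table built this way no longer knows the name.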
------------- PR Comment: https://git.openjdk.org/jdk/pull/18541#issuecomment-2033020865 From iklam at openjdk.org Tue Apr 2 20:35:10 2024 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 2 Apr 2024 20:35:10 GMT Subject: RFR: 8329546: Assume sized integral types are available In-Reply-To: References: Message-ID: On Tue, 2 Apr 2024 20:09:51 GMT, Kim Barrett wrote: > Please review this change that cleans up the inclusion of and > when using gcc/clang as the compiler. > > Testing: mach5 tier1 Looks reasonable. ------------- Marked as reviewed by iklam (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18586#pullrequestreview-1974968517 From sspitsyn at openjdk.org Tue Apr 2 20:38:08 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 2 Apr 2024 20:38:08 GMT Subject: RFR: JDK-8328137: PreserveAllAnnotations can cause failure of class retransformation In-Reply-To: References: Message-ID: <42DQj6IN0P_snFDH2haNpi-IxQVlx1tkL-i4IRIsVi0=.acb160f2-d26e-4c3c-a46a-620dd4a8c728@github.com> On Thu, 28 Mar 2024 22:12:49 GMT, Alex Menkov wrote: > PreserveAllAnnotations option causes class file parser to preserve RuntimeInvisibleAnnotations so VM considers them as RuntimeVisibleAnnotations. > For class retransformation JvmtiClassFileReconstituter restores all annotations as RuntimeVisibleAnnotations attributes. > This can cause problem is the class contains only RuntimeInvisibleAnnotations, so corresponding RuntimeVisibleAnnotations attribute name is not present in the class constant pool. 
> > Correct solution would be to store additional information about RuntimeInvisibleAnnotations and restore them exactly as they were in the original class (this should be done for all annotations: RuntimeInvisibleAnnotations/RuntimeInvisibleTypeAnnotations for class, fields and records, RuntimeInvisibleAnnotations/RuntimeInvisibleTypeAnnotations/RuntimeInvisibleParameterAnnotations for methods; need to ensure the information is correctly updated during class redefinition & retransformation). > > I think it doesn't make sense to add all the complexity for almost no value (I doubt anyone uses PreserveAllAnnotations, the flag looks like experimental, we don't have any tests for it). > > The suggested fix adds workaround for this corner case - if "visible" attribute name is not in the CP, the annotations are restored with "invisible" attribute name. > > Testing: > - tier1,tier2,hs-tier5-svc > - all java/lang/instrument tests; > - all RedefineClasses/RetransformClasses tests Looks good. Sorry for delay. I thought I've already approved it. :( ------------- Marked as reviewed by sspitsyn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18540#pullrequestreview-1974973529 From iklam at openjdk.org Tue Apr 2 22:08:10 2024 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 2 Apr 2024 22:08:10 GMT Subject: RFR: 8236736: Change notproduct JVM flags to develop flags [v4] In-Reply-To: References: <25c1XDQrzxvG0AuxlRjQyznnTdLzD1-J4kebuXzj-Zc=.0f5d28e1-7672-40a9-97fe-04a77fda65d9@github.com> <8dFwjhziEAnv-PG3YM6FuPLbB6HlyO4BApqmuLwc3xo=.cf4ae09a-cf37-4f30-bcad-1f0582de82d1@github.com> Message-ID: On Tue, 2 Apr 2024 20:18:34 GMT, Kim Barrett wrote: > > Thanks for reviewing, Kim. Is your suggestion to not have a JVMFlag object for develop flags in PRODUCT builds? Presumably to save some footprint? I'm not sure we would win fighting the macros to accomplish this. > > Yes, that's the suggestion and the rationale for it. It should also remove the need for is_constant_in_binary. 
I don't know how hard it would actually be to accomplish this. I agree it might not be worth the effort, but we won't know until someone looks, which I haven't done. It might even be easy.

Currently the VM prints an error message for non-product flags, so we need to keep some information about them. We can probably skip the type information, etc, to save a little space, but the space saving would be minimal.

$ java -XX:+LoomDeoptAfterThaw --version
Error: VM option 'LoomDeoptAfterThaw' is develop and is available only in debug version of VM.
Improperly specified VM option 'LoomDeoptAfterThaw'
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18541#issuecomment-2033183718

From pchilanomate at openjdk.org Tue Apr 2 22:24:08 2024
From: pchilanomate at openjdk.org (Patricio Chilano Mateo)
Date: Tue, 2 Apr 2024 22:24:08 GMT
Subject: RFR: 8329491: GetThreadListStackTraces function should use JvmtiHandshake
In-Reply-To: <56L6f8XFyrB_cUSPTLWNIVhO0PU4w3PjRnpA5U7y_aI=.906bf099-af40-4192-a205-f84120e99ec8@github.com>
References: <56L6f8XFyrB_cUSPTLWNIVhO0PU4w3PjRnpA5U7y_aI=.906bf099-af40-4192-a205-f84120e99ec8@github.com>
Message-ID:

On Tue, 2 Apr 2024 08:13:20 GMT, Serguei Spitsyn wrote:

> The internal JVM TI `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. This enhancement is to refactor the JVM TI function `GetThreadListStackTraces` on the base of `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes.
>
> Testing:
> - Ran mach5 tiers 1-6

Looks good to me.
src/hotspot/share/prims/jvmtiEnvBase.cpp line 1986: > 1984: jvmtiError err = JvmtiEnvBase::get_threadOop_and_JavaThread(tlh.list(), target, &java_thread, &thread_obj); > 1985: if (err != JVMTI_ERROR_NONE) { > 1986: printf("DBG: JvmtiHandshake::execute: err: %d\n", (int)err); fflush(0); Any reason why not use UL instead with jvmti tag? ------------- Marked as reviewed by pchilanomate (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18574#pullrequestreview-1975169550 PR Review Comment: https://git.openjdk.org/jdk/pull/18574#discussion_r1548686478 From pchilanomate at openjdk.org Tue Apr 2 22:48:03 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 2 Apr 2024 22:48:03 GMT Subject: RFR: 8329432: PopFrame and ForceEarlyReturn functions should use JvmtiHandshake In-Reply-To: <5tcPHZX0nNTHbQqZfHRl2riTpJglQyGJ2hRJXyIMZPY=.4de7ac6d-dd84-4943-bab1-5dba67bf5cf0@github.com> References: <5tcPHZX0nNTHbQqZfHRl2riTpJglQyGJ2hRJXyIMZPY=.4de7ac6d-dd84-4943-bab1-5dba67bf5cf0@github.com> Message-ID: On Tue, 2 Apr 2024 00:22:28 GMT, Serguei Spitsyn wrote: > The internal JVM TI `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. This enhancement is to refactor JVM TI functions `PopFrame` and `ForceEarlyReturn` on the base of `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes. > > Testing: > > Ran mach5 tiers 1-6 Looks good to me. ------------- Marked as reviewed by pchilanomate (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18570#pullrequestreview-1975197001

From duke at openjdk.org Tue Apr 2 23:12:34 2024
From: duke at openjdk.org (Volodymyr Paprotski)
Date: Tue, 2 Apr 2024 23:12:34 GMT
Subject: RFR: 8320794: Emulate rest of vblendvp[sd] on ECore [v3]
In-Reply-To: <8ajDeYtrlyZUXnTl29xwLr1rwGIYzjj5wThm9yjrBVY=.c75c1992-c836-4969-aea4-e3cbf428dfad@github.com>
References: <8ajDeYtrlyZUXnTl29xwLr1rwGIYzjj5wThm9yjrBVY=.c75c1992-c836-4969-aea4-e3cbf428dfad@github.com>
Message-ID:

> Replace vpblendvp[sd] with macro assembler call and test in:
> - `C2_MacroAssembler::vector_cast_float_to_int_special_cases_avx` (insufficient registers for 1 of 2 blends)
> - `C2_MacroAssembler::vector_cast_double_to_int_special_cases_avx`
> - `C2_MacroAssembler::vector_count_leading_zeros_int_avx`
>
> Functional testing with existing and new tests:
> `make test TEST="test/hotspot/jtreg/compiler/vectorapi/reshape test/hotspot/jtreg/compiler/vectorization/runner/BasicIntOpTest.java"`
>
> Benchmarking with existing and new tests:
>
> make test TEST="micro:org.openjdk.bench.jdk.incubator.vector.VectorFPtoIntCastOperations.microFloat256ToInteger256"
> make test TEST="micro:org.openjdk.bench.jdk.incubator.vector.VectorFPtoIntCastOperations.microDouble256ToInteger256"
> make test TEST="micro:org.openjdk.bench.vm.compiler.VectorBitCount.WithSuperword.intLeadingZeroCount"
>
>
> Performance before:
>
> Benchmark                                               (SIZE)   Mode  Cnt      Score     Error   Units
> VectorFPtoIntCastOperations.microDouble256ToInteger256     512  thrpt    5  17271.078 ± 184.140  ops/ms
> VectorFPtoIntCastOperations.microDouble256ToInteger256    1024  thrpt    5   9310.507 ±  88.136  ops/ms
> VectorFPtoIntCastOperations.microFloat256ToInteger256      512  thrpt    5  11137.594 ±  19.009  ops/ms
> VectorFPtoIntCastOperations.microFloat256ToInteger256     1024  thrpt    5   5425.001 ±   3.136  ops/ms
> VectorBitCount.WithSuperword.intLeadingZeroCount        1024 0  thrpt    4      0.994 ±   0.002  ops/us
>
>
> Performance after:
>
> Benchmark                                               (SIZE)   Mode  Cnt      Score     Error   Units
> VectorFPtoIntCastOperations.microDouble256ToInteger256     512  thrpt    5  19222.048 ±  87.622  ops/ms
> VectorFPtoIntCastOperations.microDouble256ToInteger256    1024  thrpt    5   9233.245 ± 123.493  ops/ms
> VectorFPtoIntCastOperations.microFloat256ToInteger256      512  thrpt    5  11672.806 ±  10.854  ops/ms
> VectorFPtoIntCastOperations.microFloat256ToInteger256     1024  thrpt    5   6009.735 ±  12.173  ops/ms
> VectorBitCount.WithSuperword.intLeadingZeroCount        1024 0  thrpt    4      1.039 ±   0.004  ops/us

Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision:

  Allow scratch to overlap with src1|src2

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/18310/files
  - new: https://git.openjdk.org/jdk/pull/18310/files/9430d88e..1705a6aa

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=18310&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18310&range=01-02

  Stats: 25 lines in 2 files changed: 14 ins; 0 del; 11 mod
  Patch: https://git.openjdk.org/jdk/pull/18310.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18310/head:pull/18310

PR: https://git.openjdk.org/jdk/pull/18310

From sspitsyn at openjdk.org Tue Apr 2 23:52:33 2024
From: sspitsyn at openjdk.org (Serguei Spitsyn)
Date: Tue, 2 Apr 2024 23:52:33 GMT
Subject: RFR: 8329491: GetThreadListStackTraces function should use JvmtiHandshake [v2]
In-Reply-To: <56L6f8XFyrB_cUSPTLWNIVhO0PU4w3PjRnpA5U7y_aI=.906bf099-af40-4192-a205-f84120e99ec8@github.com>
References: <56L6f8XFyrB_cUSPTLWNIVhO0PU4w3PjRnpA5U7y_aI=.906bf099-af40-4192-a205-f84120e99ec8@github.com>
Message-ID:

> The internal JVM TI `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads.
This enhancement is to refactor the JVM TI function `GetThreadListStackTraces` on the base of `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes. > > Testing: > - Ran mach5 tiers 1-6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: cleanup - removed temporary logging used for debugging ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18574/files - new: https://git.openjdk.org/jdk/pull/18574/files/86cb34e7..8f048d34 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18574&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18574&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18574.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18574/head:pull/18574 PR: https://git.openjdk.org/jdk/pull/18574 From sspitsyn at openjdk.org Wed Apr 3 00:00:12 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 3 Apr 2024 00:00:12 GMT Subject: RFR: 8329491: GetThreadListStackTraces function should use JvmtiHandshake [v2] In-Reply-To: References: <56L6f8XFyrB_cUSPTLWNIVhO0PU4w3PjRnpA5U7y_aI=.906bf099-af40-4192-a205-f84120e99ec8@github.com> Message-ID: On Tue, 2 Apr 2024 22:20:12 GMT, Patricio Chilano Mateo wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> review: cleanup - removed temporary logging used for debugging > > src/hotspot/share/prims/jvmtiEnvBase.cpp line 1986: > >> 1984: jvmtiError err = JvmtiEnvBase::get_threadOop_and_JavaThread(tlh.list(), target, &java_thread, &thread_obj); >> 1985: if (err != JVMTI_ERROR_NONE) { >> 1986: printf("DBG: JvmtiHandshake::execute: err: %d\n", (int)err); fflush(0); > > Any reason why not use UL instead with jvmti tag? Nice catch. The `printf` was temporarily used for debugging. Removed now. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18574#discussion_r1548752419 From sspitsyn at openjdk.org Wed Apr 3 00:00:11 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 3 Apr 2024 00:00:11 GMT Subject: RFR: 8329491: GetThreadListStackTraces function should use JvmtiHandshake [v2] In-Reply-To: References: <56L6f8XFyrB_cUSPTLWNIVhO0PU4w3PjRnpA5U7y_aI=.906bf099-af40-4192-a205-f84120e99ec8@github.com> Message-ID: On Tue, 2 Apr 2024 23:52:33 GMT, Serguei Spitsyn wrote: >> The internal JVM TI `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. This enhancement is to refactor the JVM TI function `GetThreadListStackTraces` on the base of `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes. >> >> Testing: >> - Ran mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: cleanup - removed temporary logging used for debugging Patricio, thank you for prompt review! ------------- PR Comment: https://git.openjdk.org/jdk/pull/18574#issuecomment-2033297704 From amenkov at openjdk.org Wed Apr 3 00:26:13 2024 From: amenkov at openjdk.org (Alex Menkov) Date: Wed, 3 Apr 2024 00:26:13 GMT Subject: RFR: JDK-8328137: PreserveAllAnnotations can cause failure of class retransformation In-Reply-To: References: Message-ID: On Tue, 2 Apr 2024 19:13:15 GMT, Coleen Phillimore wrote: > At one point long ago, I was trying to understand why we have PreserveAllAnnotations and couldn't come up with a reason. For a class file reconstitutor, restoring the invisible annotations to the classfile and then feeding it back to the JVM should have no effect, since the VM doesn't do anything with these annotations. > > I see now why you get the original assert. I think this looks like a reasonable workaround for this problem. 
>
> I wonder if we can deprecate PreserveAllAnnotations. Wonder what it's for? It would need a CSR because it's a product flag.

I also was not able to identify the purpose of the flag.
I found a couple of PRs from 2021 about the flag: #4245 and #4280, with a description of a possible use case for it, but it does not look like a good reason to me.
I'll create a CR to deprecate the flag.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18540#issuecomment-2033319487

From amenkov at openjdk.org Wed Apr 3 00:26:13 2024
From: amenkov at openjdk.org (Alex Menkov)
Date: Wed, 3 Apr 2024 00:26:13 GMT
Subject: Integrated: JDK-8328137: PreserveAllAnnotations can cause failure of class retransformation
In-Reply-To:
References:
Message-ID:

On Thu, 28 Mar 2024 22:12:49 GMT, Alex Menkov wrote:

> PreserveAllAnnotations option causes class file parser to preserve RuntimeInvisibleAnnotations so VM considers them as RuntimeVisibleAnnotations.
> For class retransformation JvmtiClassFileReconstituter restores all annotations as RuntimeVisibleAnnotations attributes.
> This can cause a problem if the class contains only RuntimeInvisibleAnnotations, so the corresponding RuntimeVisibleAnnotations attribute name is not present in the class constant pool.
>
> Correct solution would be to store additional information about RuntimeInvisibleAnnotations and restore them exactly as they were in the original class (this should be done for all annotations: RuntimeInvisibleAnnotations/RuntimeInvisibleTypeAnnotations for class, fields and records, RuntimeInvisibleAnnotations/RuntimeInvisibleTypeAnnotations/RuntimeInvisibleParameterAnnotations for methods; need to ensure the information is correctly updated during class redefinition & retransformation).
>
> I think it doesn't make sense to add all the complexity for almost no value (I doubt anyone uses PreserveAllAnnotations, the flag looks experimental, we don't have any tests for it).
> > The suggested fix adds workaround for this corner case - if "visible" attribute name is not in the CP, the annotations are restored with "invisible" attribute name. > > Testing: > - tier1,tier2,hs-tier5-svc > - all java/lang/instrument tests; > - all RedefineClasses/RetransformClasses tests This pull request has now been integrated. Changeset: f88f31dc Author: Alex Menkov URL: https://git.openjdk.org/jdk/commit/f88f31dcbf80e9a4cd3ba9d34be8b88128af97c6 Stats: 33 lines in 3 files changed: 22 ins; 0 del; 11 mod 8328137: PreserveAllAnnotations can cause failure of class retransformation Reviewed-by: coleenp, sspitsyn ------------- PR: https://git.openjdk.org/jdk/pull/18540 From jwaters at openjdk.org Wed Apr 3 02:31:11 2024 From: jwaters at openjdk.org (Julian Waters) Date: Wed, 3 Apr 2024 02:31:11 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: References: Message-ID: <7xb5qfR2poiQyEhUY5A09POeM2EpYBVRQKNs4XX_nuM=.7ed52b15-8c69-42c2-8625-cd0fe2ada78b@github.com> On Tue, 2 Apr 2024 20:10:12 GMT, Kim Barrett wrote: >> I'm waiting for a bunch of tests to complete, so decided to just take that issue. > > https://github.com/openjdk/jdk/pull/18586 @kimbarrett I've been doing things to permit gcc/Windows, not clang. clang has too many different distributions on Windows for me to settle on one, and generalising all of them to be able to compile with any of the Windows clang distributions seamlessly and without issues sounds like a nightmare :P gcc on the other hand has just 2: MSYS2 MINGW64 with ucrt (Which is the one I'm working on) and standalone gcc Windows builds that link to ucrt I haven't sent a cleanup in this area because I thought my changes were too specific to gcc/Windows, and wouldn't help much with HotSpot in general. 
I've learnt from my mistakes in the past where I caused reviewers pain in reviewing my admittedly selfish changes to HotSpot :( That said, if it is requested of me, I can commit some cleanups to this file. What do you think? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1548833671 From jwaters at openjdk.org Wed Apr 3 02:34:08 2024 From: jwaters at openjdk.org (Julian Waters) Date: Wed, 3 Apr 2024 02:34:08 GMT Subject: RFR: 8329546: Assume sized integral types are available In-Reply-To: References: Message-ID: <7ezYD-RPv4D3MrDtesWljyoJqhwtlfp4OP-fCHrUOEg=.77711ab1-1482-49e0-a352-e18c36da7505@github.com> On Tue, 2 Apr 2024 20:09:51 GMT, Kim Barrett wrote: > Please review this change that cleans up the inclusion of and > when using gcc/clang as the compiler. > > Testing: mach5 tier1 Thanks for this! This helps the Windows/gcc port a lot ------------- Marked as reviewed by jwaters (Committer). PR Review: https://git.openjdk.org/jdk/pull/18586#pullrequestreview-1975375897 From sspitsyn at openjdk.org Wed Apr 3 02:45:13 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 3 Apr 2024 02:45:13 GMT Subject: RFR: 8313332: Simplify lazy jmethodID cache in InstanceKlass In-Reply-To: References: Message-ID: On Fri, 29 Mar 2024 15:25:48 GMT, Coleen Phillimore wrote: > This change simplifies the code that grows the jmethodID cache in InstanceKlass. Instead of lazily, when there's a rare request for a jmethodID for an obsolete method, the jmethodID cache is grown during the RedefineClasses safepoint. The InstanceKlass's jmethodID cache is lazily allocated when there's a jmethodID allocated, so not every InstanceKlass has a cache, but the growth now only happens in a safepoint. This code will become racy with the potential change for deallocating jmethodIDs. > > Tested with tier1-4, vmTestbase/nsk/jvmti java/lang/instrument tests (in case they're not in tier1-4). 
src/hotspot/share/oops/instanceKlass.cpp line 2277:

> 2275: jmethodID InstanceKlass::get_jmethod_id(const methodHandle& method_h) {
> 2276:   Method* method = method_h();
> 2277:   int idnum = method_h->method_idnum();

Nit: Can use `method` instead of `method_h()`.

src/hotspot/share/oops/instanceKlass.cpp line 2335:

> 2333:   jmethodID new_id = Method::make_jmethod_id(class_loader_data(), method);
> 2334:   Atomic::release_store(&jmeths[idnum+1], new_id);
> 2335:   return new_id;

Nit: It feels like the function `InstanceKlass::get_jmethod_id()` can be more simplified with a small restructuring:

    jmethodID update_jmethod_id(jmethodID* jmeths, Method* method, int idnum) {
      // method_with_idnum
      if (method->is_old() && !method->is_obsolete()) {
        // The method passed in is old (but not obsolete), we need to use the current version
        method = method_with_idnum((int)idnum);
        assert(method != nullptr, "old and but not obsolete, so should exist");
      }
      jmethodID new_id = Method::make_jmethod_id(class_loader_data(), method);
      Atomic::release_store(&jmeths[idnum+1], new_id);
      return new_id;
    }

    jmethodID InstanceKlass::get_jmethod_id(const methodHandle& method_h) {
      Method* method = method_h();
      int idnum = method_h->method_idnum();
      jmethodID* jmeths = methods_jmethod_ids_acquire();

      <... big comment ...>
      MutexLocker ml(JmethodIdCreation_lock, Mutex::_no_safepoint_check_flag);
      if (jmeths == nullptr) {
        jmeths = methods_jmethod_ids_acquire();

        if (jmeths == nullptr) { // Still null?
          size_t size = idnum_allocated_count();
          assert(size > (size_t)idnum, "should already have space");
          jmeths = NEW_C_HEAP_ARRAY(jmethodID, size+1, mtClass);
          memset(jmeths, 0, (size+1)*sizeof(jmethodID));
          // cache size is stored in element[0], other elements offset by one
          jmeths[0] = (jmethodID)size;
          jmethodID new_id = update_jmethod_id(jmeths, method, idnum);

          // publish jmeths
          release_set_methods_jmethod_ids(jmeths);
          return new_id;
        }
      }
      jmethodID id = Atomic::load_acquire(&jmeths[idnum+1]);
      if (id == nullptr) {
        id = jmeths[idnum+1];

        if (id == nullptr) { // Still null?
          jmethodID new_id = update_jmethod_id(jmeths, method, idnum);
          return new_id;
        }
      }
      return id;
    }

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18549#discussion_r1548840596
PR Review Comment: https://git.openjdk.org/jdk/pull/18549#discussion_r1548839930

From vlivanov at openjdk.org Wed Apr 3 02:58:00 2024
From: vlivanov at openjdk.org (Vladimir Ivanov)
Date: Wed, 3 Apr 2024 02:58:00 GMT
Subject: RFR: 8329332: Remove CompiledMethod and CodeBlobLayout classes [v2]
In-Reply-To:
References:
Message-ID: <86hdErNCggxb7O-j9AYmcR9IV7M15p1Hnrowo4nDk_U=.1b2b07cd-209b-4181-bc97-58d1a8fac674@github.com>

On Mon, 1 Apr 2024 21:07:31 GMT, Vladimir Kozlov wrote:

>> Revert [JDK-8152664](https://bugs.openjdk.org/browse/JDK-8152664) RFE [changes](https://github.com/openjdk/jdk/commit/b853eb7f5ca24eeeda18acbb14287f706499c365) which was used for AOT [JEP 295](https://openjdk.org/jeps/295) implementation in JDK 9. The code was left in HotSpot assuming it will help in a future. But during work on Leyden we decided to not use it. In Leyden cached compiled code will be restored in CodeCache as normal nmethods: no need to change VM's runtime and GC code to process them.
>>
>> I may work on optimizing `CodeBlob` and `nmethod` fields layout to reduce header size in separate changes. In these changes I did simple fields reordering to keep small (1 byte) fields together.
>> >> I do not see (and not expected) performance difference with these changes. >> >> Tested tier1-5, xcomp, stress. Running performance testing. >> >> I need help with testing on platforms which Oracle does not support. > > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Removed not_used state of nmethod Nice cleanup! Overall, looks very good. What about `CompiledMethod_lock`? There's no `CompiledMethod` anymore, but the lock name still refers to it. ------------- Marked as reviewed by vlivanov (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18554#pullrequestreview-1975392018 From kbarrett at openjdk.org Wed Apr 3 03:39:10 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 3 Apr 2024 03:39:10 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: <7xb5qfR2poiQyEhUY5A09POeM2EpYBVRQKNs4XX_nuM=.7ed52b15-8c69-42c2-8625-cd0fe2ada78b@github.com> References: <7xb5qfR2poiQyEhUY5A09POeM2EpYBVRQKNs4XX_nuM=.7ed52b15-8c69-42c2-8625-cd0fe2ada78b@github.com> Message-ID: <_rlVcRu9Omy5cqIe8UveAchgM8o5XT6rU2zS8ruUouE=.b944dc89-33ce-462e-89d4-880a1d267dba@github.com> On Wed, 3 Apr 2024 02:28:08 GMT, Julian Waters wrote: >> https://github.com/openjdk/jdk/pull/18586 > > @kimbarrett I've been doing things to permit gcc/Windows, not clang. clang has too many different distributions on Windows for me to settle on one, and generalising all of them to be able to compile with any of the Windows clang distributions seamlessly and without issues sounds like a nightmare :P gcc on the other hand has just 2: MSYS2 MINGW64 with ucrt (Which is the one I'm working on) and standalone gcc Windows builds that link to ucrt > > I haven't sent a cleanup in this area because I thought my changes were too specific to gcc/Windows, and wouldn't help much with HotSpot in general. 
I've learnt from my mistakes in the past where I caused reviewers pain in reviewing my admittedly selfish changes to HotSpot :( > That said, if it is requested of me, I can commit some cleanups to this file. What do you think? @TheShermanTanker It depends on the details, of course. I think there are lots of possible cleanups in this vicinity that have little or nothing to do with gcc/Windows specifically, though might help there too. And yeah, I misremembered that it was not clang/Windows but rather gcc/Windows you were working on. But the same points largely apply here. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1548868243 From kbarrett at openjdk.org Wed Apr 3 03:42:08 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 3 Apr 2024 03:42:08 GMT Subject: RFR: 8329546: Assume sized integral types are available In-Reply-To: References: Message-ID: On Tue, 2 Apr 2024 20:09:51 GMT, Kim Barrett wrote: > Please review this change that cleans up the inclusion of and > when using gcc/clang as the compiler. > > Testing: mach5 tier1 For what it's worth, I think once globalDefinitions_xlc.hpp is gone (https://github.com/openjdk/jdk/pull/18536) that there is some more tidying up to be done. I didn't want to go so far as touching that file when it's about to vanish, and also didn't want to make changes where it was left inconsistent. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18586#issuecomment-2033471223 From thartmann at openjdk.org Wed Apr 3 05:24:11 2024 From: thartmann at openjdk.org (Tobias Hartmann) Date: Wed, 3 Apr 2024 05:24:11 GMT Subject: RFR: 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments [v9] In-Reply-To: References: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> Message-ID: On Thu, 7 Mar 2024 10:58:19 GMT, Boris Ulasevich wrote: >> These changes clean up the logic and the code of allocating codecache segments and add more testing of it, to open a door for further optimization of code cache segmentation. The goal was to keep the behavior as close to the existing behavior as possible, even if it's not quite logical. >> >> Also, these changes better account for alignment - PrintFlagsFinal shows the final aligned segment sizes, and the segments fill the ReservedCodeCacheSize without gaps caused by alignment. > > Boris Ulasevich has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains ten commits: > > - minor updates according to review comments > - minor update: align_up for ReservedCodeCacheSize > - one another cleanup round > - minor update. removed helper function as it caused many comments in the review > - set_size_of_unset_code_heap > - minor update > - apply suggestions > - cleanup & test udpdate > - 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments Looks good to me otherwise. I'll run some testing and report back once it passed. 
src/hotspot/share/code/codeCache.cpp line 180: > 178: GrowableArray* CodeCache::_allocable_heaps = new(mtCode) GrowableArray (static_cast(CodeBlobType::All), mtCode); > 179: > 180: static void check_min_size(const char *codeheap, size_t size, size_t required_size) { Suggestion: static void check_min_size(const char* codeheap, size_t size, size_t required_size) { src/hotspot/share/code/codeCache.cpp line 183: > 181: if (size < required_size) { > 182: log_debug(codecache)("Code heap (%s) size " SIZE_FORMAT " below required minimal size " SIZE_FORMAT, > 183: codeheap, size, required_size); Should `size` and `required_size` be printed in `K`? src/hotspot/share/code/codeCache.cpp line 198: > 196: static void set_size_of_unset_code_heap(CodeHeapInfo* heap, size_t available_size, size_t used_size, size_t min_size) { > 197: assert(!heap->set, "sanity"); > 198: heap->size = (available_size > used_size + min_size) ? (available_size - used_size) : min_size; Suggestion: heap->size = (available_size > (used_size + min_size)) ? (available_size - used_size) : min_size; src/hotspot/share/code/codeCache.cpp line 256: > 254: if (total != cache_size && !cache_size_set) { > 255: log_info(codecache)("ReservedCodeCache size %lld changed to total segments size NonNMethod %lld NonProfiled %lld Profiled %lld = %lld", > 256: (long long) cache_size, (long long) non_nmethod.size, (long long) non_profiled.size, Any reason you are using `%lld` here and below instead of `SIZE_FORMAT`? src/hotspot/share/memory/virtualspace.cpp line 326: > 324: ReservedSpace ReservedSpace::partition(size_t offset, size_t partition_size, size_t alignment) { > 325: assert(offset + partition_size <= size(), "partition failed"); > 326: ReservedSpace result(base()+offset, partition_size, alignment, page_size(), special(), executable()); Suggestion: ReservedSpace result(base() + offset, partition_size, alignment, page_size(), special(), executable()); ------------- Changes requested by thartmann (Reviewer). 
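The sizing rule quoted in the review comment on `set_size_of_unset_code_heap` can be tried out in isolation. The following is only a minimal standalone sketch, not the HotSpot source: `CodeHeapInfo` is reduced here to two fields as an assumption, and `std::size_t` stands in for the VM's own types. It shows that the extra parentheses suggested in the review do not change the result, and that the heap falls back to `min_size` when the remaining space is too small.

```cpp
#include <cassert>
#include <cstddef>

// Simplified, hypothetical stand-in for the VM's CodeHeapInfo;
// only the two fields used by the quoted lines are modeled.
struct CodeHeapInfo {
  std::size_t size;
  bool set;
};

// Give an unset heap whatever is left of the available space,
// but never less than the required minimum.
inline void set_size_of_unset_code_heap(CodeHeapInfo* heap,
                                        std::size_t available_size,
                                        std::size_t used_size,
                                        std::size_t min_size) {
  assert(!heap->set && "sanity");
  // The comparison guards the subtraction against unsigned underflow.
  heap->size = (available_size > (used_size + min_size))
                   ? (available_size - used_size)
                   : min_size;
}
```

With 100 units available and 30 already used, a heap with a minimum of 10 gets the remaining 70; with only 35 available it gets the minimum of 10.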
PR Review: https://git.openjdk.org/jdk/pull/17244#pullrequestreview-1975505901
PR Review Comment: https://git.openjdk.org/jdk/pull/17244#discussion_r1548916881
PR Review Comment: https://git.openjdk.org/jdk/pull/17244#discussion_r1548917691
PR Review Comment: https://git.openjdk.org/jdk/pull/17244#discussion_r1548919062
PR Review Comment: https://git.openjdk.org/jdk/pull/17244#discussion_r1548916641
PR Review Comment: https://git.openjdk.org/jdk/pull/17244#discussion_r1548923421

From duke at openjdk.org Wed Apr 3 07:07:15 2024
From: duke at openjdk.org (Tomáš Zezula)
Date: Wed, 3 Apr 2024 07:07:15 GMT
Subject: RFR: JDK-8329564: [JVMCI] TranslatedException::debugPrintStackTrace does not work in the libjvmci compiler.
Message-ID:

Problem:
The debugging stack traces in `jdk.internal.vm.TranslatedException` do not work in the libjvmci compiler because they are enabled using the system property `jdk.internal.vm.TranslatedException.debug`. However, the libjvmci compiler does not copy HotSpot VM system properties. Instead, the HotSpot system properties are copied only into properties returned by `Services::getSavedProperties()`.

Fix:
A debug boolean flag is passed to the `VMSupport::decodeAndThrowThrowable()` method.

-------------

Commit messages:
 - JDK-8329564: jdk.internal.vm.TranslatedException::debugPrintStackTrace does not work in the libjvmci compiler.
Changes: https://git.openjdk.org/jdk/pull/18591/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18591&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8329564
  Stats: 25 lines in 4 files changed: 8 ins; 1 del; 16 mod
  Patch: https://git.openjdk.org/jdk/pull/18591.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18591/head:pull/18591

PR: https://git.openjdk.org/jdk/pull/18591

From stefank at openjdk.org Wed Apr 3 09:31:24 2024
From: stefank at openjdk.org (Stefan Karlsson)
Date: Wed, 3 Apr 2024 09:31:24 GMT
Subject: RFR: 8328698: oopDesc::klass_raw() decodes without a null check
Message-ID:

The oopDesc::klass_raw() function is used when the caller wants to skip asserts. Unfortunately, it skips the check to see if the narrow klass is zero, which could lead to an incorrect Klass* being returned. This patch fixes this.

In addition to this, I'm trying to make the code a bit clearer, so the patch also contains changes for the following:

* The word raw has various different meanings in the context of oops and klasses. So, what does it mean in this context? Does it mean "read the klass pointer value without decoding it"? Or does it mean "decode the klass pointer value without any asserts"? I would like to propose that we use a name that describes that this function is used to skip performing various asserts.

* I replaced the one usage of load_klass_raw with a call to klass_raw() instead.

* I restructured the `is_oop_safe` so that we perform the null-check first. Note that `oopDesc::is_oop` performs its own verification of the klass pointer, so if we want extra klass verification in `is_oop_safe` we need to do it before calling the `is_oop` check.

* I also renamed the _raw functions inside the CompressedKlassPointers klass and moved private functions.

Tell me if you think some of these should be split up into separate RFEs.

Tested with tier1-3.
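To make the null-check concern in the 8328698 description concrete, here is a minimal standalone sketch of base-plus-shift pointer decoding. It is not HotSpot code: `NarrowKlass`, `Klass`, `decode_raw_not_null` and `decode_raw_null_safe` are hypothetical names chosen for illustration, and the encoding is assumed to be a plain `base + (value << shift)` scheme. The point is that decoding a zero narrow value without a check yields the base address, which is neither null nor a valid `Klass*`.

```cpp
#include <cstdint>

// Hypothetical stand-ins for a compressed class pointer and its target.
using NarrowKlass = std::uint32_t;
struct Klass { int id; };

// Decodes unconditionally. For v == 0 this returns `base` itself,
// which looks like a pointer but does not refer to any Klass.
inline Klass* decode_raw_not_null(NarrowKlass v, std::uintptr_t base, int shift) {
  return reinterpret_cast<Klass*>(base + (static_cast<std::uintptr_t>(v) << shift));
}

// Null-preserving variant: a zero narrow value maps to nullptr.
inline Klass* decode_raw_null_safe(NarrowKlass v, std::uintptr_t base, int shift) {
  return (v == 0) ? nullptr : decode_raw_not_null(v, base, shift);
}
```

A caller that may see a zero narrow value, as in the safe-check scenario described above, would use the null-preserving variant and test the result before dereferencing it.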
-------------

Commit messages:
 - 8328698: oopDesc::klass_raw() decodes without a null check

Changes: https://git.openjdk.org/jdk/pull/18597/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18597&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8328698
  Stats: 93 lines in 10 files changed: 42 ins; 34 del; 17 mod
  Patch: https://git.openjdk.org/jdk/pull/18597.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18597/head:pull/18597

PR: https://git.openjdk.org/jdk/pull/18597

From ihse at openjdk.org Wed Apr 3 09:49:01 2024
From: ihse at openjdk.org (Magnus Ihse Bursie)
Date: Wed, 3 Apr 2024 09:49:01 GMT
Subject: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v2]
In-Reply-To:
References:
Message-ID:

On Tue, 2 Apr 2024 19:19:59 GMT, Volodymyr Paprotski wrote:

>> Performance. Before:
>>
>> Benchmark                    (algorithm)      (dataSize)  (keyLength)  (provider)   Mode  Cnt     Score    Error  Units
>> SignatureBench.ECDSA.sign    SHA256withECDSA        1024          256              thrpt    3  6443.934 ±  6.491  ops/s
>> SignatureBench.ECDSA.sign    SHA256withECDSA       16384          256              thrpt    3  6152.979 ±  4.954  ops/s
>> SignatureBench.ECDSA.verify  SHA256withECDSA        1024          256              thrpt    3  1895.410 ± 36.979  ops/s
>> SignatureBench.ECDSA.verify  SHA256withECDSA       16384          256              thrpt    3  1878.955 ± 45.487  ops/s
>> Benchmark                                            (algorithm)  (keyLength)  (kpgAlgorithm)  (provider)   Mode  Cnt     Score    Error  Units
>> o.o.b.j.c.full.KeyAgreementBench.EC.generateSecret          ECDH          256              EC              thrpt    3  1357.810 ± 26.584  ops/s
>> o.o.b.j.c.small.KeyAgreementBench.EC.generateSecret         ECDH          256              EC              thrpt    3  1352.119 ± 23.547  ops/s
>> Benchmark                          (isMontBench)   Mode  Cnt     Score    Error  Units
>> PolynomialP256Bench.benchMultiply          false  thrpt    3  1746.126 ± 10.970  ops/s
>>
>> Performance, no intrinsic:
>>
>> Benchmark                    (algorithm)      (dataSize)  (keyLength)  (provider)   Mode  Cnt     Score     Error  Units
>> SignatureBench.ECDSA.sign    SHA256withECDSA        1024          256              thrpt    3  6529.839 ±  42.420  ops/s
>> SignatureBench.ECDSA.sign    SHA256withECDSA       16384          256              thrpt    3  6199.747 ± 133.566  ops/s
>> SignatureBench.ECDSA.verify  SHA256withECDSA        1024          256              thrpt    3  1973.676 ±  54.071  ops/s
>> SignatureBench.ECDSA.verify  SHA256withECDSA       16384          256              thrpt    3  1932.127 ±  35.920  ops/s
>> Benchmark                                            (algorithm)  (keyLength)  (kpgAlgorithm)  (provider)   Mode  Cnt     Score    Error  Units
>> o.o.b.j.c.full.KeyAgreementBench.EC.generateSecret          ECDH          256              EC              thrpt    3  1355.788 ± 29.858  ops/s
>> o.o.b.j.c.small.KeyAgreementBench.EC.generateSecret         ECDH          256              EC              thrpt    3  1346.523 ± 28.722  ops/s
>> Benchmark                          (isMontBench)   Mode  Cnt     Score    Error  Units
>> PolynomialP256Bench.benchMultiply           true  thrpt    3  1919.57...
>
> Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision:
>
>   remove use of jdk.crypto.ec

Build changes are trivially fine.

-------------

Marked as reviewed by ihse (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/18583#pullrequestreview-1976208258

From ihse at openjdk.org Wed Apr 3 09:52:09 2024
From: ihse at openjdk.org (Magnus Ihse Bursie)
Date: Wed, 3 Apr 2024 09:52:09 GMT
Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3]
In-Reply-To:
References:
Message-ID:

On Tue, 2 Apr 2024 16:14:12 GMT, Joachim Kern wrote:

>> As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. Thus the old xlc toolchain was removed by [JDK-8327701](https://bugs.openjdk.org/browse/JDK-8327701).
>> Now we also switch the HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc, removing the last xlc rudiment.
>> This means merging the AIX specific content of utilities/globalDefinitions_xlc.hpp and utilities/compilerWarnings_xlc.hpp into the corresponding gcc files on the one side and removing the defined(TARGET_COMPILER_xlc) blocks in the code, because the defined(TARGET_COMPILER_gcc) blocks work out of the box for the new AIX compiler.
>> The rest of the changes are needed because, with utilities/compilerWarnings_gcc.hpp in use, the compiler nags much more about ill-formatted printf
>
> Joachim Kern has updated the pull request incrementally with one additional commit since the last revision:
>
>   version check not needed anymore

The build changes look trivially fine.

And allow me to show my appreciation for the hotspot code cleanup! (But note that this is not a review of that part).

-------------

PR Review: https://git.openjdk.org/jdk/pull/18536#pullrequestreview-1976222691

From mli at openjdk.org Wed Apr 3 10:30:29 2024
From: mli at openjdk.org (Hamlin Li)
Date: Wed, 3 Apr 2024 10:30:29 GMT
Subject: RFR: 8329083: RISC-V: Update profiles supported on riscv
Message-ID:

Hi,
Can you help to review this patch to update the vm flags related to riscv profiles?
Thanks

Currently there are vm options like -XX:+UseRVA20U64 and -XX:+UseRVA22U64 on riscv to indicate the supported riscv extensions via profiles. These profiles should be updated to reflect the full set of supported extensions, and a new profile such as UseRVA23U64 should be added too.
------------- Commit messages: - Initial commit Changes: https://git.openjdk.org/jdk/pull/18599/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18599&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8329083 Stats: 98 lines in 3 files changed: 63 ins; 30 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/18599.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18599/head:pull/18599 PR: https://git.openjdk.org/jdk/pull/18599 From coleenp at openjdk.org Wed Apr 3 12:25:15 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 3 Apr 2024 12:25:15 GMT Subject: RFR: 8236736: Change notproduct JVM flags to develop flags [v4] In-Reply-To: References: <25c1XDQrzxvG0AuxlRjQyznnTdLzD1-J4kebuXzj-Zc=.0f5d28e1-7672-40a9-97fe-04a77fda65d9@github.com> Message-ID: On Tue, 2 Apr 2024 19:47:23 GMT, Coleen Phillimore wrote: >> Remove the notproduct distinction for command line options, rather than trying to wrestle the macros to fix the bug that they've been treated as develop options for some time now. This simplifies the command line option macros. >> >> Tested with tier1-4, tier1 on Oracle platforms. Also built shenandoah. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Remove redundant test case. Thanks for the reviews, Ioi, Vladimir, Kim and Stefan. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18541#issuecomment-2034433313 From coleenp at openjdk.org Wed Apr 3 12:25:15 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 3 Apr 2024 12:25:15 GMT Subject: Integrated: 8236736: Change notproduct JVM flags to develop flags In-Reply-To: <25c1XDQrzxvG0AuxlRjQyznnTdLzD1-J4kebuXzj-Zc=.0f5d28e1-7672-40a9-97fe-04a77fda65d9@github.com> References: <25c1XDQrzxvG0AuxlRjQyznnTdLzD1-J4kebuXzj-Zc=.0f5d28e1-7672-40a9-97fe-04a77fda65d9@github.com> Message-ID: On Thu, 28 Mar 2024 22:53:22 GMT, Coleen Phillimore wrote: > Remove the notproduct distinction for command line options, rather than trying to wrestle the macros to fix the bug that they've been treated as develop options for some time now. This simplifies the command line option macros. > > Tested with tier1-4, tier1 on Oracle platforms. Also built shenandoah. This pull request has now been integrated. Changeset: bea493bc Author: Coleen Phillimore URL: https://git.openjdk.org/jdk/commit/bea493bcb86370dc3fb00d86c545f01fc614e000 Stats: 282 lines in 43 files changed: 1 ins; 102 del; 179 mod 8236736: Change notproduct JVM flags to develop flags Reviewed-by: iklam, kvn, kbarrett ------------- PR: https://git.openjdk.org/jdk/pull/18541 From coleenp at openjdk.org Wed Apr 3 12:45:09 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 3 Apr 2024 12:45:09 GMT Subject: RFR: 8313332: Simplify lazy jmethodID cache in InstanceKlass In-Reply-To: References: Message-ID: On Wed, 3 Apr 2024 02:41:06 GMT, Serguei Spitsyn wrote: >> This change simplifies the code that grows the jmethodID cache in InstanceKlass. Instead of lazily, when there's a rare request for a jmethodID for an obsolete method, the jmethodID cache is grown during the RedefineClasses safepoint. The InstanceKlass's jmethodID cache is lazily allocated when there's a jmethodID allocated, so not every InstanceKlass has a cache, but the growth now only happens in a safepoint. 
This code will become racy with the potential change for deallocating jmethodIDs. >> >> Tested with tier1-4, vmTestbase/nsk/jvmti java/lang/instrument tests (in case they're not in tier1-4). > > src/hotspot/share/oops/instanceKlass.cpp line 2335: > >> 2333: jmethodID new_id = Method::make_jmethod_id(class_loader_data(), method); >> 2334: Atomic::release_store(&jmeths[idnum+1], new_id); >> 2335: return new_id; > > Nit: It feels like the function `InstanceKlass::get_jmethod_id()` can be more simplified with a small restructuring: > > jmethodID update_jmethod_id(jmethodID* jmeths, Method* method, int idnum) { > // method_with_idnum > if (method->is_old() && !method->is_obsolete()) { > // The method passed in is old (but not obsolete), we need to use the current version > method = method_with_idnum((int)idnum); > assert(method != nullptr, "old and but not obsolete, so should exist"); > } > jmethodID new_id = Method::make_jmethod_id(class_loader_data(), method); > Atomic::release_store(&jmeths[idnum+1], new_id); > return new_id; > } > > jmethodID InstanceKlass::get_jmethod_id(const methodHandle& method_h) { > Method* method = method_h(); > int idnum = method_h->method_idnum(); > jmethodID* jmeths = methods_jmethod_ids_acquire(); > > <... big comment ...> > MutexLocker ml(JmethodIdCreation_lock, Mutex::_no_safepoint_check_flag); > if (jmeths == nullptr) { > jmeths = methods_jmethod_ids_acquire(); > > if (jmeths == nullptr) { // Still null? 
> size_t size = idnum_allocated_count(); > assert(size > (size_t)idnum, "should already have space"); > jmeths = NEW_C_HEAP_ARRAY(jmethodID, size+1, mtClass); > memset(jmeths, 0, (size+1)*sizeof(jmethodID)); > // cache size is stored in element[0], other elements offset by one > jmeths[0] = (jmethodID)size; > jmethodID new_id = update_jmethod_id(jmeths, method, idnum); > > // publish jmeths > release_set_methods_jmethod_ids(jmeths); > return new_id; > } > } > jmethodID id = Atomic::load_acquire(&jmeths[idnum+1]); > if (id == nullptr) { > id = jmeths[idnum+1]; > > if (id == nullptr) { // Still null? > jmethodID new_id = update_jmethod_id(jmeths, method, idnum); > return new_id; > } > } > return id; > } Yes this refactoring looks nice. Nice to have only one place that checks for !is_obsolete. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18549#discussion_r1549664452 From bulasevich at openjdk.org Wed Apr 3 13:11:29 2024 From: bulasevich at openjdk.org (Boris Ulasevich) Date: Wed, 3 Apr 2024 13:11:29 GMT Subject: RFR: 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments [v10] In-Reply-To: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> References: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> Message-ID: > These changes clean up the logic and the code of allocating codecache segments and add more testing of it, to open a door for further optimization of code cache segmentation. The goal was to keep the behavior as close to the existing behavior as possible, even if it's not quite logical. > > Also, these changes better account for alignment - PrintFlagsFinal shows the final aligned segment sizes, and the segments fill the ReservedCodeCacheSize without gaps caused by alignment. 
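To make the alignment point in the description above concrete, here is a minimal sketch of one way to split a reserved size into aligned segments with no gaps. It is not the code from this PR: `align_up_sz`, `Segments` and `split_cache` are hypothetical names, the alignment is assumed to be a power of two, and the reserved size is assumed to be aligned already. The first two segments are rounded up and the last one takes whatever remains, so the three sizes always add up to the reserved size.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not HotSpot's align_up): round size up to a
// power-of-two alignment.
static size_t align_up_sz(size_t size, size_t alignment) {
  return (size + alignment - 1) & ~(alignment - 1);
}

// Illustrative container for the three segment sizes.
struct Segments {
  size_t non_nmethod;
  size_t profiled;
  size_t non_profiled;
};

// Split an already aligned cache_size into three aligned segments.
// The last segment absorbs the rounding of the first two, so the sizes
// sum to cache_size exactly and no gap is left between segments.
static Segments split_cache(size_t cache_size, size_t non_nmethod,
                            size_t profiled, size_t alignment) {
  Segments s;
  s.non_nmethod = align_up_sz(non_nmethod, alignment);
  s.profiled = align_up_sz(profiled, alignment);
  assert(s.non_nmethod + s.profiled <= cache_size);
  // A difference of aligned values is itself aligned.
  s.non_profiled = cache_size - s.non_nmethod - s.profiled;
  return s;
}
```

Letting one segment absorb the remainder is only one possible policy; the PR itself aims to stay close to the existing behavior, which this sketch does not try to reproduce.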
Boris Ulasevich has updated the pull request incrementally with three additional commits since the last revision: - Update src/hotspot/share/memory/virtualspace.cpp Co-authored-by: Tobias Hartmann - Update src/hotspot/share/code/codeCache.cpp Co-authored-by: Tobias Hartmann - Update src/hotspot/share/code/codeCache.cpp style fix Co-authored-by: Tobias Hartmann ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17244/files - new: https://git.openjdk.org/jdk/pull/17244/files/b39dffdf..83c7aeea Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17244&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17244&range=08-09 Stats: 3 lines in 2 files changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/17244.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17244/head:pull/17244 PR: https://git.openjdk.org/jdk/pull/17244 From bulasevich at openjdk.org Wed Apr 3 13:11:29 2024 From: bulasevich at openjdk.org (Boris Ulasevich) Date: Wed, 3 Apr 2024 13:11:29 GMT Subject: RFR: 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments [v9] In-Reply-To: References: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> Message-ID: <6wujKVTAn60DEQSLbFxFR1zjz7CdVCfn2kpvUxGMw3A=.56ab674d-e981-4bbb-806e-905bce1dcb82@github.com> On Wed, 3 Apr 2024 05:06:16 GMT, Tobias Hartmann wrote: >> Boris Ulasevich has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains ten commits: >> >> - minor updates according to review comments >> - minor update: align_up for ReservedCodeCacheSize >> - one another cleanup round >> - minor update. 
removed helper function as it caused many comments in the review >> - set_size_of_unset_code_heap >> - minor update >> - apply suggestions >> - cleanup & test udpdate >> - 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments > > src/hotspot/share/code/codeCache.cpp line 183: > >> 181: if (size < required_size) { >> 182: log_debug(codecache)("Code heap (%s) size " SIZE_FORMAT " below required minimal size " SIZE_FORMAT, >> 183: codeheap, size, required_size); > > Should `size` and `required_size` be printed in `K`? Yes. Thanks! > src/hotspot/share/code/codeCache.cpp line 256: > >> 254: if (total != cache_size && !cache_size_set) { >> 255: log_info(codecache)("ReservedCodeCache size %lld changed to total segments size NonNMethod %lld NonProfiled %lld Profiled %lld = %lld", >> 256: (long long) cache_size, (long long) non_nmethod.size, (long long) non_profiled.size, > > Any reason you are using `%lld` here and below instead of `SIZE_FORMAT`? Right. SIZE_FORMAT is better. Thank you! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17244#discussion_r1549708063 PR Review Comment: https://git.openjdk.org/jdk/pull/17244#discussion_r1549705763 From bulasevich at openjdk.org Wed Apr 3 13:19:23 2024 From: bulasevich at openjdk.org (Boris Ulasevich) Date: Wed, 3 Apr 2024 13:19:23 GMT Subject: RFR: 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments [v11] In-Reply-To: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> References: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> Message-ID: <5doaM2HyM1bnpCiY9PIEU4idFoNz03GIRgVYMQ0M0JE=.c5455a15-a86a-4b43-b876-53175a6c4d85@github.com> > These changes clean up the logic and the code of allocating codecache segments and add more testing of it, to open a door for further optimization of code cache segmentation. 
The goal was to keep the behavior as close to the existing behavior as possible, even if it's not quite logical. > > Also, these changes better account for alignment - PrintFlagsFinal shows the final aligned segment sizes, and the segments fill the ReservedCodeCacheSize without gaps caused by alignment. Boris Ulasevich has updated the pull request incrementally with one additional commit since the last revision: log messages: use SIZE_FORMAT, printed in K ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17244/files - new: https://git.openjdk.org/jdk/pull/17244/files/83c7aeea..6549765f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17244&range=10 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17244&range=09-10 Stats: 8 lines in 1 file changed: 0 ins; 0 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/17244.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17244/head:pull/17244 PR: https://git.openjdk.org/jdk/pull/17244 From coleenp at openjdk.org Wed Apr 3 13:25:36 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 3 Apr 2024 13:25:36 GMT Subject: RFR: 8313332: Simplify lazy jmethodID cache in InstanceKlass [v2] In-Reply-To: References: Message-ID: > This change simplifies the code that grows the jmethodID cache in InstanceKlass. Instead of lazily, when there's a rare request for a jmethodID for an obsolete method, the jmethodID cache is grown during the RedefineClasses safepoint. The InstanceKlass's jmethodID cache is lazily allocated when there's a jmethodID allocated, so not every InstanceKlass has a cache, but the growth now only happens in a safepoint. This code will become racy with the potential change for deallocating jmethodIDs. > > Tested with tier1-4, vmTestbase/nsk/jvmti java/lang/instrument tests (in case they're not in tier1-4). Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Refactoring suggested by Serguei. 
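For readers following the jmethodID discussion, the lazy-creation pattern in the quoted refactoring can be sketched in portable C++ roughly as follows. This is a simplified illustration under stated assumptions, not the InstanceKlass code: `LazyIdCache` and `make_id` are made-up names, an "ID" is just a heap-allocated int, the slot array is allocated eagerly, and `std::atomic`/`std::mutex` stand in for HotSpot's `Atomic` and `MutexLocker`.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>

// Illustrative stand-in for a per-class ID cache; this is not InstanceKlass.
class LazyIdCache {
 public:
  explicit LazyIdCache(size_t capacity)
      : _capacity(capacity), _slots(new std::atomic<int*>[capacity]) {
    for (size_t i = 0; i < capacity; i++) {
      _slots[i].store(nullptr, std::memory_order_relaxed);
    }
  }

  ~LazyIdCache() {
    for (size_t i = 0; i < _capacity; i++) {
      delete _slots[i].load(std::memory_order_relaxed);
    }
  }

  // Returns the ID for idnum, creating and publishing it on first use.
  int* get_id(size_t idnum) {
    assert(idnum < _capacity);
    // Fast path: the acquire load pairs with the release store below.
    int* id = _slots[idnum].load(std::memory_order_acquire);
    if (id == nullptr) {
      std::lock_guard<std::mutex> guard(_creation_lock);
      // Re-check under the lock: another thread may have created the ID.
      id = _slots[idnum].load(std::memory_order_relaxed);
      if (id == nullptr) {
        id = make_id(idnum);
        _slots[idnum].store(id, std::memory_order_release);
      }
    }
    return id;
  }

 private:
  // Single creation point, in the spirit of having one update helper.
  static int* make_id(size_t idnum) { return new int(static_cast<int>(idnum)); }

  size_t _capacity;
  std::unique_ptr<std::atomic<int*>[]> _slots;
  std::mutex _creation_lock;
};
```

Repeated calls with the same index return the same pointer, and only the first call for an index takes the lock and allocates.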
------------- Changes: - all: https://git.openjdk.org/jdk/pull/18549/files - new: https://git.openjdk.org/jdk/pull/18549/files/ffffe38f..6576d14d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18549&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18549&range=00-01 Stats: 32 lines in 2 files changed: 12 ins; 16 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/18549.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18549/head:pull/18549 PR: https://git.openjdk.org/jdk/pull/18549 From coleenp at openjdk.org Wed Apr 3 13:25:36 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 3 Apr 2024 13:25:36 GMT Subject: RFR: 8313332: Simplify lazy jmethodID cache in InstanceKlass [v2] In-Reply-To: References: Message-ID: On Wed, 3 Apr 2024 12:42:30 GMT, Coleen Phillimore wrote: >> src/hotspot/share/oops/instanceKlass.cpp line 2335: >> >>> 2333: jmethodID new_id = Method::make_jmethod_id(class_loader_data(), method); >>> 2334: Atomic::release_store(&jmeths[idnum+1], new_id); >>> 2335: return new_id; >> >> Nit: It feels like the function `InstanceKlass::get_jmethod_id()` can be more simplified with a small restructuring: >> >> jmethodID update_jmethod_id(jmethodID* jmeths, Method* method, int idnum) { >> // method_with_idnum >> if (method->is_old() && !method->is_obsolete()) { >> // The method passed in is old (but not obsolete), we need to use the current version >> method = method_with_idnum((int)idnum); >> assert(method != nullptr, "old and but not obsolete, so should exist"); >> } >> jmethodID new_id = Method::make_jmethod_id(class_loader_data(), method); >> Atomic::release_store(&jmeths[idnum+1], new_id); >> return new_id; >> } >> >> jmethodID InstanceKlass::get_jmethod_id(const methodHandle& method_h) { >> Method* method = method_h(); >> int idnum = method_h->method_idnum(); >> jmethodID* jmeths = methods_jmethod_ids_acquire(); >> >> <... 
big comment ...> >> MutexLocker ml(JmethodIdCreation_lock, Mutex::_no_safepoint_check_flag); >> if (jmeths == nullptr) { >> jmeths = methods_jmethod_ids_acquire(); >> >> if (jmeths == nullptr) { // Still null? >> size_t size = idnum_allocated_count(); >> assert(size > (size_t)idnum, "should already have space"); >> jmeths = NEW_C_HEAP_ARRAY(jmethodID, size+1, mtClass); >> memset(jmeths, 0, (size+1)*sizeof(jmethodID)); >> // cache size is stored in element[0], other elements offset by one >> jmeths[0] = (jmethodID)size; >> jmethodID new_id = update_jmethod_id(jmeths, method, idnum); >> >> // publish jmeths >> release_set_methods_jmethod_ids(jmeths); >> return new_id; >> } >> } >> jmethodID id = Atomic::load_acquire(&jmeths[idnum+1]); >> if (id == nullptr) { >> id = jmeths[idnum+1]; >> >> if (id == nullptr) { // Still null? >> jmethodID new_id = update_jmethod_id(jmeths, method, idnum); >> return new_id; >> } >> } >> return id; >> } > > Yes this refactoring looks nice. Nice to have only one place that checks for !is_obsolete. Thank you for the suggestion, I reran the jvmti tests locally. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18549#discussion_r1549733870 From duke at openjdk.org Wed Apr 3 13:58:23 2024 From: duke at openjdk.org (Tomáš Zezula) Date: Wed, 3 Apr 2024 13:58:23 GMT Subject: RFR: JDK-8329564: [JVMCI] TranslatedException::debugPrintStackTrace does not work in the libjvmci compiler. [v2] In-Reply-To: References: Message-ID: > Problem: > The debugging stack traces in `jdk.internal.vm.TranslatedException` do not work in libjvmci because they are enabled via the `jdk.internal.vm.TranslatedException.debug` system property. However, HotSpot system properties are not accessible via `System.getProperty()` in libjvmci. > > Fix: > The value of `jdk.internal.vm.TranslatedException.debug` is passed from the VM via a boolean flag to `VMSupport::decodeAndThrowThrowable()`. Tomáš
Zezula has updated the pull request incrementally with one additional commit since the last revision: JDK-8329564: Fixed TestTranslatedException tests. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18591/files - new: https://git.openjdk.org/jdk/pull/18591/files/5e43f5f7..3a34ce27 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18591&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18591&range=00-01 Stats: 7 lines in 1 file changed: 0 ins; 0 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/18591.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18591/head:pull/18591 PR: https://git.openjdk.org/jdk/pull/18591 From mli at openjdk.org Wed Apr 3 14:48:24 2024 From: mli at openjdk.org (Hamlin Li) Date: Wed, 3 Apr 2024 14:48:24 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF Message-ID: <0cUurmXlMJ_B66Wy1umd2n4r9ve7_Q4WOU0ffMd8s5Y=.bbc93b65-382c-4139-aaec-cb835d94a06e@github.com> Hi, Can you help review the patch? This PR is based on previous work and discussion in [pr 16234](https://github.com/openjdk/jdk/pull/16234), [pr 18294](https://github.com/openjdk/jdk/pull/18294). Compared with the previous PRs, the major change in this PR is to integrate the SLEEF source (for the steps, please check `src/jdk.incubator.vector/linux/native/libvectormath/README`), rather than depending on external SLEEF headers or libraries at build or run time. In addition, the earlier changes are adjusted accordingly, e.g. some unnecessary files and changes are removed, especially in the make directory of the JDK. Besides the code changes, one important task is to handle the legal process. Thanks! ------------- Commit messages: - minor - add maintenance nodes - merge master - remove unnecessary changes - resolve build erorrs - add [generated] src from sleef - fix jni includes - rename - resolve magicus's comments - fix variable name in github workflow - ...
and 13 more: https://git.openjdk.org/jdk/compare/6ae1cf12...3ab4795d Changes: https://git.openjdk.org/jdk/pull/18605/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18605&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8312425 Stats: 14582 lines in 20 files changed: 14535 ins; 1 del; 46 mod Patch: https://git.openjdk.org/jdk/pull/18605.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18605/head:pull/18605 PR: https://git.openjdk.org/jdk/pull/18605 From mli at openjdk.org Wed Apr 3 14:52:20 2024 From: mli at openjdk.org (Hamlin Li) Date: Wed, 3 Apr 2024 14:52:20 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4] In-Reply-To: References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> Message-ID: On Thu, 28 Mar 2024 18:41:03 GMT, Paul Sandoz wrote: >> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: >> >> fix jni includes > > Hamlin, thank you for working on this. I think integrating a sub-set of SLEEF is valuable (not all of it makes sense e.g., DFT part). My recommendation would be to focus on a PR that integrates the required source, rather than taking steps towards that. > > AFAICT from browsing prior comments "integrate the source" appears to be the generally preferred solution, but there is some understandable hesitancy about legal aspects. IIUC from what you say this is a technically feasible and maintainable solution. As said here: > >> We (Oracle Java Platform Group) can handle the required "paperwork > https://github.com/openjdk/jdk/pull/16234#issuecomment-1823335443 > Thanks @PaulSandoz for your comment and suggestion. > > I will work on the solution which integrates the source of sleef into jdk. 
I've created a new pr at https://github.com/openjdk/jdk/pull/18605 which is to integrate the sleef source into jdk so to avoid depending on external sleef at build or run time. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-2034832611 From mli at openjdk.org Wed Apr 3 14:52:20 2024 From: mli at openjdk.org (Hamlin Li) Date: Wed, 3 Apr 2024 14:52:20 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4] In-Reply-To: <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> Message-ID: <9fK9M5yFWEfVRYjUr7S5Xu3cLu3yWiydif6TaUROZZg=.57dfbf78-50f6-4541-afa6-3b6b6f0286e7@github.com> On Fri, 15 Mar 2024 13:58:05 GMT, Hamlin Li wrote: >> Hi, >> Can you help to review this patch? >> Thanks >> >> This is a continuation of work based on [1] by @XiaohongGong, most work was done in that pr. 
In this new pr, just rebased the code in [1], then added some minor changes (renaming of calling method, add libsleef as extra lib in CI cross-build on aarch64 in github workflow); I also tested the combination of the following scenarios: >> * at build time >> * with/without sleef >> * with/without sve support >> * at runtime >> * with/without sleef >> * with/without sve support >> >> [1] https://github.com/openjdk/jdk/pull/16234 >> >> ## Regression Test >> * test/jdk/jdk/incubator/vector/ >> * test/hotspot/jtreg/compiler/vectorapi/ >> >> ## Performance Test >> Previously, @XiaohongGong has shared the data: https://github.com/openjdk/jdk/pull/16234#issuecomment-1767727028 > > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > fix jni includes Hey everyone, Please review the change at https://github.com/openjdk/jdk/pull/18605 when you're available. This pr will be closed. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-2034835163 From mli at openjdk.org Wed Apr 3 14:52:20 2024 From: mli at openjdk.org (Hamlin Li) Date: Wed, 3 Apr 2024 14:52:20 GMT Subject: Withdrawn: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF In-Reply-To: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> Message-ID: On Thu, 14 Mar 2024 08:48:11 GMT, Hamlin Li wrote: > Hi, > Can you help to review this patch? > Thanks > > This is a continuation of work based on [1] by @XiaohongGong, most work was done in that pr.
In this new pr, just rebased the code in [1], then added some minor changes (renaming of calling method, add libsleef as extra lib in CI cross-build on aarch64 in github workflow); I also tested the combination of the following scenarios: > * at build time > * with/without sleef > * with/without sve support > * at runtime > * with/without sleef > * with/without sve support > > [1] https://github.com/openjdk/jdk/pull/16234 > > ## Regression Test > * test/jdk/jdk/incubator/vector/ > * test/hotspot/jtreg/compiler/vectorapi/ > > ## Performance Test > Previously, @XiaohongGong has shared the data: https://github.com/openjdk/jdk/pull/16234#issuecomment-1767727028 This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/18294 From sgibbons at openjdk.org Wed Apr 3 15:15:24 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Wed, 3 Apr 2024 15:15:24 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v4] In-Reply-To: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: > This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. > > Making this an intrinsic improves the performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. > > Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`).
> > [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Fix Windows ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18555/files - new: https://git.openjdk.org/jdk/pull/18555/files/3aa60a48..8bed1561 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=02-03 Stats: 8 lines in 1 file changed: 6 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/18555.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18555/head:pull/18555 PR: https://git.openjdk.org/jdk/pull/18555 From dnsimon at openjdk.org Wed Apr 3 15:39:59 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Wed, 3 Apr 2024 15:39:59 GMT Subject: RFR: JDK-8329564: [JVMCI] TranslatedException::debugPrintStackTrace does not work in the libjvmci compiler. [v2] In-Reply-To: References: Message-ID: On Wed, 3 Apr 2024 13:58:23 GMT, Tomáš Zezula wrote: >> Problem: >> The debugging stack traces in `jdk.internal.vm.TranslatedException` do not work in libjvmci because they are enabled via the `jdk.internal.vm.TranslatedException.debug` system property. However, HotSpot system properties are not accessible via `System.getProperty()` in libjvmci. >> >> Fix: >> The value of `jdk.internal.vm.TranslatedException.debug` is passed from the VM via a boolean flag to `VMSupport::decodeAndThrowThrowable()`. > > Tomáš Zezula has updated the pull request incrementally with one additional commit since the last revision: > > JDK-8329564: Fixed TestTranslatedException tests. Marked as reviewed by dnsimon (Reviewer).
------------- PR Review: https://git.openjdk.org/jdk/pull/18591#pullrequestreview-1977216173 From stefank at openjdk.org Wed Apr 3 16:03:14 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 3 Apr 2024 16:03:14 GMT Subject: RFR: 8329332: Remove CompiledMethod and CodeBlobLayout classes [v2] In-Reply-To: References: Message-ID: <2MTKEcGChvwWqFFuPs-5TR8GJnLwrnrgGKqdIV4NX70=.d0a707d4-5dfc-467c-992d-d410f94a7dca@github.com> On Mon, 1 Apr 2024 21:07:31 GMT, Vladimir Kozlov wrote: >> Revert [JDK-8152664](https://bugs.openjdk.org/browse/JDK-8152664) RFE [changes](https://github.com/openjdk/jdk/commit/b853eb7f5ca24eeeda18acbb14287f706499c365) which was used for AOT [JEP 295](https://openjdk.org/jeps/295) implementation in JDK 9. The code was left in HotSpot assuming it will help in a future. But during work on Leyden we decided to not use it. In Leyden cached compiled code will be restored in CodeCache as normal nmethods: no need to change VM's runtime and GC code to process them. >> >> I may work on optimizing `CodeBlob` and `nmethod` fields layout to reduce header size in separate changes. In these changes I did simple fields reordering to keep small (1 byte) fields together. >> >> I do not see (and not expected) performance difference with these changes. >> >> Tested tier1-5, xcomp, stress. Running performance testing. >> >> I need help with testing on platforms which Oracle does not support. > > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Removed not_used state of nmethod Nice! We've wanted to clean up some interfaces between the CodeCache and the GC code by using nmethod closures instead of CodeBlob closures. This change (and the Sweeper removal) makes it possible to do those cleanups. I've made a superficial pass over the patch and left a few comments.
Most of those comments are things that would be nice to fix, but could also be left as follow-up RFEs (if they are deemed to be worthy ideas to pursue). src/hotspot/os/posix/signals_posix.cpp line 27: > 25: #include "precompiled.hpp" > 26: #include "code/codeCache.hpp" > 27: #include "code/nmethod.hpp" The include line needs to move down. src/hotspot/share/code/codeBlob.hpp line 77: > 75: // - data space > 76: > 77: enum CodeBlobKind : u1 { It will probably be safer to change this to an enum class, so that the compiler will help us if we mess up with the argument order when this is used in function calls. I see that this patch switches the parameter order of some functions, so I think it could be worth trying out. src/hotspot/share/code/codeBlob.hpp line 409: > 407: > 408: // GC/Verification support > 409: virtual void preserve_callee_argument_oops(frame fr, const RegisterMap *reg_map, OopClosure* f) override { /* nothing to do */ } In the GC code we usually have either virtual OR override, but not both. Could we skip `virtual` here? Or does the compiler code usually use both? src/hotspot/share/code/codeBlob.hpp line 429: > 427: SingletonBlob( > 428: const char* name, > 429: CodeBlobKind kind, There's an alignment issue after this change. src/hotspot/share/code/codeCache.cpp line 1009: > 1007: int CodeCache::nmethod_count() { > 1008: int count = 0; > 1009: for (GrowableArrayIterator heap = _nmethod_heaps->begin(); heap != _nmethod_heaps->end(); ++heap) { Is there a reason why FOR_ALL_NMETHOD_HEAPS wasn't good fit here? I'm wondering since the similar `CodeCache::blob_count()` still uses one of these macros. src/hotspot/share/code/nmethod.cpp line 812: > 810: // By calling this nmethod entry barrier, it plays along and acts > 811: // like any other nmethod found on the stack of a thread (fewer surprises). > 812: nmethod* nm = as_nmethod_or_null(); Calling as_nmethod_or_null() from within functions in the nmethod class is suspicious. 
Shouldn't all such usages be removed? (I'm fine with doing that as a separate change) src/hotspot/share/code/nmethod.cpp line 1009: > 1007: // Fill in default values for various flag fields > 1008: void nmethod::init_defaults() { > 1009: { // avoid uninitialized fields, even for short time periods Should these curly braces be removed? src/hotspot/share/code/nmethod.cpp line 2164: > 2162: DTRACE_METHOD_UNLOAD_PROBE(method()); > 2163: > 2164: // If a JVMTI agent has enabled the nmethod Unload event then I think the event is still called CompiledMethodUnload, so this line should probably be reverted. src/hotspot/share/code/nmethod.hpp line 50: > 48: class ScopeDesc; > 49: class CompiledIC; > 50: class MetadataClosure; Maybe merge (and sort) this together with the other forward declarations? src/hotspot/share/code/nmethod.hpp line 905: > 903: > 904: // printing support > 905: void print() const override; Here and a few other places you only use override and not also virtual. This is inconsistent with other functions in this class. (FWIW, I prefer this style with only the override qualifier). src/hotspot/share/code/nmethod.inline.hpp line 60: > 58: // (b) it is a deopt PC > 59: > 60: inline address nmethod::get_deopt_original_pc(const frame* fr) { While reading this PR I wonder if this really belongs in the `nmethod` class or if it would make more sense to have it as a member function in the `frame` class. It is a static function, which uses `fr` sort-of like a `this` pointer. Maybe something to consider for a separate RFE. src/hotspot/share/code/relocInfo.hpp line 39: > 37: class nmethod; > 38: class CodeBlob; > 39: class nmethod; You already have a class nmethod forward declaration above. src/hotspot/share/compiler/compileBroker.cpp line 1379: > 1377: if (osr_bci == InvocationEntryBci) { > 1378: // standard compilation > 1379: nmethod* method_code = method->code(); Isn't the `method_code->is_nmethod()` redundant now? 
src/hotspot/share/compiler/compileBroker.cpp line 1484: > 1482: // We accept a higher level osr method > 1483: if (osr_bci == InvocationEntryBci) { > 1484: nmethod* code = method->code(); Cast below is redundant. src/hotspot/share/gc/g1/g1HeapRegion.cpp line 339: > 337: > 338: void do_code_blob(CodeBlob* cb) { > 339: nmethod* nm = (cb == nullptr) ? nullptr : cb->as_nmethod_or_null(); After this change I'd like to see if we can change this and similar GC interfaces to work directly against `nmethod` instead of `CodeBlob`. src/hotspot/share/gc/shared/parallelCleaning.cpp line 54: > 52: void CodeCacheUnloadingTask::claim_nmethods(nmethod** claimed_nmethods, int *num_claimed_nmethods) { > 53: nmethod* first; > 54: NMethodIterator last(NMethodIterator::all_blobs); FWIW, `all_blobs` is a slightly confusing name when nmethods are a subset of all "code blobs". We might want to consider renaming this to `NMethodIterator::all` (in a separate RFE). src/hotspot/share/gc/x/xUnload.cpp line 78: > 76: class XIsUnloadingBehaviour : public IsUnloadingBehaviour { > 77: public: > 78: virtual bool has_dead_oop(nmethod* method) const { This now takes an `nmethod` argument, but still calls as_nmethod(). I think that should be removed from this, and all similar functions here in the GC code. If you want, I can do that as a follow-up RFE. src/hotspot/share/jvmci/jvmciRuntime.cpp line 271: > 269: > 270: nm = CodeCache::find_nmethod(pc); > 271: assert(nm != nullptr, "this is not a compiled method"); Unrelated to this patch, but might be worth mentioning because I think it would be good to think about this in a follow-up RFE. `CodeCache::find_blob` returns null if it can't find a matching blob, but `CodeCache::find_nmethod` asserts that it did find one. The latter makes the assert redundant, but it also raises the question of whether `find_blob` and `find_nmethod` really should be different in this regard. Should we have `find_blob_or_null` and `find_nmethod_or_null`? Alt.
`find_blob_not_null` and `find_nmethod_not_null`. src/hotspot/share/prims/whitebox.cpp line 772: > 770: if (_make_not_entrant) { > 771: nmethod* nm = CodeCache::find_nmethod(f->pc()); > 772: assert(nm != nullptr, "sanity check"); This assert is now redundant. src/hotspot/share/prims/whitebox.cpp line 1100: > 1098: // Check code again because compilation may be finished before Compile_lock is acquired. > 1099: if (bci == InvocationEntryBci) { > 1100: nmethod* code = mh->code(); `as_nmethod_or_null()` below should be redundant. src/hotspot/share/runtime/continuationEntry.hpp line 35: > 33: #include CPU_HEADER(continuationEntry) > 34: > 35: class nmethod; Maybe keep the forward declarations sorted? src/hotspot/share/runtime/continuationEntry.hpp line 59: > 57: public: > 58: static int _return_pc_offset; // friend gen_continuation_enter > 59: static void set_enter_code(nmethod* cm, int interpreted_entry_offset); cm => nm? src/hotspot/share/runtime/frame.cpp line 208: > 206: address frame::raw_pc() const { > 207: if (is_deoptimized_frame()) { > 208: nmethod* nm = cb()->as_nmethod_or_null(); Preexisting: It's weird that this code is using the `_or_null()` version when the code below does not null check the returned value. src/hotspot/share/runtime/vframe.cpp line 75: > 73: if (cb != nullptr) { > 74: if (cb->is_nmethod()) { > 75: nmethod* nm = cb->as_nmethod();; There are two `;`s on this line.
------------- PR Review: https://git.openjdk.org/jdk/pull/18554#pullrequestreview-1977027234 PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1549873079 PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1549892107 PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1549895707 PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1549917406 PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1549925499 PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1549949611 PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1549954407 PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1549955841 PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1549958927 PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1549968764 PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1549974539 PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1549975722 PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1549977276 PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1549978007 PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1549979750 PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1549983251 PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1549985971 PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1549992086 PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1549999765 PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1550002167 PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1550005072 PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1550006055 PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1550013866 PR Review Comment: 
https://git.openjdk.org/jdk/pull/18554#discussion_r1550025081 From kvn at openjdk.org Wed Apr 3 16:18:12 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 3 Apr 2024 16:18:12 GMT Subject: RFR: 8329332: Remove CompiledMethod and CodeBlobLayout classes [v2] In-Reply-To: <86hdErNCggxb7O-j9AYmcR9IV7M15p1Hnrowo4nDk_U=.1b2b07cd-209b-4181-bc97-58d1a8fac674@github.com> References: <86hdErNCggxb7O-j9AYmcR9IV7M15p1Hnrowo4nDk_U=.1b2b07cd-209b-4181-bc97-58d1a8fac674@github.com> Message-ID: <8Eb7EtxdXdDbbgICKduteW4cHuINHpbQevdPJ7XBvYY=.594146f6-87f3-41ed-86a7-3e6541d67286@github.com> On Wed, 3 Apr 2024 02:55:52 GMT, Vladimir Ivanov wrote: > What about `CompiledMethod_lock`? There's no `CompiledMethod` anymore, but the lock name still refers to it. It was different changes [JDK-8226705](https://bugs.openjdk.org/browse/JDK-8226705). Renaming it will complicate these changes more than I wanted. I can do it in separate RFE. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18554#issuecomment-2035031413 From sspitsyn at openjdk.org Wed Apr 3 16:27:12 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 3 Apr 2024 16:27:12 GMT Subject: RFR: 8313332: Simplify lazy jmethodID cache in InstanceKlass [v2] In-Reply-To: References: Message-ID: On Wed, 3 Apr 2024 13:25:36 GMT, Coleen Phillimore wrote: >> This change simplifies the code that grows the jmethodID cache in InstanceKlass. Instead of lazily, when there's a rare request for a jmethodID for an obsolete method, the jmethodID cache is grown during the RedefineClasses safepoint. The InstanceKlass's jmethodID cache is lazily allocated when there's a jmethodID allocated, so not every InstanceKlass has a cache, but the growth now only happens in a safepoint. This code will become racy with the potential change for deallocating jmethodIDs. >> >> Tested with tier1-4, vmTestbase/nsk/jvmti java/lang/instrument tests (in case they're not in tier1-4). 
> > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Refactoring suggested by Serguei. Looks good to me. Nice simplification. ------------- Marked as reviewed by sspitsyn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18549#pullrequestreview-1977358735 From kvn at openjdk.org Wed Apr 3 16:32:12 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 3 Apr 2024 16:32:12 GMT Subject: RFR: 8329332: Remove CompiledMethod and CodeBlobLayout classes [v2] In-Reply-To: <2MTKEcGChvwWqFFuPs-5TR8GJnLwrnrgGKqdIV4NX70=.d0a707d4-5dfc-467c-992d-d410f94a7dca@github.com> References: <2MTKEcGChvwWqFFuPs-5TR8GJnLwrnrgGKqdIV4NX70=.d0a707d4-5dfc-467c-992d-d410f94a7dca@github.com> Message-ID: On Wed, 3 Apr 2024 14:44:03 GMT, Stefan Karlsson wrote: >> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Removed not_used state of nmethod > > src/hotspot/share/code/codeBlob.hpp line 409: > >> 407: >> 408: // GC/Verification support >> 409: virtual void preserve_callee_argument_oops(frame fr, const RegisterMap *reg_map, OopClosure* f) override { /* nothing to do */ } > > In the GC code we usually have either virtual OR override, but not both. Could we skip `virtual` here? Or does the compiler code usually use both? No special rules here. I simply want to see all `virtual` methods explicitly and `override` is required by C++. I would like to keep it this way in these changes. I am investigating possibility to convert all these virtual methods to normal one to remove virtual table and virtual pointer (8 bytes) from CodeBlob class. 
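As an aside on the "virtual pointer (8 bytes)" remark above, here is a minimal sketch (not taken from the patch) of why a class with any virtual function typically carries one extra pointer per object. `PlainBlob` and `VirtualBlob` are hypothetical stand-ins, not HotSpot types, and the exact sizes depend on the ABI; only the "larger than" relation is checked.

```cpp
#include <cstddef>

// Hypothetical stand-ins for illustration only; not the HotSpot CodeBlob.
struct PlainBlob {
  int _size;
  int size() const { return _size; }          // non-virtual: no vtable needed
};

struct VirtualBlob {
  int _size;
  virtual ~VirtualBlob() = default;
  virtual int size() const { return _size; }  // virtual: object stores a vtable pointer
};

// On mainstream ABIs the polymorphic type is bigger by at least one pointer
// (plus alignment padding). The precise difference is an ABI detail.
constexpr std::size_t extra_bytes = sizeof(VirtualBlob) - sizeof(PlainBlob);
static_assert(sizeof(VirtualBlob) > sizeof(PlainBlob),
              "a polymorphic type is expected to carry a vtable pointer");
```

On a typical 64-bit build `extra_bytes` comes out larger than 8 because of padding after the `int`, which is one reason field ordering matters as much as removing the pointer itself.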
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1550071713 From duke at openjdk.org Wed Apr 3 16:37:17 2024 From: duke at openjdk.org (Volodymyr Paprotski) Date: Wed, 3 Apr 2024 16:37:17 GMT Subject: RFR: 8320794: Emulate rest of vblendvp[sd] on ECore [v4] In-Reply-To: <8ajDeYtrlyZUXnTl29xwLr1rwGIYzjj5wThm9yjrBVY=.c75c1992-c836-4969-aea4-e3cbf428dfad@github.com> References: <8ajDeYtrlyZUXnTl29xwLr1rwGIYzjj5wThm9yjrBVY=.c75c1992-c836-4969-aea4-e3cbf428dfad@github.com> Message-ID: > Replace vpblendvp[sd] with macro assembler call and test in: > - `C2_MacroAssembler::vector_cast_float_to_int_special_cases_avx` (insufficient registers for 1 of 2 blends) > - `C2_MacroAssembler::vector_cast_double_to_int_special_cases_avx` > - `C2_MacroAssembler::vector_count_leading_zeros_int_avx` > > Functional testing with existing and new tests: > `make test TEST="test/hotspot/jtreg/compiler/vectorapi/reshape test/hotspot/jtreg/compiler/vectorization/runner/BasicIntOpTest.java"` > > Benchmarking with existing and new tests: > > make test TEST="micro:org.openjdk.bench.jdk.incubator.vector.VectorFPtoIntCastOperations.microFloat256ToInteger256" > make test TEST="micro:org.openjdk.bench.jdk.incubator.vector.VectorFPtoIntCastOperations.microDouble256ToInteger256" > make test TEST="micro:org.openjdk.bench.vm.compiler.VectorBitCount.WithSuperword.intLeadingZeroCount" > > > Performance before:
>
> Benchmark (SIZE) Mode Cnt Score Error Units
> VectorFPtoIntCastOperations.microDouble256ToInteger256 512 thrpt 5 17271.078 ± 184.140 ops/ms
> VectorFPtoIntCastOperations.microDouble256ToInteger256 1024 thrpt 5 9310.507 ± 88.136 ops/ms
> VectorFPtoIntCastOperations.microFloat256ToInteger256 512 thrpt 5 11137.594 ± 19.009 ops/ms
> VectorFPtoIntCastOperations.microFloat256ToInteger256 1024 thrpt 5 5425.001 ± 3.136 ops/ms
> VectorBitCount.WithSuperword.intLeadingZeroCount 1024 0 thrpt 4 0.994 ± 0.002 ops/us
>
>
> Performance after:
>
> Benchmark (SIZE) Mode Cnt Score Error Units
> VectorFPtoIntCastOperations.microDouble256ToInteger256 512 thrpt 5 19222.048 ± 87.622 ops/ms
> VectorFPtoIntCastOperations.microDouble256ToInteger256 1024 thrpt 5 9233.245 ± 123.493 ops/ms
> VectorFPtoIntCastOperations.microFloat256ToInteger256 512 thrpt 5 11672.806 ± 10.854 ops/ms
> VectorFPtoIntCastOperations.microFloat256ToInteger256 1024 thrpt 5 6009.735 ± 12.173 ops/ms
> VectorBitCount.WithSuperword.intLeadingZeroCount 1024 0 thrpt 4 1.039 ± 0.004 ops/us

Volodymyr Paprotski has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: - Merge remote-tracking branch 'jdk/master' into vp-blend - remove trailing whitespace - Allow scratch to overlap with src1|src2 - Fix double pasted test - Fix whitespace - vpblend emulation continued ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18310/files - new: https://git.openjdk.org/jdk/pull/18310/files/1705a6aa..0d3feee9 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18310&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18310&range=02-03 Stats: 368215 lines in 3112 files changed: 22945 ins; 19136 del; 326134 mod Patch: https://git.openjdk.org/jdk/pull/18310.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18310/head:pull/18310 PR: https://git.openjdk.org/jdk/pull/18310 From kvn at openjdk.org Wed Apr 3 16:40:59 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 3 Apr 2024 16:40:59 GMT Subject: RFR: 8329332: Remove CompiledMethod and CodeBlobLayout classes [v2] In-Reply-To: <2MTKEcGChvwWqFFuPs-5TR8GJnLwrnrgGKqdIV4NX70=.d0a707d4-5dfc-467c-992d-d410f94a7dca@github.com> References: <2MTKEcGChvwWqFFuPs-5TR8GJnLwrnrgGKqdIV4NX70=.d0a707d4-5dfc-467c-992d-d410f94a7dca@github.com> Message-ID: On Wed, 3 Apr 2024
15:01:22 GMT, Stefan Karlsson wrote: >> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Removed not_used state of nmethod > > src/hotspot/share/code/codeCache.cpp line 1009: > >> 1007: int CodeCache::nmethod_count() { >> 1008: int count = 0; >> 1009: for (GrowableArrayIterator heap = _nmethod_heaps->begin(); heap != _nmethod_heaps->end(); ++heap) { > > Is there a reason why FOR_ALL_NMETHOD_HEAPS wasn't good fit here? I'm wondering since the similar `CodeCache::blob_count()` still uses one of these macros. No, `CodeCache::blob_count()` uses different macro `FOR_ALL_HEAPS(heap)` because it looks for all code blobs, not only nmethods. `CodeCache::nmethod_count()` is the only place where `FOR_ALL_NMETHOD_HEAPS ` was used. So I decided to remove the macro. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1550087255 From bulasevich at openjdk.org Wed Apr 3 16:47:32 2024 From: bulasevich at openjdk.org (Boris Ulasevich) Date: Wed, 3 Apr 2024 16:47:32 GMT Subject: RFR: 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments [v12] In-Reply-To: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> References: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> Message-ID: > These changes clean up the logic and the code of allocating codecache segments and add more testing of it, to open a door for further optimization of code cache segmentation. The goal was to keep the behavior as close to the existing behavior as possible, even if it's not quite logical. > > Also, these changes better account for alignment - PrintFlagsFinal shows the final aligned segment sizes, and the segments fill the ReservedCodeCacheSize without gaps caused by alignment. Boris Ulasevich has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains one commit: 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments ------------- Changes: https://git.openjdk.org/jdk/pull/17244/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17244&range=11 Stats: 325 lines in 5 files changed: 175 ins; 99 del; 51 mod Patch: https://git.openjdk.org/jdk/pull/17244.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17244/head:pull/17244 PR: https://git.openjdk.org/jdk/pull/17244 From kvn at openjdk.org Wed Apr 3 17:01:10 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 3 Apr 2024 17:01:10 GMT Subject: RFR: 8329332: Remove CompiledMethod and CodeBlobLayout classes [v2] In-Reply-To: <2MTKEcGChvwWqFFuPs-5TR8GJnLwrnrgGKqdIV4NX70=.d0a707d4-5dfc-467c-992d-d410f94a7dca@github.com> References: <2MTKEcGChvwWqFFuPs-5TR8GJnLwrnrgGKqdIV4NX70=.d0a707d4-5dfc-467c-992d-d410f94a7dca@github.com> Message-ID: On Wed, 3 Apr 2024 15:12:31 GMT, Stefan Karlsson wrote: >> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Removed not_used state of nmethod > > src/hotspot/share/code/nmethod.cpp line 812: > >> 810: // By calling this nmethod entry barrier, it plays along and acts >> 811: // like any other nmethod found on the stack of a thread (fewer surprises). >> 812: nmethod* nm = as_nmethod_or_null(); > > Calling as_nmethod_or_null() from within functions in the nmethod class is suspicious. Shouldn't all such usages be removed? (I'm fine with doing that as a separate change) Good catch! The code was moved from CompiledMethod where it made sense but now it is not needed. Here the change I will make: // like any other nmethod found on the stack of a thread (fewer surprises). 
- nmethod* nm = as_nmethod_or_null(); - if (nm != nullptr && bs_nm->is_armed(nm)) { + nmethod* nm = this; + if (bs_nm->is_armed(nm)) { bool alive = bs_nm->nmethod_entry_barrier(nm); ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1550118967 From kvn at openjdk.org Wed Apr 3 17:38:01 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 3 Apr 2024 17:38:01 GMT Subject: RFR: 8329332: Remove CompiledMethod and CodeBlobLayout classes [v2] In-Reply-To: <2MTKEcGChvwWqFFuPs-5TR8GJnLwrnrgGKqdIV4NX70=.d0a707d4-5dfc-467c-992d-d410f94a7dca@github.com> References: <2MTKEcGChvwWqFFuPs-5TR8GJnLwrnrgGKqdIV4NX70=.d0a707d4-5dfc-467c-992d-d410f94a7dca@github.com> Message-ID: On Wed, 3 Apr 2024 15:30:00 GMT, Stefan Karlsson wrote: >> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Removed not_used state of nmethod > > src/hotspot/share/compiler/compileBroker.cpp line 1379: > >> 1377: if (osr_bci == InvocationEntryBci) { >> 1378: // standard compilation >> 1379: nmethod* method_code = method->code(); > > Isn't the `method_code->is_nmethod()` redundant now? Another good catch! It leads me to chase all redundant `is_nmethod()` and `as_nmethod_*()` calls.
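To illustrate the kind of redundancy being chased here, a minimal sketch with simplified, hypothetical classes (these are not the real HotSpot `CodeBlob`/`nmethod` definitions, and `is_in_use()` is only a placeholder query). Once an accessor hands back `nmethod*` rather than a base-class pointer, the `is_nmethod()` test and the `as_nmethod_*()` narrowing no longer add information; only the null check remains.

```cpp
// Simplified, hypothetical model for illustration; not the HotSpot classes.
class nmethod;

class CodeBlob {
 public:
  virtual ~CodeBlob() = default;
  virtual bool is_nmethod() const { return false; }
  nmethod* as_nmethod_or_null();
};

class nmethod : public CodeBlob {
 public:
  bool is_nmethod() const override { return true; }
  bool is_in_use() const { return true; }  // placeholder state query
};

inline nmethod* CodeBlob::as_nmethod_or_null() {
  return is_nmethod() ? static_cast<nmethod*>(this) : nullptr;
}

// Before: the caller only has a base-class pointer and must narrow it.
inline bool usable_before(CodeBlob* code) {
  return code != nullptr && code->is_nmethod() &&
         code->as_nmethod_or_null()->is_in_use();
}

// After: the static type is already nmethod*, so the type test is redundant.
inline bool usable_after(nmethod* code) {
  return code != nullptr && code->is_in_use();
}
```

Both helpers agree for every `nmethod`; the second simply stops re-asking a question the type system has already answered.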
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1550187397 From stefank at openjdk.org Wed Apr 3 17:53:01 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 3 Apr 2024 17:53:01 GMT Subject: RFR: 8329332: Remove CompiledMethod and CodeBlobLayout classes [v2] In-Reply-To: References: <2MTKEcGChvwWqFFuPs-5TR8GJnLwrnrgGKqdIV4NX70=.d0a707d4-5dfc-467c-992d-d410f94a7dca@github.com> Message-ID: <5V3m6SFRs8kib1w_Oyyrvr-Yu904MuuPQglr-uei_So=.60542853-0800-4113-b4f0-57d328c5d866@github.com> On Wed, 3 Apr 2024 16:29:03 GMT, Vladimir Kozlov wrote: >> src/hotspot/share/code/codeBlob.hpp line 409: >> >>> 407: >>> 408: // GC/Verification support >>> 409: virtual void preserve_callee_argument_oops(frame fr, const RegisterMap *reg_map, OopClosure* f) override { /* nothing to do */ } >> >> In the GC code we usually have either virtual OR override, but not both. Could we skip `virtual` here? Or does the compiler code usually use both? > > No special rules here. I simply want to see all `virtual` methods explicitly and `override` is required by C++. > I would like to keep it this way in these changes. I am investigating possibility to convert all these virtual methods to normal one to remove virtual table and virtual pointer (8 bytes) from CodeBlob class. `override` is not required by C++. You do however mark all virtual methods with `override` if any of the functions are marked with `override`. I think it would be good to have a HotSpot code style discussion about this (but not in this RFE). 
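For readers following the `virtual`/`override` exchange, a small sketch of the two spellings under discussion. The types are hypothetical and unrelated to HotSpot. A function that overrides a virtual base function is itself virtual whether or not the keyword is repeated, so `override` alone and `virtual ... override` behave the same; the difference is purely stylistic.

```cpp
#include <string>

// Hypothetical types for illustration only.
struct Blob {
  virtual ~Blob() = default;
  virtual std::string name() const { return "Blob"; }
  virtual std::string kind() const { return "blob"; }
};

struct Method : Blob {
  // Style A: both keywords, as kept in the patch under review.
  virtual std::string name() const override { return "Method"; }
  // Style B: `override` only; the function is still virtual.
  std::string kind() const override { return "method"; }
};

// Dispatches dynamically through the base reference in both cases.
inline std::string describe(const Blob& b) {
  return b.name() + "/" + b.kind();
}
```

Calling `describe` on a `Method` through a `Blob&` reaches both overrides, which is why the choice between the styles can be deferred to a style discussion.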
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1550206804 From stefank at openjdk.org Wed Apr 3 17:59:09 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 3 Apr 2024 17:59:09 GMT Subject: RFR: 8329332: Remove CompiledMethod and CodeBlobLayout classes [v2] In-Reply-To: References: <2MTKEcGChvwWqFFuPs-5TR8GJnLwrnrgGKqdIV4NX70=.d0a707d4-5dfc-467c-992d-d410f94a7dca@github.com> Message-ID: On Wed, 3 Apr 2024 16:38:13 GMT, Vladimir Kozlov wrote: >> src/hotspot/share/code/codeCache.cpp line 1009: >> >>> 1007: int CodeCache::nmethod_count() { >>> 1008: int count = 0; >>> 1009: for (GrowableArrayIterator heap = _nmethod_heaps->begin(); heap != _nmethod_heaps->end(); ++heap) { >> >> Is there a reason why FOR_ALL_NMETHOD_HEAPS wasn't good fit here? I'm wondering since the similar `CodeCache::blob_count()` still uses one of these macros. > > No, `CodeCache::blob_count()` uses different macro `FOR_ALL_HEAPS(heap)` because it looks for all code blobs, not only nmethods. > > `CodeCache::nmethod_count()` is the only place where `FOR_ALL_NMETHOD_HEAPS ` was used. So I decided to remove the macro. I didn't say that blob_count used `FOR_ALL_NMETHODS_HEAP`. I wrote "one of these macros". I still think this adds an inconsistency to the code that I don't think is beneficial. With that said, can't this be written as: for (CodeHeap* heap : *_nmethod_heaps) { Maybe yet another opportunity for cleanups. >> src/hotspot/share/code/nmethod.cpp line 812: >> >>> 810: // By calling this nmethod entry barrier, it plays along and acts >>> 811: // like any other nmethod found on the stack of a thread (fewer surprises). >>> 812: nmethod* nm = as_nmethod_or_null(); >> >> Calling as_nmethod_or_null() from within functions in the nmethod class is suspicious. Shouldn't all such usages be removed? (I'm fine with doing that as a separate change) > > Good catch! The code was moved from CompiledMethod where it made sense but now it is not needed. 
Here is the change I will make: > > // like any other nmethod found on the stack of a thread (fewer surprises). > - nmethod* nm = as_nmethod_or_null(); > - if (nm != nullptr && bs_nm->is_armed(nm)) { > + nmethod* nm = this; > + if (bs_nm->is_armed(nm)) { > bool alive = bs_nm->nmethod_entry_barrier(nm); Sounds good. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1550216628 PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1550217712 From kvn at openjdk.org Wed Apr 3 18:20:01 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 3 Apr 2024 18:20:01 GMT Subject: RFR: 8329332: Remove CompiledMethod and CodeBlobLayout classes [v2] In-Reply-To: <2MTKEcGChvwWqFFuPs-5TR8GJnLwrnrgGKqdIV4NX70=.d0a707d4-5dfc-467c-992d-d410f94a7dca@github.com> References: <2MTKEcGChvwWqFFuPs-5TR8GJnLwrnrgGKqdIV4NX70=.d0a707d4-5dfc-467c-992d-d410f94a7dca@github.com> Message-ID: On Wed, 3 Apr 2024 15:49:00 GMT, Stefan Karlsson wrote: >> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Removed not_used state of nmethod > > src/hotspot/share/runtime/frame.cpp line 208: > >> 206: address frame::raw_pc() const { >> 207: if (is_deoptimized_frame()) { >> 208: nmethod* nm = cb()->as_nmethod_or_null(); > > Preexisting: It's weird that this code is using the `_or_null()` version when the code below does not null check the returned value. Before [JDK-6921352](https://bugs.openjdk.org/browse/JDK-6921352) it was: return ((nmethod*) cb())->deopt_handler_begin() - pc_return_offset; I will add an assert with a check for null. We definitely expect only an nmethod here.
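A rough sketch of the pattern described above: keep the `_or_null()` conversion but state the expectation with an assert before dereferencing. Everything here is a hypothetical simplification. The classes, the `deopt_handler_offset()` accessor and the helper name are invented for illustration, and the standard `assert` macro stands in for HotSpot's own two-argument `assert`.

```cpp
#include <cassert>

// Hypothetical, simplified model; not the HotSpot classes.
class nmethod;

class CodeBlob {
 public:
  virtual ~CodeBlob() = default;
  virtual bool is_nmethod() const { return false; }
  nmethod* as_nmethod_or_null();
};

class nmethod : public CodeBlob {
 public:
  explicit nmethod(int deopt_offset) : _deopt_offset(deopt_offset) {}
  bool is_nmethod() const override { return true; }
  int deopt_handler_offset() const { return _deopt_offset; }  // invented accessor
 private:
  int _deopt_offset;
};

inline nmethod* CodeBlob::as_nmethod_or_null() {
  return is_nmethod() ? static_cast<nmethod*>(this) : nullptr;
}

// The conversion may return null in general, so the caller documents (and, in
// debug builds, checks) that only an nmethod is expected on this path.
inline int deopt_offset_of(CodeBlob* cb) {
  nmethod* nm = cb->as_nmethod_or_null();
  assert(nm != nullptr && "only nmethod is expected here");
  return nm->deopt_handler_offset();
}
```

The assert makes the unchecked dereference an explicit, documented assumption rather than an accident of the conversion that happened to be used.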
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1550243895 From amenkov at openjdk.org Wed Apr 3 18:31:09 2024 From: amenkov at openjdk.org (Alex Menkov) Date: Wed, 3 Apr 2024 18:31:09 GMT Subject: RFR: 8313332: Simplify lazy jmethodID cache in InstanceKlass [v2] In-Reply-To: References: Message-ID: On Wed, 3 Apr 2024 13:25:36 GMT, Coleen Phillimore wrote: >> This change simplifies the code that grows the jmethodID cache in InstanceKlass. Instead of lazily, when there's a rare request for a jmethodID for an obsolete method, the jmethodID cache is grown during the RedefineClasses safepoint. The InstanceKlass's jmethodID cache is lazily allocated when there's a jmethodID allocated, so not every InstanceKlass has a cache, but the growth now only happens in a safepoint. This code will become racy with the potential change for deallocating jmethodIDs. >> >> Tested with tier1-4, vmTestbase/nsk/jvmti java/lang/instrument tests (in case they're not in tier1-4). > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Refactoring suggested by Serguei. Still looks good ------------- Marked as reviewed by amenkov (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18549#pullrequestreview-1977650540 From kvn at openjdk.org Wed Apr 3 18:50:10 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 3 Apr 2024 18:50:10 GMT Subject: RFR: 8329332: Remove CompiledMethod and CodeBlobLayout classes [v2] In-Reply-To: <2MTKEcGChvwWqFFuPs-5TR8GJnLwrnrgGKqdIV4NX70=.d0a707d4-5dfc-467c-992d-d410f94a7dca@github.com> References: <2MTKEcGChvwWqFFuPs-5TR8GJnLwrnrgGKqdIV4NX70=.d0a707d4-5dfc-467c-992d-d410f94a7dca@github.com> Message-ID: On Wed, 3 Apr 2024 15:35:49 GMT, Stefan Karlsson wrote: >> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Removed not_used state of nmethod > > src/hotspot/share/gc/x/xUnload.cpp line 78: > >> 76: class XIsUnloadingBehaviour : public IsUnloadingBehaviour { >> 77: public: >> 78: virtual bool has_dead_oop(nmethod* method) const { > > This now takes an `nmethod` argument, but still calls as_nmethod(). I think that should be removed from this, and all similar functions here in the GC code. If you want, I can do that as a follow-up RFE. I decided to fix it in these changes. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1550285238 From kvn at openjdk.org Wed Apr 3 18:55:00 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 3 Apr 2024 18:55:00 GMT Subject: RFR: 8329332: Remove CompiledMethod and CodeBlobLayout classes [v2] In-Reply-To: <2MTKEcGChvwWqFFuPs-5TR8GJnLwrnrgGKqdIV4NX70=.d0a707d4-5dfc-467c-992d-d410f94a7dca@github.com> References: <2MTKEcGChvwWqFFuPs-5TR8GJnLwrnrgGKqdIV4NX70=.d0a707d4-5dfc-467c-992d-d410f94a7dca@github.com> Message-ID: On Wed, 3 Apr 2024 16:00:01 GMT, Stefan Karlsson wrote: >> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Removed not_used state of nmethod > > Nice! 
> > We've wanted to clean up some interfaces between the CodeCache and the GC code by using nmethod closures instead of CodeBlob closures. This change (and the Sweeper removal) makes it possible to do those cleanups. > > I've made a superficial pass over the patch and left a few comments. Most of those comments are things that would be nice to fix, but could also be left as follow-up RFEs (if they are deemed to be worthy ideas to pursue). Thank you, @stefank, for the great review. I addressed all your comments locally and will run testing in mach5 before pushing it. Except your suggestion about `find_blob_not_null()` - should be a separate RFE. The same for the suggestion "GC interfaces to work directly against nmethod instead of CodeBlob". ------------- PR Comment: https://git.openjdk.org/jdk/pull/18554#issuecomment-2035354740 From ihse at openjdk.org Wed Apr 3 19:26:00 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Wed, 3 Apr 2024 19:26:00 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF In-Reply-To: <0cUurmXlMJ_B66Wy1umd2n4r9ve7_Q4WOU0ffMd8s5Y=.bbc93b65-382c-4139-aaec-cb835d94a06e@github.com> References: <0cUurmXlMJ_B66Wy1umd2n4r9ve7_Q4WOU0ffMd8s5Y=.bbc93b65-382c-4139-aaec-cb835d94a06e@github.com> Message-ID: On Wed, 3 Apr 2024 14:40:42 GMT, Hamlin Li wrote: > Hi, > Can you help to review the patch? > This pr is based on previous work and discussion in [pr 16234](https://github.com/openjdk/jdk/pull/16234), [pr 18294](https://github.com/openjdk/jdk/pull/18294). > > Compared with previous prs, the major change in this pr is to integrate the source of sleef (for the steps, please check `src/jdk.incubator.vector/linux/native/libvectormath/README`), rather than depending on external sleef things (header or lib) at build or run time. > Besides this change, also modify the previous changes accordingly, e.g. remove some unnecessary files or changes especially in make dir of jdk.
> > Besides of the code changes, one important task is to handle the legal process. > > Thanks! Just a quick question after giving this a glance: My understanding was that the normal libsleef build set a lot of compiler options, e.g. disabling built-in maths etc. You don't seem to set any of these. Have you determined that they were not needed? ------------- PR Comment: https://git.openjdk.org/jdk/pull/18605#issuecomment-2035409207 From kvn at openjdk.org Wed Apr 3 19:32:59 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 3 Apr 2024 19:32:59 GMT Subject: RFR: 8329332: Remove CompiledMethod and CodeBlobLayout classes [v2] In-Reply-To: References: Message-ID: On Mon, 1 Apr 2024 21:07:31 GMT, Vladimir Kozlov wrote: >> Revert [JDK-8152664](https://bugs.openjdk.org/browse/JDK-8152664) RFE [changes](https://github.com/openjdk/jdk/commit/b853eb7f5ca24eeeda18acbb14287f706499c365) which was used for AOT [JEP 295](https://openjdk.org/jeps/295) implementation in JDK 9. The code was left in HotSpot assuming it will help in a future. But during work on Leyden we decided to not use it. In Leyden cached compiled code will be restored in CodeCache as normal nmethods: no need to change VM's runtime and GC code to process them. >> >> I may work on optimizing `CodeBlob` and `nmethod` fields layout to reduce header size in separate changes. In these changes I did simple fields reordering to keep small (1 byte) fields together. >> >> I do not see (and not expected) performance difference with these changes. >> >> Tested tier1-5, xcomp, stress. Running performance testing. >> >> I need help with testing on platforms which Oracle does not support. 
> > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Removed not_used state of nmethod I filed RFEs: [JDK-8329628](https://bugs.openjdk.org/browse/JDK-8329628): Additional changes after JDK-8329332 [JDK-8329629](https://bugs.openjdk.org/browse/JDK-8329629): GC interfaces should work directly against nmethod instead of CodeBlob ------------- PR Comment: https://git.openjdk.org/jdk/pull/18554#issuecomment-2035421134 From duke at openjdk.org Wed Apr 3 19:57:21 2024 From: duke at openjdk.org (=?UTF-8?B?VG9tw6HFoQ==?= Zezula) Date: Wed, 3 Apr 2024 19:57:21 GMT Subject: RFR: JDK-8329564: [JVMCI] TranslatedException::debugPrintStackTrace does not work in the libjvmci compiler. [v3] In-Reply-To: References: Message-ID: > Problem: > The debugging stack traces in `jdk.internal.vm.TranslatedException` do not work in libjvmci because they are enabled via the `jdk.internal.vm.TranslatedException.debug` system property. However, HotSpot system properties are not accessible via `System.getProperty()` in libjvmci. > > Fix: > The value of `jdk.internal.vm.TranslatedException.debug` is passed from the VM via a boolean flag to `VMSupport::decodeAndThrowThrowable()`. Tomáš Zezula has updated the pull request incrementally with one additional commit since the last revision: JDK-8329564: Fixed vmSymbols.hpp formatting.
------------- Changes: - all: https://git.openjdk.org/jdk/pull/18591/files - new: https://git.openjdk.org/jdk/pull/18591/files/3a34ce27..de30637b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18591&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18591&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18591.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18591/head:pull/18591 PR: https://git.openjdk.org/jdk/pull/18591 From kvn at openjdk.org Wed Apr 3 19:59:01 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 3 Apr 2024 19:59:01 GMT Subject: RFR: 8329332: Remove CompiledMethod and CodeBlobLayout classes [v2] In-Reply-To: <5V3m6SFRs8kib1w_Oyyrvr-Yu904MuuPQglr-uei_So=.60542853-0800-4113-b4f0-57d328c5d866@github.com> References: <2MTKEcGChvwWqFFuPs-5TR8GJnLwrnrgGKqdIV4NX70=.d0a707d4-5dfc-467c-992d-d410f94a7dca@github.com> <5V3m6SFRs8kib1w_Oyyrvr-Yu904MuuPQglr-uei_So=.60542853-0800-4113-b4f0-57d328c5d866@github.com> Message-ID: On Wed, 3 Apr 2024 17:50:15 GMT, Stefan Karlsson wrote: >> No special rules here. I simply want to see all `virtual` methods explicitly and `override` is required by C++. >> I would like to keep it this way in these changes. I am investigating possibility to convert all these virtual methods to normal one to remove virtual table and virtual pointer (8 bytes) from CodeBlob class. > > `override` is not required by C++. You do however mark all virtual methods with `override` if any of the functions are marked with `override`. I think it would be good to have a HotSpot code style discussion about this (but not in this RFE). I put `virtual/override` cleanup in CodeBlob as additional suggestion in followup RFE JDK-8329628. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1550370826 From kvn at openjdk.org Wed Apr 3 20:03:11 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 3 Apr 2024 20:03:11 GMT Subject: RFR: 8329332: Remove CompiledMethod and CodeBlobLayout classes [v2] In-Reply-To: References: <2MTKEcGChvwWqFFuPs-5TR8GJnLwrnrgGKqdIV4NX70=.d0a707d4-5dfc-467c-992d-d410f94a7dca@github.com> Message-ID: On Wed, 3 Apr 2024 17:55:38 GMT, Stefan Karlsson wrote: >> No, `CodeCache::blob_count()` uses different macro `FOR_ALL_HEAPS(heap)` because it looks for all code blobs, not only nmethods. >> >> `CodeCache::nmethod_count()` is the only place where `FOR_ALL_NMETHOD_HEAPS ` was used. So I decided to remove the macro. > > I didn't say that blob_count used `FOR_ALL_NMETHODS_HEAP`. I wrote "one of these macros". I still think this adds an inconsistency to the code that I don't think is beneficial. > > With that said, can't this be written as: > > for (CodeHeap* heap : *_nmethod_heaps) { > > > Maybe yet another opportunity for cleanups. I like it and will do it in JDK-8329628. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1550378690 From dnsimon at openjdk.org Wed Apr 3 20:07:59 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Wed, 3 Apr 2024 20:07:59 GMT Subject: RFR: JDK-8329564: [JVMCI] TranslatedException::debugPrintStackTrace does not work in the libjvmci compiler. [v3] In-Reply-To: References: Message-ID: <1H-n6VsDZdR3-2Yhs4jnZPys7_pf-5NfqprTtAWfpGk=.7cc71477-2d6c-4369-969d-a517ca4481c1@github.com> On Wed, 3 Apr 2024 19:57:21 GMT, Tomáš Zezula wrote: >> Problem: >> The debugging stack traces in `jdk.internal.vm.TranslatedException` do not work in libjvmci because they are enabled via the `jdk.internal.vm.TranslatedException.debug` system property. However, HotSpot system properties are not accessible via `System.getProperty()` in libjvmci.
>> >> Fix: >> The value of `jdk.internal.vm.TranslatedException.debug` is passed from the VM via a boolean flag to `VMSupport::decodeAndThrowThrowable()`. > > Tomáš Zezula has updated the pull request incrementally with one additional commit since the last revision: > > JDK-8329564: Fixed vmSymbols.hpp formatting. Marked as reviewed by dnsimon (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18591#pullrequestreview-1977858058 From duke at openjdk.org Wed Apr 3 20:14:11 2024 From: duke at openjdk.org (Tomáš Zezula) Date: Wed, 3 Apr 2024 20:14:11 GMT Subject: Integrated: JDK-8329564: [JVMCI] TranslatedException::debugPrintStackTrace does not work in the libjvmci compiler. In-Reply-To: References: Message-ID: On Wed, 3 Apr 2024 07:02:31 GMT, Tomáš Zezula wrote: > Problem: > The debugging stack traces in `jdk.internal.vm.TranslatedException` do not work in libjvmci because they are enabled via the `jdk.internal.vm.TranslatedException.debug` system property. However, HotSpot system properties are not accessible via `System.getProperty()` in libjvmci. > > Fix: > The value of `jdk.internal.vm.TranslatedException.debug` is passed from the VM via a boolean flag to `VMSupport::decodeAndThrowThrowable()`. This pull request has now been integrated. Changeset: 8267d656 Author: Tomas Zezula Committer: Doug Simon URL: https://git.openjdk.org/jdk/commit/8267d6565d17c8db8f5b50a37482610ffe0a8a5c Stats: 32 lines in 5 files changed: 8 ins; 1 del; 23 mod 8329564: [JVMCI] TranslatedException::debugPrintStackTrace does not work in the libjvmci compiler. Reviewed-by: dnsimon ------------- PR: https://git.openjdk.org/jdk/pull/18591 From dcubed at openjdk.org Wed Apr 3 22:09:11 2024 From: dcubed at openjdk.org (Daniel D.
Daugherty) Date: Wed, 3 Apr 2024 22:09:11 GMT Subject: RFR: 8313332: Simplify lazy jmethodID cache in InstanceKlass [v2] In-Reply-To: References: Message-ID: <9wpF-UcKkG-ycqj4ZmamdW7ZwPG9ymnStT555X4_5ck=.bb747b4b-621f-4275-9180-8f432685cf74@github.com> On Wed, 3 Apr 2024 13:25:36 GMT, Coleen Phillimore wrote: >> This change simplifies the code that grows the jmethodID cache in InstanceKlass. Instead of lazily, when there's a rare request for a jmethodID for an obsolete method, the jmethodID cache is grown during the RedefineClasses safepoint. The InstanceKlass's jmethodID cache is lazily allocated when there's a jmethodID allocated, so not every InstanceKlass has a cache, but the growth now only happens in a safepoint. This code will become racy with the potential change for deallocating jmethodIDs. >> >> Tested with tier1-4, vmTestbase/nsk/jvmti java/lang/instrument tests (in case they're not in tier1-4). > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Refactoring suggested by Serguei. Okay I've crawled thru the changes twice and I went back thru the bug history for this code and added some notes and links to the bug ID. Modulo the nits that I flagged, I think the changes are fine. Making cache growth only happen in the RedefineClasses safepoint is definite improvement. I see that you've run JVM/TI and JLI tests. You should also run JDI tests. Basically for a low level fix like this that affects JVM/TI, you should run Mach5 Tier[1-6]. src/hotspot/share/oops/instanceKlass.cpp line 2272: > 2270: jmethodID InstanceKlass::update_jmethod_id(jmethodID* jmeths, Method* method, int idnum) { > 2271: if (method->is_old() && !method->is_obsolete()) { > 2272: // If the method passed in is old (but not obsolete), use the current version nit: should end with a period. 
src/hotspot/share/oops/instanceKlass.cpp line 2277: > 2275: } > 2276: jmethodID new_id = Method::make_jmethod_id(class_loader_data(), method); > 2277: Atomic::release_store(&jmeths[idnum+1], new_id); nit: spaces around operator `+` src/hotspot/share/oops/instanceKlass.cpp line 2304: > 2302: // > 2303: // If the RedefineClasses() API has been used, then this cache grows > 2304: // in the redefinition safepoint. Much easier to reason about. Thanks for simplifying it. src/hotspot/share/oops/instanceKlass.cpp line 2314: > 2312: assert(size > (size_t)idnum, "should already have space"); > 2313: jmeths = NEW_C_HEAP_ARRAY(jmethodID, size+1, mtClass); > 2314: memset(jmeths, 0, (size+1)*sizeof(jmethodID)); nit: spaces around operator `+` (two places) nit: spaces around operator `*` src/hotspot/share/oops/instanceKlass.cpp line 2325: > 2323: } > 2324: > 2325: jmethodID id = Atomic::load_acquire(&jmeths[idnum+1]); nit: spaces around operator `+` src/hotspot/share/oops/instanceKlass.cpp line 2328: > 2326: if (id == nullptr) { > 2327: MutexLocker ml(JmethodIdCreation_lock, Mutex::_no_safepoint_check_flag); > 2328: id = jmeths[idnum+1]; nit: spaces around operator `+` src/hotspot/share/oops/instanceKlass.cpp line 2343: > 2341: size_t size = idnum_allocated_count(); > 2342: size_t old_size = (size_t)cache[0]; > 2343: if (old_size < size+1) { nit: spaces around operator `+` src/hotspot/share/oops/instanceKlass.cpp line 2344: > 2342: size_t old_size = (size_t)cache[0]; > 2343: if (old_size < size+1) { > 2344: // allocate a larger one and copy entries to the new one. nit typo: s/allocate/Allocate/ src/hotspot/share/oops/instanceKlass.cpp line 2345: > 2343: if (old_size < size+1) { > 2344: // allocate a larger one and copy entries to the new one. > 2345: // They've already been updated to point to new methods where applicable (ie. not obsolete) nit typo: s/ie./i.e.,/ Please add a period at the end of the sentence. 
src/hotspot/share/oops/instanceKlass.cpp line 2347: > 2345: // They've already been updated to point to new methods where applicable (ie. not obsolete) > 2346: jmethodID* new_cache = NEW_C_HEAP_ARRAY(jmethodID, size+1, mtClass); > 2347: memset(new_cache, 0, (size+1)*sizeof(jmethodID)); nit: spaces around operator `+` (two places) nit: spaces around operator `*` src/hotspot/share/oops/instanceKlass.cpp line 2348: > 2346: jmethodID* new_cache = NEW_C_HEAP_ARRAY(jmethodID, size+1, mtClass); > 2347: memset(new_cache, 0, (size+1)*sizeof(jmethodID)); > 2348: // cache size is stored in element[0], other elements offset by one nit typo: s/cache/Cache/ Please add a period at the end. src/hotspot/share/oops/instanceKlass.cpp line 2384: > 2382: int idnum = method->method_idnum(); > 2383: jmethodID* jmeths = methods_jmethod_ids_acquire(); > 2384: return (jmeths != nullptr) ? jmeths[idnum+1] : nullptr; nit: spaces around operator `+` src/hotspot/share/oops/method.cpp line 2200: > 2198: > 2199: ResourceMark rm; > 2200: log_info(jmethod)("Creating jmethodID for Method %s", m->external_name()); Hmmm... will this be too noisy for `info` level? src/hotspot/share/prims/jvmtiRedefineClasses.cpp line 4353: > 4351: the_class->itable().initialize_itable(); > 4352: > 4353: // Update jmethodID cache if present Nit: should end with period. ------------- Marked as reviewed by dcubed (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18549#pullrequestreview-1977965153 PR Review Comment: https://git.openjdk.org/jdk/pull/18549#discussion_r1550473408 PR Review Comment: https://git.openjdk.org/jdk/pull/18549#discussion_r1550476292 PR Review Comment: https://git.openjdk.org/jdk/pull/18549#discussion_r1550488626 PR Review Comment: https://git.openjdk.org/jdk/pull/18549#discussion_r1550492405 PR Review Comment: https://git.openjdk.org/jdk/pull/18549#discussion_r1550491163 PR Review Comment: https://git.openjdk.org/jdk/pull/18549#discussion_r1550492871 PR Review Comment: https://git.openjdk.org/jdk/pull/18549#discussion_r1550499076 PR Review Comment: https://git.openjdk.org/jdk/pull/18549#discussion_r1550499498 PR Review Comment: https://git.openjdk.org/jdk/pull/18549#discussion_r1550500350 PR Review Comment: https://git.openjdk.org/jdk/pull/18549#discussion_r1550501547 PR Review Comment: https://git.openjdk.org/jdk/pull/18549#discussion_r1550503608 PR Review Comment: https://git.openjdk.org/jdk/pull/18549#discussion_r1550505968 PR Review Comment: https://git.openjdk.org/jdk/pull/18549#discussion_r1550458475 PR Review Comment: https://git.openjdk.org/jdk/pull/18549#discussion_r1550449234 From coleenp at openjdk.org Wed Apr 3 23:05:24 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 3 Apr 2024 23:05:24 GMT Subject: RFR: 8313332: Simplify lazy jmethodID cache in InstanceKlass [v3] In-Reply-To: References: Message-ID: > This change simplifies the code that grows the jmethodID cache in InstanceKlass. Instead of lazily, when there's a rare request for a jmethodID for an obsolete method, the jmethodID cache is grown during the RedefineClasses safepoint. The InstanceKlass's jmethodID cache is lazily allocated when there's a jmethodID allocated, so not every InstanceKlass has a cache, but the growth now only happens in a safepoint. This code will become racy with the potential change for deallocating jmethodIDs. 
> > Tested with tier1-4, vmTestbase/nsk/jvmti java/lang/instrument tests (in case they're not in tier1-4). Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Fix spacing and punctuation. make log_info into log_debug. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18549/files - new: https://git.openjdk.org/jdk/pull/18549/files/6576d14d..26bd82d3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18549&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18549&range=01-02 Stats: 13 lines in 3 files changed: 0 ins; 0 del; 13 mod Patch: https://git.openjdk.org/jdk/pull/18549.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18549/head:pull/18549 PR: https://git.openjdk.org/jdk/pull/18549 From coleenp at openjdk.org Wed Apr 3 23:05:24 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 3 Apr 2024 23:05:24 GMT Subject: RFR: 8313332: Simplify lazy jmethodID cache in InstanceKlass [v2] In-Reply-To: References: Message-ID: On Wed, 3 Apr 2024 13:25:36 GMT, Coleen Phillimore wrote: >> This change simplifies the code that grows the jmethodID cache in InstanceKlass. Instead of lazily, when there's a rare request for a jmethodID for an obsolete method, the jmethodID cache is grown during the RedefineClasses safepoint. The InstanceKlass's jmethodID cache is lazily allocated when there's a jmethodID allocated, so not every InstanceKlass has a cache, but the growth now only happens in a safepoint. This code will become racy with the potential change for deallocating jmethodIDs. >> >> Tested with tier1-4, vmTestbase/nsk/jvmti java/lang/instrument tests (in case they're not in tier1-4). > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Refactoring suggested by Serguei. Thanks for reviewing this Dan. I've fixed the nits you pointed out. I ran tier1-4 which includes JDI tests, but I'll run 5, 6 tonight. 
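
The cache layout discussed in this thread (size kept in element 0, entries offset by one, acquire load on the fast path, creation under a lock, growth only while nothing else touches the cache) can be sketched roughly as follows. This is a standalone model under stated assumptions, not the `InstanceKlass` code: `IdCache`, `get_or_create` and `grow` are hypothetical names, `std::mutex` stands in for `JmethodIdCreation_lock`, and plain integers stand in for `jmethodID` values.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <mutex>

// Hypothetical model of the cache described in the thread; not HotSpot code.
class IdCache {
  using Slot = std::atomic<std::uintptr_t>;
  std::unique_ptr<Slot[]> _slots;  // _slots[0] holds the capacity, ids start at index 1
  std::mutex _lock;                // stands in for JmethodIdCreation_lock
  std::uintptr_t _next_id = 1;     // stands in for Method::make_jmethod_id()

public:
  explicit IdCache(std::size_t capacity) : _slots(new Slot[capacity + 1]) {
    for (std::size_t i = 1; i <= capacity; i++) {
      _slots[i].store(0, std::memory_order_relaxed);
    }
    _slots[0].store(capacity, std::memory_order_relaxed);
  }

  std::size_t capacity() const {
    return static_cast<std::size_t>(_slots[0].load(std::memory_order_relaxed));
  }

  // Fast path: acquire load. Slow path: re-check and create under the lock.
  std::uintptr_t get_or_create(std::size_t idnum) {
    assert(idnum < capacity() && "should already have space");
    Slot& slot = _slots[idnum + 1];
    std::uintptr_t id = slot.load(std::memory_order_acquire);
    if (id == 0) {
      std::lock_guard<std::mutex> guard(_lock);
      id = slot.load(std::memory_order_relaxed);
      if (id == 0) {
        id = _next_id++;
        slot.store(id, std::memory_order_release);
      }
    }
    return id;
  }

  // Models growth at a point where no other thread uses the cache,
  // which is the role the RedefineClasses safepoint plays in the thread.
  void grow(std::size_t new_capacity) {
    const std::size_t old_capacity = capacity();
    if (old_capacity >= new_capacity) {
      return;
    }
    std::unique_ptr<Slot[]> bigger(new Slot[new_capacity + 1]);
    for (std::size_t i = 1; i <= new_capacity; i++) {
      const std::uintptr_t v =
          (i <= old_capacity) ? _slots[i].load(std::memory_order_relaxed) : 0;
      bigger[i].store(v, std::memory_order_relaxed);
    }
    bigger[0].store(new_capacity, std::memory_order_relaxed);
    _slots = std::move(bigger);
  }
};
```

Because `grow` replaces the whole array, it is only safe when no reader is in `get_or_create`; confining growth to a safepoint is what makes that assumption hold in the change under review.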
------------- PR Review: https://git.openjdk.org/jdk/pull/18549#pullrequestreview-1978241218 PR Comment: https://git.openjdk.org/jdk/pull/18549#issuecomment-2035752492 From coleenp at openjdk.org Wed Apr 3 23:05:24 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 3 Apr 2024 23:05:24 GMT Subject: RFR: 8313332: Simplify lazy jmethodID cache in InstanceKlass [v2] In-Reply-To: <9wpF-UcKkG-ycqj4ZmamdW7ZwPG9ymnStT555X4_5ck=.bb747b4b-621f-4275-9180-8f432685cf74@github.com> References: <9wpF-UcKkG-ycqj4ZmamdW7ZwPG9ymnStT555X4_5ck=.bb747b4b-621f-4275-9180-8f432685cf74@github.com> Message-ID: On Wed, 3 Apr 2024 20:46:55 GMT, Daniel D. Daugherty wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Refactoring suggested by Serguei. > > src/hotspot/share/oops/method.cpp line 2200: > >> 2198: >> 2199: ResourceMark rm; >> 2200: log_info(jmethod)("Creating jmethodID for Method %s", m->external_name()); > > Hmmm... will this be too noisy for `info` level? I forget that there's a default where -Xlog nothing will print all the info level messages. I changed it to debug. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18549#discussion_r1550615588 From dcubed at openjdk.org Wed Apr 3 23:55:01 2024 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Wed, 3 Apr 2024 23:55:01 GMT Subject: RFR: 8313332: Simplify lazy jmethodID cache in InstanceKlass [v3] In-Reply-To: References: Message-ID: On Wed, 3 Apr 2024 23:05:24 GMT, Coleen Phillimore wrote: >> This change simplifies the code that grows the jmethodID cache in InstanceKlass. Instead of lazily, when there's a rare request for a jmethodID for an obsolete method, the jmethodID cache is grown during the RedefineClasses safepoint. The InstanceKlass's jmethodID cache is lazily allocated when there's a jmethodID allocated, so not every InstanceKlass has a cache, but the growth now only happens in a safepoint. 
This code will become racy with the potential change for deallocating jmethodIDs. >> >> Tested with tier1-4, vmTestbase/nsk/jvmti java/lang/instrument tests (in case they're not in tier1-4). > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix spacing and punctuation. make log_info into log_debug. Thanks for the fixes. There are a couple that you missed. src/hotspot/share/oops/instanceKlass.cpp line 2313: > 2311: size_t size = idnum_allocated_count(); > 2312: assert(size > (size_t)idnum, "should already have space"); > 2313: jmeths = NEW_C_HEAP_ARRAY(jmethodID, size+1, mtClass); nit: spaces around operator `+` src/hotspot/share/oops/instanceKlass.cpp line 2346: > 2344: // Allocate a larger one and copy entries to the new one. > 2345: // They've already been updated to point to new methods where applicable (i.e., not obsolete). > 2346: jmethodID* new_cache = NEW_C_HEAP_ARRAY(jmethodID, size+1, mtClass); nit: spaces around operator `+` ------------- Marked as reviewed by dcubed (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18549#pullrequestreview-1978305784 PR Review Comment: https://git.openjdk.org/jdk/pull/18549#discussion_r1550653423 PR Review Comment: https://git.openjdk.org/jdk/pull/18549#discussion_r1550653504 From kvn at openjdk.org Thu Apr 4 00:05:20 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 4 Apr 2024 00:05:20 GMT Subject: RFR: 8329332: Remove CompiledMethod and CodeBlobLayout classes [v3] In-Reply-To: References: Message-ID: <2sg6I-HBI12rc2LoWYX-A1S5vfMfDyj_5xoykANrZ8g=.6d0e5daa-30e4-45df-990e-c45b63477182@github.com> > Revert [JDK-8152664](https://bugs.openjdk.org/browse/JDK-8152664) RFE [changes](https://github.com/openjdk/jdk/commit/b853eb7f5ca24eeeda18acbb14287f706499c365) which was used for AOT [JEP 295](https://openjdk.org/jeps/295) implementation in JDK 9. The code was left in HotSpot assuming it will help in a future. 
But during work on Leyden we decided to not use it. In Leyden cached compiled code will be restored in CodeCache as normal nmethods: no need to change VM's runtime and GC code to process them. > > I may work on optimizing `CodeBlob` and `nmethod` fields layout to reduce header size in separate changes. In these changes I did simple fields reordering to keep small (1 byte) fields together. > > I do not see (and not expected) performance difference with these changes. > > Tested tier1-5, xcomp, stress. Running performance testing. > > I need help with testing on platforms which Oracle does not support. Vladimir Kozlov has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: - Address comments - Merge branch 'master' into 8329332 - Removed not_used state of nmethod - remove trailing whitespace - 8329332: Remove CompiledMethod and CodeBlobLayout classes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18554/files - new: https://git.openjdk.org/jdk/pull/18554/files/246ff68a..33768fb2 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18554&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18554&range=01-02 Stats: 9283 lines in 197 files changed: 3058 ins; 4514 del; 1711 mod Patch: https://git.openjdk.org/jdk/pull/18554.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18554/head:pull/18554 PR: https://git.openjdk.org/jdk/pull/18554 From coleenp at openjdk.org Thu Apr 4 00:07:34 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 4 Apr 2024 00:07:34 GMT Subject: RFR: 8313332: Simplify lazy jmethodID cache in InstanceKlass [v4] In-Reply-To: References: Message-ID: > This change simplifies the code that grows the jmethodID cache in InstanceKlass. 
Instead of lazily, when there's a rare request for a jmethodID for an obsolete method, the jmethodID cache is grown during the RedefineClasses safepoint. The InstanceKlass's jmethodID cache is lazily allocated when there's a jmethodID allocated, so not every InstanceKlass has a cache, but the growth now only happens in a safepoint. This code will become racy with the potential change for deallocating jmethodIDs. > > Tested with tier1-4, vmTestbase/nsk/jvmti java/lang/instrument tests (in case they're not in tier1-4). Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Two more. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18549/files - new: https://git.openjdk.org/jdk/pull/18549/files/26bd82d3..aab60e28 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18549&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18549&range=02-03 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/18549.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18549/head:pull/18549 PR: https://git.openjdk.org/jdk/pull/18549 From coleenp at openjdk.org Thu Apr 4 00:07:34 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 4 Apr 2024 00:07:34 GMT Subject: RFR: 8313332: Simplify lazy jmethodID cache in InstanceKlass [v3] In-Reply-To: References: Message-ID: On Wed, 3 Apr 2024 23:50:47 GMT, Daniel D. Daugherty wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix spacing and punctuation. make log_info into log_debug. > > src/hotspot/share/oops/instanceKlass.cpp line 2313: > >> 2311: size_t size = idnum_allocated_count(); >> 2312: assert(size > (size_t)idnum, "should already have space"); >> 2313: jmeths = NEW_C_HEAP_ARRAY(jmethodID, size+1, mtClass); > > nit: spaces around operator `+` Got them. 
The good thing about the smaller function is it's easier to find these. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18549#discussion_r1550659977 From kvn at openjdk.org Thu Apr 4 00:40:03 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 4 Apr 2024 00:40:03 GMT Subject: RFR: 8329332: Remove CompiledMethod and CodeBlobLayout classes [v3] In-Reply-To: <2sg6I-HBI12rc2LoWYX-A1S5vfMfDyj_5xoykANrZ8g=.6d0e5daa-30e4-45df-990e-c45b63477182@github.com> References: <2sg6I-HBI12rc2LoWYX-A1S5vfMfDyj_5xoykANrZ8g=.6d0e5daa-30e4-45df-990e-c45b63477182@github.com> Message-ID: On Thu, 4 Apr 2024 00:05:20 GMT, Vladimir Kozlov wrote: >> Revert [JDK-8152664](https://bugs.openjdk.org/browse/JDK-8152664) RFE [changes](https://github.com/openjdk/jdk/commit/b853eb7f5ca24eeeda18acbb14287f706499c365) which was used for AOT [JEP 295](https://openjdk.org/jeps/295) implementation in JDK 9. The code was left in HotSpot assuming it will help in a future. But during work on Leyden we decided to not use it. In Leyden cached compiled code will be restored in CodeCache as normal nmethods: no need to change VM's runtime and GC code to process them. >> >> I may work on optimizing `CodeBlob` and `nmethod` fields layout to reduce header size in separate changes. In these changes I did simple fields reordering to keep small (1 byte) fields together. >> >> I do not see (and not expected) performance difference with these changes. >> >> Tested tier1-5, xcomp, stress. Running performance testing. >> >> I need help with testing on platforms which Oracle does not support. > > Vladimir Kozlov has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains five additional commits since the last revision: > > - Address comments > - Merge branch 'master' into 8329332 > - Removed not_used state of nmethod > - remove trailing whitespace > - 8329332: Remove CompiledMethod and CodeBlobLayout classes GHA `linux-x64-hs-minimal` failure is not related to changes: 2024-04-04T00:07:46.9654262Z ##[warning]Failed to download action 'https://api.github.com/repos/actions/github-script/tarball/60a0d83039c74a4aee543508d2ffcb1c3799cdea'. Error: The request was canceled due to the configured HttpClient.Timeout of 100 seconds elapsing. 2024-04-04T00:07:46.9656929Z ##[warning]Back off 22.252 seconds before retry. 2024-04-04T00:08:52.1252710Z ##[error]The SSL connection could not be established, see inner exception. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18554#issuecomment-2035859221 From dholmes at openjdk.org Thu Apr 4 06:26:19 2024 From: dholmes at openjdk.org (David Holmes) Date: Thu, 4 Apr 2024 06:26:19 GMT Subject: RFR: 8325303: Replace markWord.is_neutral() with markWord.is_unlocked() Message-ID: This is a small tidy up to try and remove confusion between checking `is_neutral` (a general state normally associated with a displaced markword in a "pristine" state) and `is_unlocked` (a specific state within the locking protocol). The underlying bit-pattern is the same and so these have been used somewhat synonymously/interchangeably. A few comment tweaks too. Testing: tiers 1-3 (sanity) Thanks. 
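
To illustrate why the two predicates could be used interchangeably, here is a simplified sketch of a mark word whose low bits encode the lock state. The struct, the two-bit mask and the constant values are assumptions made for illustration and are not copied from `markWord.hpp`; the only point taken from the message above is that both checks test the same bit pattern and differ in the intent they express.

```cpp
#include <cassert>
#include <cstdint>

// Simplified, hypothetical model of the lock bits in a mark word.
// The mask and the values below are assumptions for illustration.
namespace sketch {

constexpr std::uintptr_t lock_mask      = 0x3;  // low two bits
constexpr std::uintptr_t locked_value   = 0x0;
constexpr std::uintptr_t unlocked_value = 0x1;
constexpr std::uintptr_t monitor_value  = 0x2;

struct MarkWord {
  std::uintptr_t bits;

  // A specific state within the locking protocol.
  constexpr bool is_unlocked() const { return (bits & lock_mask) == unlocked_value; }

  // Historically a more general notion, but the same test on the same bits.
  constexpr bool is_neutral() const { return (bits & lock_mask) == unlocked_value; }

  constexpr bool has_monitor() const { return (bits & lock_mask) == monitor_value; }
};

}  // namespace sketch
```

Since the two functions return the same answer for every input in this model, replacing one with the other changes what the call site says, not what it does.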
------------- Commit messages: - Merge - 8325303: Replace markWord.is_neutral() with markWord.is_unlocked() Changes: https://git.openjdk.org/jdk/pull/17741/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17741&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8325303 Stats: 26 lines in 5 files changed: 2 ins; 2 del; 22 mod Patch: https://git.openjdk.org/jdk/pull/17741.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17741/head:pull/17741 PR: https://git.openjdk.org/jdk/pull/17741 From aboldtch at openjdk.org Thu Apr 4 06:57:03 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 4 Apr 2024 06:57:03 GMT Subject: RFR: 8329332: Remove CompiledMethod and CodeBlobLayout classes [v3] In-Reply-To: <2sg6I-HBI12rc2LoWYX-A1S5vfMfDyj_5xoykANrZ8g=.6d0e5daa-30e4-45df-990e-c45b63477182@github.com> References: <2sg6I-HBI12rc2LoWYX-A1S5vfMfDyj_5xoykANrZ8g=.6d0e5daa-30e4-45df-990e-c45b63477182@github.com> Message-ID: <5Mj_wuhYdBmtFIJAD0qrBMlrX1TmTzutO7hLN--mvec=.913fd656-7155-4744-ae8a-0d5266e76cca@github.com> On Thu, 4 Apr 2024 00:05:20 GMT, Vladimir Kozlov wrote: >> Revert [JDK-8152664](https://bugs.openjdk.org/browse/JDK-8152664) RFE [changes](https://github.com/openjdk/jdk/commit/b853eb7f5ca24eeeda18acbb14287f706499c365) which was used for AOT [JEP 295](https://openjdk.org/jeps/295) implementation in JDK 9. The code was left in HotSpot assuming it will help in a future. But during work on Leyden we decided to not use it. In Leyden cached compiled code will be restored in CodeCache as normal nmethods: no need to change VM's runtime and GC code to process them. >> >> I may work on optimizing `CodeBlob` and `nmethod` fields layout to reduce header size in separate changes. In these changes I did simple fields reordering to keep small (1 byte) fields together. >> >> I do not see (and not expected) performance difference with these changes. >> >> Tested tier1-5, xcomp, stress. Running performance testing. 
>> >> I need help with testing on platforms which Oracle does not support. > > Vladimir Kozlov has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: > > - Address comments > - Merge branch 'master' into 8329332 > - Removed not_used state of nmethod > - remove trailing whitespace > - 8329332: Remove CompiledMethod and CodeBlobLayout classes There is a stale comment in `test/jdk/com/sun/jdi/EATests.java:1288` -// (See CompiledMethod::is_at_poll_return()) +// (See nmethod::is_at_poll_return()) ------------- PR Review: https://git.openjdk.org/jdk/pull/18554#pullrequestreview-1978884423 From gcao at openjdk.org Thu Apr 4 07:48:22 2024 From: gcao at openjdk.org (Gui Cao) Date: Thu, 4 Apr 2024 07:48:22 GMT Subject: RFR: 8329641: RISC-V: Enable some tests related to SHA-2 instrinsic Message-ID: <_ODsWQ8fmxP9pD_vamk9rxBBYxeMd7StBQwcOQ2DFts=.5d741439-e57b-4f3a-a4c6-d12d6265cc51@github.com> Hi, I witnessed that some SHA-2 tests are skipped on RISC-V. The supportedCPUFeatures in IntrinsicPredicates.java is not correct for RISC-V, because it should depend on Zvkn extension instead of sha256/sha512. I tested this with QEMU system running linux-6.8 kernel. I used NR_riscv_hwprobe syscall to detect if the system supports the Zvkn extension. Because support for Zvkn extension is not fully tested on real hardwares, the code for detecting and enabling Zvkn extension is not included in this PR. 
The code for detecting Zvkn extension:

``` diff
diff --git a/src/hotspot/os_cpu/linux_riscv/riscv_hwprobe.cpp b/src/hotspot/os_cpu/linux_riscv/riscv_hwprobe.cpp
index df4a2e347cc..ef99acbf7c5 100644
--- a/src/hotspot/os_cpu/linux_riscv/riscv_hwprobe.cpp
+++ b/src/hotspot/os_cpu/linux_riscv/riscv_hwprobe.cpp
@@ -178,6 +178,13 @@ void RiscvHwprobe::add_features_from_query_result() {
   if (is_set(RISCV_HWPROBE_KEY_IMA_EXT_0, RISCV_HWPROBE_EXT_ZFH)) {
     VM_Version::ext_Zfh.enable_feature();
   }
+  if (is_set(RISCV_HWPROBE_KEY_IMA_EXT_0, RISCV_HWPROBE_EXT_ZVKNED)
+      && is_set(RISCV_HWPROBE_KEY_IMA_EXT_0, RISCV_HWPROBE_EXT_ZVKNHB)
+      && is_set(RISCV_HWPROBE_KEY_IMA_EXT_0, RISCV_HWPROBE_EXT_ZVKB)
+      && is_set(RISCV_HWPROBE_KEY_IMA_EXT_0, RISCV_HWPROBE_EXT_ZVKT)) {
+    VM_Version::ext_Zvkn.enable_feature();
+  }
   if (is_valid(RISCV_HWPROBE_KEY_CPUPERF_0)) {
     VM_Version::unaligned_access.enable_feature(
       query[RISCV_HWPROBE_KEY_CPUPERF_0].value & RISCV_HWPROBE_MISALIGNED_MASK);
```

This IntrinsicPredicates.java CPU matching change should only affect the test cases test/hotspot/jtreg/compiler/intrinsics/sha/cli/TestUseSHA256IntrinsicsOptionOnSupportedCPU.java and test/hotspot/jtreg/compiler/intrinsics/sha/cli/TestUseSHA512IntrinsicsOptionOnSupportedCPU.java. Before this patch they are skipped; after this patch they can be selected and pass normally. We can run test/lib-test/jdk/test/whitebox/CPUInfoTest.java to see the actual CPU features.
----------configuration:(0/0)---------- ----------System.out:(4/178)---------- WB.getCPUFeatures(): "rv64 i m a f d c v zba zbb zbs zvkn" CPUInfo.getAdditionalCPUInfo(): "" CPUInfo.getFeatures(): [rv64, i, m, a, f, d, c, v, zba, zbb, zbs, zvkn] TEST PASSED ----------System.err:(2/88)---------- ### Testing - [ ] Run tier1-3, hotspot:tier4 tests on SOPHON SG2042 (release) - [ ] Run tier1-3 tests on ubuntu24(kernel version 6.8 and use qemu-system to boot ubuntu) (release) ------------- Commit messages: - 8329641: RISC-V: Enable some tests related to SHA-2 instrinsic Changes: https://git.openjdk.org/jdk/pull/18611/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18611&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8329641 Stats: 3 lines in 2 files changed: 1 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/18611.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18611/head:pull/18611 PR: https://git.openjdk.org/jdk/pull/18611 From stefank at openjdk.org Thu Apr 4 07:57:12 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 4 Apr 2024 07:57:12 GMT Subject: RFR: 8329332: Remove CompiledMethod and CodeBlobLayout classes [v3] In-Reply-To: <2sg6I-HBI12rc2LoWYX-A1S5vfMfDyj_5xoykANrZ8g=.6d0e5daa-30e4-45df-990e-c45b63477182@github.com> References: <2sg6I-HBI12rc2LoWYX-A1S5vfMfDyj_5xoykANrZ8g=.6d0e5daa-30e4-45df-990e-c45b63477182@github.com> Message-ID: On Thu, 4 Apr 2024 00:05:20 GMT, Vladimir Kozlov wrote: >> Revert [JDK-8152664](https://bugs.openjdk.org/browse/JDK-8152664) RFE [changes](https://github.com/openjdk/jdk/commit/b853eb7f5ca24eeeda18acbb14287f706499c365) which was used for AOT [JEP 295](https://openjdk.org/jeps/295) implementation in JDK 9. The code was left in HotSpot assuming it will help in a future. But during work on Leyden we decided to not use it. In Leyden cached compiled code will be restored in CodeCache as normal nmethods: no need to change VM's runtime and GC code to process them. 
>> >> I may work on optimizing `CodeBlob` and `nmethod` fields layout to reduce header size in separate changes. In these changes I did simple fields reordering to keep small (1 byte) fields together. >> >> I do not see (and not expected) performance difference with these changes. >> >> Tested tier1-5, xcomp, stress. Running performance testing. >> >> I need help with testing on platforms which Oracle does not support. > > Vladimir Kozlov has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: > > - Address comments > - Merge branch 'master' into 8329332 > - Removed not_used state of nmethod > - remove trailing whitespace > - 8329332: Remove CompiledMethod and CodeBlobLayout classes I took a second pass over the changes. I've given a few suggestions below. None of them should require respinning of tests (except for making sure that this still builds). src/hotspot/share/code/codeBlob.hpp line 168: > 166: bool is_vtable_blob() const { return _kind == CodeBlobKind::Blob_Vtable; } > 167: bool is_method_handles_adapter_blob() const { return _kind == CodeBlobKind::Blob_MH_Adapter; } > 168: bool is_upcall_stub() const { return _kind == CodeBlobKind::Blob_Upcall; } The `Blob_` prefix is now redundant since we always have to prefix with CodeBlobKind::. Just a suggestion if you want to shorten these. src/hotspot/share/gc/shared/gcBehaviours.hpp line 31: > 29: #include "oops/oopsHierarchy.hpp" > 30: > 31: // This is the behaviour for checking if a nmethod is unloading Maybe this should be *an* nmethod? src/hotspot/share/gc/shenandoah/shenandoahUnload.cpp line 81: > 79: class ShenandoahIsUnloadingBehaviour : public IsUnloadingBehaviour { > 80: public: > 81: virtual bool has_dead_oop(nmethod* const nm) const { Is there a reason why this was changed to `nmethod* const nm` instead of `nmethod* nm`? 
IsUnloadingBehaviour::has_dead_oop uses `nmethod* nm`. This question applies to the other changes in this file as well. src/hotspot/share/gc/x/xUnload.cpp line 78: > 76: class XIsUnloadingBehaviour : public IsUnloadingBehaviour { > 77: public: > 78: virtual bool has_dead_oop(nmethod* const nm) const { `nmethod* const nm` => `nmethod* nm`. (ZGC's style is to use const for local variables, but not for variables in the parameter list). The same applies to the rest of the changes to this file. src/hotspot/share/gc/z/zUnload.cpp line 77: > 75: class ZIsUnloadingBehaviour : public IsUnloadingBehaviour { > 76: public: > 77: virtual bool has_dead_oop(nmethod* const nm) const { `nmethod* const nm` => `nmethod* nm`. (ZGC's style is to use const for local variables, but not for variables in the parameter list). The same applies to the rest of the changes to this file. src/hotspot/share/runtime/javaThread.hpp line 123: > 121: DeoptResourceMark* _deopt_mark; // Holds special ResourceMark for deoptimization > 122: > 123: nmethod* _deopt_nmethod; // nmethod that is currently being deoptimized The alignment is (and was) weird here. ------------- Marked as reviewed by stefank (Reviewer).
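
On the `nmethod* const nm` versus `nmethod* nm` question above: a top-level `const` on a parameter is not part of the function type, so both spellings declare the same function and an override still matches. That makes the choice purely a style matter, as the review suggests. A small sketch, with `Nm`, `Behaviour` and `MyBehaviour` as hypothetical stand-ins for `nmethod`, `IsUnloadingBehaviour` and a GC-specific subclass:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-ins for illustration only.
struct Nm { bool has_dead; };

struct Behaviour {
  virtual ~Behaviour() = default;
  virtual bool has_dead_oop(Nm* nm) const = 0;
};

struct MyBehaviour : Behaviour {
  // The top-level const only stops the body from reassigning `nm`;
  // the signature is unchanged, so this still overrides the base function.
  bool has_dead_oop(Nm* const nm) const override { return nm->has_dead; }
};

// The two function types are identical as far as the language is concerned.
static_assert(std::is_same<bool(Nm*), bool(Nm* const)>::value,
              "top-level const on a parameter is not part of the function type");

bool check(const Behaviour& b, Nm* nm) { return b.has_dead_oop(nm); }
```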
PR Review: https://git.openjdk.org/jdk/pull/18554#pullrequestreview-1978954058 PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1551107567 PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1551073461 PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1551077362 PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1551080101 PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1551080280 PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1551100400 From fyang at openjdk.org Thu Apr 4 08:14:09 2024 From: fyang at openjdk.org (Fei Yang) Date: Thu, 4 Apr 2024 08:14:09 GMT Subject: RFR: 8329083: RISC-V: Update profiles supported on riscv In-Reply-To: References: Message-ID: On Wed, 3 Apr 2024 10:25:31 GMT, Hamlin Li wrote: > Hi, > Can you help to review this patch to update vm flags related to riscv profile? > Thanks > > Currently there are vm options like -XX:+UseRVA20U64 and -XX:+UseRVA22U64 on riscv to indicate the supported riscv extension via profiles. > These profiles should be updated to reflect the full supported extensions and new profile like UseRVA23U64 should be added too. src/hotspot/cpu/riscv/globals_riscv.hpp line 104: > 102: product(bool, UseRVC, false, "Use RVC instructions") \ > 103: product(bool, UseRVA22U64, false, EXPERIMENTAL, "Use RVA22U64 profile") \ > 104: product(bool, UseRVA23U64, false, EXPERIMENTAL, "Use RVA23U64 profile") \ Can we group the three RV profile-related options together? src/hotspot/cpu/riscv/vm_version_riscv.hpp line 191: > 189: #define RV_USE_RVA22U64 \ > 190: RV_ENABLE_EXTENSION(UseRVC) \ > 191: RV_ENABLE_EXTENSION(UseRVV) \ Note that V was optional in RVA22U64. src/hotspot/cpu/riscv/vm_version_riscv.hpp line 208: > 206: RV_ENABLE_EXTENSION(UseRVC) \ > 207: RV_ENABLE_EXTENSION(UseRVV) \ > 208: RV_ENABLE_EXTENSION(UseZacas) \ Zacas is RVA23U64 Optional Extension. 
src/hotspot/cpu/riscv/vm_version_riscv.hpp line 219: > 217: RV_ENABLE_EXTENSION(UseZicboz) \ > 218: RV_ENABLE_EXTENSION(UseZihintpause) \ > 219: RV_ENABLE_EXTENSION(UseZvfh) \ Zvfh is RVA23U64 Optional Extension too. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18599#discussion_r1551137496 PR Review Comment: https://git.openjdk.org/jdk/pull/18599#discussion_r1551130158 PR Review Comment: https://git.openjdk.org/jdk/pull/18599#discussion_r1551134056 PR Review Comment: https://git.openjdk.org/jdk/pull/18599#discussion_r1551136167 From fyang at openjdk.org Thu Apr 4 08:17:59 2024 From: fyang at openjdk.org (Fei Yang) Date: Thu, 4 Apr 2024 08:17:59 GMT Subject: RFR: 8329641: RISC-V: Enable some tests related to SHA-2 instrinsic In-Reply-To: <_ODsWQ8fmxP9pD_vamk9rxBBYxeMd7StBQwcOQ2DFts=.5d741439-e57b-4f3a-a4c6-d12d6265cc51@github.com> References: <_ODsWQ8fmxP9pD_vamk9rxBBYxeMd7StBQwcOQ2DFts=.5d741439-e57b-4f3a-a4c6-d12d6265cc51@github.com> Message-ID: On Thu, 4 Apr 2024 04:16:27 GMT, Gui Cao wrote: > Hi, I witnessed that some SHA-2 tests are skipped on RISC-V. The supportedCPUFeatures in IntrinsicPredicates.java is not correct for RISC-V, because it should depend on Zvkn extension instead of sha256/sha512. I tested this with QEMU system running linux-6.8 kernel. I used NR_riscv_hwprobe syscall to detect if the system supports the Zvkn extension. Because support for Zvkn extension is not fully tested on real hardwares, the code for detecting and enabling Zvkn extension is not included in this PR. 
> > The code for detecting Zvkn extension > ``` diff > diff --git a/src/hotspot/os_cpu/linux_riscv/riscv_hwprobe.cpp b/src/hotspot/os_cpu/linux_riscv/riscv_hwprobe.cpp > index df4a2e347cc..ef99acbf7c5 100644 > --- a/src/hotspot/os_cpu/linux_riscv/riscv_hwprobe.cpp > +++ b/src/hotspot/os_cpu/linux_riscv/riscv_hwprobe.cpp > @@ -178,6 +178,13 @@ void RiscvHwprobe::add_features_from_query_result() { > if (is_set(RISCV_HWPROBE_KEY_IMA_EXT_0, RISCV_HWPROBE_EXT_ZFH)) { > VM_Version::ext_Zfh.enable_feature(); > } > + if (is_set(RISCV_HWPROBE_KEY_IMA_EXT_0, RISCV_HWPROBE_EXT_ZVKNED) > + && is_set(RISCV_HWPROBE_KEY_IMA_EXT_0, RISCV_HWPROBE_EXT_ZVKNHB) > + && is_set(RISCV_HWPROBE_KEY_IMA_EXT_0, RISCV_HWPROBE_EXT_ZVKB) > + && is_set(RISCV_HWPROBE_KEY_IMA_EXT_0, RISCV_HWPROBE_EXT_ZVKT)) { > + VM_Version::ext_Zvkn.enable_feature(); > + } > if (is_valid(RISCV_HWPROBE_KEY_CPUPERF_0)) { > VM_Version::unaligned_access.enable_feature( > query[RISCV_HWPROBE_KEY_CPUPERF_0].value & RISCV_HWPROBE_MISALIGNED_MASK); > > > This IntrinsicPredicates.java CPU matching change should only affect test/hotspot/jtreg/compiler/intrinsics/sha/cli/TestUseSHA256IntrinsicsOptionOnSupportedCPU. java, test/hotspot/jtreg/compiler/intrinsics/sha/cli/TestUseSHA512IntrinsicsOptionOnSupportedCPU.java test case. Before this patch they are skipped, after this patch they can be selected and pass normally. > > We can test test/lib-test/jdk/test/whitebox/CPUInfoTest.java to see the actual CPU Features. > > ----------configuration:(0/0)---------- > ----------System.out:(4/178)---------- > WB.getCPUFeatures(): "rv64 i m a f d c v zba zbb zbs zvkn" > CPUInfo.getAdditionalCPUInfo(): "" > CPUInfo.getFeatures(): [rv64, i, m, a, f, d, c, v, zba, zbb, zbs, zvkn] > TEST PASSED > ----------System.err:(2/88)---------- > > > ### Testing > - [ ] Run tier1-3, hotspot:tier4 tests on SOPHON SG2042 (release) > - [ ] Run tier1-3 tests on ubuntu24(kernel version 6.8 and use qemu-system to boot ubuntu) (release) LGTM. 
@robehn who worked on SHA-2 intrinsic on RISC-V might want to take a look. ------------- Marked as reviewed by fyang (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18611#pullrequestreview-1979064606 From stefank at openjdk.org Thu Apr 4 08:28:59 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 4 Apr 2024 08:28:59 GMT Subject: RFR: 8325303: Replace markWord.is_neutral() with markWord.is_unlocked() In-Reply-To: References: Message-ID: On Wed, 7 Feb 2024 01:52:00 GMT, David Holmes wrote: > This is a small tidy up to try and remove confusion between checking `is_neutral` (a general state normally associated with a displaced markword in a "pristine" state) and `is_unlocked` (a specific state within the locking protocol). The underlying bit-pattern is the same and so these have been used somewhat synonymously/interchangeably. > > A few comment tweaks too. > > Testing: tiers 1-3 (sanity) > > Thanks. This seems reasonable to me. Are there any use-cases of `is_neutral()` left? Could you explain why we use `is_neutral()` there and not `is_locked()`? ------------- Marked as reviewed by stefank (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17741#pullrequestreview-1979113931 From rehn at openjdk.org Thu Apr 4 08:55:09 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 4 Apr 2024 08:55:09 GMT Subject: RFR: 8329641: RISC-V: Enable some tests related to SHA-2 instrinsic In-Reply-To: <_ODsWQ8fmxP9pD_vamk9rxBBYxeMd7StBQwcOQ2DFts=.5d741439-e57b-4f3a-a4c6-d12d6265cc51@github.com> References: <_ODsWQ8fmxP9pD_vamk9rxBBYxeMd7StBQwcOQ2DFts=.5d741439-e57b-4f3a-a4c6-d12d6265cc51@github.com> Message-ID: On Thu, 4 Apr 2024 04:16:27 GMT, Gui Cao wrote: > Hi, I witnessed that some SHA-2 tests are skipped on RISC-V. The supportedCPUFeatures in IntrinsicPredicates.java is not correct for RISC-V, because it should depend on Zvkn extension instead of sha256/sha512. I tested this with QEMU system running linux-6.8 kernel. 
I used NR_riscv_hwprobe syscall to detect if the system supports the Zvkn extension. Because support for Zvkn extension is not fully tested on real hardwares, the code for detecting and enabling Zvkn extension is not included in this PR. > > The code for detecting Zvkn extension > ``` diff > diff --git a/src/hotspot/os_cpu/linux_riscv/riscv_hwprobe.cpp b/src/hotspot/os_cpu/linux_riscv/riscv_hwprobe.cpp > index df4a2e347cc..ef99acbf7c5 100644 > --- a/src/hotspot/os_cpu/linux_riscv/riscv_hwprobe.cpp > +++ b/src/hotspot/os_cpu/linux_riscv/riscv_hwprobe.cpp > @@ -178,6 +178,13 @@ void RiscvHwprobe::add_features_from_query_result() { > if (is_set(RISCV_HWPROBE_KEY_IMA_EXT_0, RISCV_HWPROBE_EXT_ZFH)) { > VM_Version::ext_Zfh.enable_feature(); > } > + if (is_set(RISCV_HWPROBE_KEY_IMA_EXT_0, RISCV_HWPROBE_EXT_ZVKNED) > + && is_set(RISCV_HWPROBE_KEY_IMA_EXT_0, RISCV_HWPROBE_EXT_ZVKNHB) > + && is_set(RISCV_HWPROBE_KEY_IMA_EXT_0, RISCV_HWPROBE_EXT_ZVKB) > + && is_set(RISCV_HWPROBE_KEY_IMA_EXT_0, RISCV_HWPROBE_EXT_ZVKT)) { > + VM_Version::ext_Zvkn.enable_feature(); > + } > if (is_valid(RISCV_HWPROBE_KEY_CPUPERF_0)) { > VM_Version::unaligned_access.enable_feature( > query[RISCV_HWPROBE_KEY_CPUPERF_0].value & RISCV_HWPROBE_MISALIGNED_MASK); > > > This IntrinsicPredicates.java CPU matching change should only affect test/hotspot/jtreg/compiler/intrinsics/sha/cli/TestUseSHA256IntrinsicsOptionOnSupportedCPU. java, test/hotspot/jtreg/compiler/intrinsics/sha/cli/TestUseSHA512IntrinsicsOptionOnSupportedCPU.java test case. Before this patch they are skipped, after this patch they can be selected and pass normally. > > We can test test/lib-test/jdk/test/whitebox/CPUInfoTest.java to see the actual CPU Features. 
> > ----------configuration:(0/0)---------- > ----------System.out:(4/178)---------- > WB.getCPUFeatures(): "rv64 i m a f d c v zba zbb zbs zvkn" > CPUInfo.getAdditionalCPUInfo(): "" > CPUInfo.getFeatures(): [rv64, i, m, a, f, d, c, v, zba, zbb, zbs, zvkn] > TEST PASSED > ----------System.err:(2/88)---------- > > > ### Testing > - [ ] Run tier1-3, hotspot:tier4 tests on SOPHON SG2042 (release) > - [ ] Run tier1-3 tests on ubuntu24(kernel version 6.8 and use qemu-system to boot ubuntu) (release) Marked as reviewed by rehn (Reviewer). Hey, good. Sorry I was wrong! Re-checked! Your code is actually correct, my bad! ------------- PR Review: https://git.openjdk.org/jdk/pull/18611#pullrequestreview-1979218025 PR Comment: https://git.openjdk.org/jdk/pull/18611#issuecomment-2036571909 From rcastanedalo at openjdk.org Thu Apr 4 08:56:19 2024 From: rcastanedalo at openjdk.org (Roberto Castañeda Lozano) Date: Thu, 4 Apr 2024 08:56:19 GMT Subject: RFR: 8329261: G1: interpreter post-barrier x86 code asserts index size of wrong buffer Message-ID: This changeset updates an assert in G1's interpreter x86 post-barrier logic so that it refers to the right queue (`G1DirtyCardQueue` rather than pre-barrier's `SATBMarkQueue`) and moves the assert closer to the logic that exploits it. Thanks to Kim Barrett for reporting the issue and suggesting the fix. **Testing**: built on windows-x64, linux-x64, and macosx-x64.
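As a rough illustration of the idea described in this changeset (checking the size of the queue that the code actually touches, and placing the check next to the logic that relies on it), here is a minimal sketch. `DirtyCardQueueSketch` and `enqueue_card` are hypothetical names, and the byte-offset, count-down index is an assumption made for the example; this is not the HotSpot assembler code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a dirty card queue; not HotSpot's G1DirtyCardQueue.
struct DirtyCardQueueSketch {
  size_t index;   // assumed: byte offset of the next free slot, counting down
  void** buffer;  // assumed: storage for the enqueued card pointers
};

// Pushes 'card' if there is room; returns false when the buffer is full.
inline bool enqueue_card(DirtyCardQueueSketch& q, void* card) {
  if (q.index == 0) {
    return false;  // no free slot left
  }
  // The arithmetic below treats 'index' as a pointer-sized byte offset,
  // so the compile-time check sits right next to the code that depends on it
  // and names the queue type that is actually used here.
  static_assert(sizeof(DirtyCardQueueSketch::index) == sizeof(intptr_t),
                "index must be pointer sized");
  q.index -= sizeof(void*);
  q.buffer[q.index / sizeof(void*)] = card;
  return true;
}
```

Keeping the assertion adjacent to the dependent arithmetic makes it harder for the two to drift apart, which is presumably the motivation for moving the assert.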
------------- Commit messages: - Make comment more precise - Replace 'SATBMarkQueue' with 'G1DirtyCardQueue' and move assertion closer to the use Changes: https://git.openjdk.org/jdk/pull/18616/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18616&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8329261 Stats: 5 lines in 1 file changed: 3 ins; 2 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18616.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18616/head:pull/18616 PR: https://git.openjdk.org/jdk/pull/18616 From mli at openjdk.org Thu Apr 4 09:05:10 2024 From: mli at openjdk.org (Hamlin Li) Date: Thu, 4 Apr 2024 09:05:10 GMT Subject: RFR: 8329083: RISC-V: Update profiles supported on riscv In-Reply-To: References: Message-ID: <0505WfkaVUmQATYeaBaXqBGoSd4O26wNBKNnftD-zZE=.dd340e3f-a4fa-4302-bf92-64c20c459dc5@github.com> On Thu, 4 Apr 2024 08:10:53 GMT, Fei Yang wrote: >> Hi, >> Can you help to review this patch to update vm flags related to riscv profile? >> Thanks >> >> Currently there are vm options like -XX:+UseRVA20U64 and -XX:+UseRVA22U64 on riscv to indicate the supported riscv extension via profiles. >> These profiles should be updated to reflect the full supported extensions and new profile like UseRVA23U64 should be added too. > > src/hotspot/cpu/riscv/globals_riscv.hpp line 104: > >> 102: product(bool, UseRVC, false, "Use RVC instructions") \ >> 103: product(bool, UseRVA22U64, false, EXPERIMENTAL, "Use RVA22U64 profile") \ >> 104: product(bool, UseRVA23U64, false, EXPERIMENTAL, "Use RVA23U64 profile") \ > > Can we group the three RV profile-related options together? 
Sure, will do ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18599#discussion_r1551279424 From mli at openjdk.org Thu Apr 4 09:08:09 2024 From: mli at openjdk.org (Hamlin Li) Date: Thu, 4 Apr 2024 09:08:09 GMT Subject: RFR: 8329083: RISC-V: Update profiles supported on riscv In-Reply-To: References: Message-ID: On Thu, 4 Apr 2024 08:05:45 GMT, Fei Yang wrote: >> Hi, >> Can you help to review this patch to update vm flags related to riscv profile? >> Thanks >> >> Currently there are vm options like -XX:+UseRVA20U64 and -XX:+UseRVA22U64 on riscv to indicate the supported riscv extension via profiles. >> These profiles should be updated to reflect the full supported extensions and new profile like UseRVA23U64 should be added too. > > src/hotspot/cpu/riscv/vm_version_riscv.hpp line 191: > >> 189: #define RV_USE_RVA22U64 \ >> 190: RV_ENABLE_EXTENSION(UseRVC) \ >> 191: RV_ENABLE_EXTENSION(UseRVV) \ > > Note that V was optional in RVA22U64. In fact, when I worked on it, I was a bit confused about what it should be: including optional extensions or not. Previously, zfh was already enabled by RVA22U64, although it is optional per the [spec](https://github.com/riscv/riscv-profiles/blob/main/profiles.adoc#613-rva22u64-optional-extensions), so I followed this pattern. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18599#discussion_r1551284845 From fyang at openjdk.org Thu Apr 4 09:20:01 2024 From: fyang at openjdk.org (Fei Yang) Date: Thu, 4 Apr 2024 09:20:01 GMT Subject: RFR: 8329083: RISC-V: Update profiles supported on riscv In-Reply-To: References: Message-ID: On Thu, 4 Apr 2024 09:05:11 GMT, Hamlin Li wrote: >> src/hotspot/cpu/riscv/vm_version_riscv.hpp line 191: >> >>> 189: #define RV_USE_RVA22U64 \ >>> 190: RV_ENABLE_EXTENSION(UseRVC) \ >>> 191: RV_ENABLE_EXTENSION(UseRVV) \ >> >> Note that V was optional in RVA22U64. > > In fact, when I worked on it, I was a bit confused about what it should be: including optional extensions or not.
> > Previously, zfh was already enabled by RVA22U64, although it is optional per the [spec](https://github.com/riscv/riscv-profiles/blob/main/profiles.adoc#613-rva22u64-optional-extensions), so I followed this pattern. Yeah, I think that's a mistake. I don't think it's safe to assume availability of optional extensions. We should only assume mandatory extensions in each profile. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18599#discussion_r1551303147 From aboldtch at openjdk.org Thu Apr 4 09:25:00 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 4 Apr 2024 09:25:00 GMT Subject: RFR: 8329261: G1: interpreter post-barrier x86 code asserts index size of wrong buffer In-Reply-To: References: Message-ID: On Thu, 4 Apr 2024 08:36:37 GMT, Roberto Castañeda Lozano wrote: > This changeset updates an assert in G1's interpreter x86 post-barrier logic so that it refers to the right queue (`G1DirtyCardQueue` rather than pre-barrier's `SATBMarkQueue`) and moves the assert closer to the logic that exploits it. > > Thanks to Kim Barrett for reporting the issue and suggesting the fix. > > **Testing**: built on windows-x64, linux-x64, and macosx-x64. Looks good. It is a little strange to me to use `pointer sized`, `sizeof(intptr_t)` and `wordSize` so interchangeably. Side note: The assert that used the correct queue type (in `generate_c1_pre_barrier_runtime_stub`) now has a different style w.r.t. having the static assert at the top of the function vs. next to the specific logic. ------------- Marked as reviewed by aboldtch (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/18616#pullrequestreview-1979288356 From rcastanedalo at openjdk.org Thu Apr 4 09:37:09 2024 From: rcastanedalo at openjdk.org (Roberto Castañeda Lozano) Date: Thu, 4 Apr 2024 09:37:09 GMT Subject: RFR: 8329261: G1: interpreter post-barrier x86 code asserts index size of wrong buffer In-Reply-To: References: Message-ID: On Thu, 4 Apr 2024 09:22:23 GMT, Axel Boldt-Christmas wrote: > Looks good. Thanks for reviewing, Axel! > It is a little strange to me to use `pointer sized`, `sizeof(intptr_t)` and `wordSize` so interchangeably. > > Side note: The assert that used the correct queue type (in `generate_c1_pre_barrier_runtime_stub`) now has a different style w.r.t. having the static assert at the top of the function vs. next to the specific logic. I agree, the whole `g1BarrierSetAssembler_x86.cpp` file would probably benefit from some refactoring to enforce consistency across different barrier implementations. But I think it would be best to postpone that cleanup to after [JDK-8322295](https://bugs.openjdk.org/browse/JDK-8322295) is implemented. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18616#issuecomment-2036679814 From mli at openjdk.org Thu Apr 4 09:47:59 2024 From: mli at openjdk.org (Hamlin Li) Date: Thu, 4 Apr 2024 09:47:59 GMT Subject: RFR: 8329083: RISC-V: Update profiles supported on riscv In-Reply-To: References: Message-ID: On Thu, 4 Apr 2024 09:17:13 GMT, Fei Yang wrote: >> In fact, when I worked on it, I was a bit confused about what it should be: including optional extensions or not. >> >> Previously, zfh was already enabled by RVA22U64, although it is optional per the [spec](https://github.com/riscv/riscv-profiles/blob/main/profiles.adoc#613-rva22u64-optional-extensions), so I followed this pattern. > > Yeah, I think that's a mistake. I don't think it's safe to assume availability of optional extensions. We should only assume mandatory extensions in each profile.
Sure, I will correct it, thanks for the discussion. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18599#discussion_r1551350888 From stefank at openjdk.org Thu Apr 4 09:51:17 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 4 Apr 2024 09:51:17 GMT Subject: RFR: 8329655: Cleanup KlassObj and klassOop names after the PermGen removal Message-ID: We have a few places that use the terms `KlassObj` and `klassOop` when referring to Klasses. This is old code from before the PermGen removal, when Klasses were also Java objects. These names tripped me up when I was reading heapInspection.cpp and at first thought we were mixing the klass *mirror* objects and klass pointers in the hash code calculation: // An aligned reference address (typically the least // address in the perm gen) used for hashing klass // objects. HeapWord* _ref; ... _ref = (HeapWord*) Universe::boolArrayKlassObj(); ... uint KlassInfoTable::hash(const Klass* p) { return (uint)(((uintptr_t)p - (uintptr_t)_ref) >> 2); } I propose that we rename these functions (and stop casting the Klass* to a (HeapWord*)). Tested with serviceability/dcmd/gc/ClassHistogramTest.java but will run this through our lower tiers.
------------- Commit messages: - 8329655: Cleanup KlassObj and klassOop names after the PermGen removal Changes: https://git.openjdk.org/jdk/pull/18618/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18618&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8329655 Stats: 125 lines in 29 files changed: 0 ins; 2 del; 123 mod Patch: https://git.openjdk.org/jdk/pull/18618.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18618/head:pull/18618 PR: https://git.openjdk.org/jdk/pull/18618 From rkennke at openjdk.org Thu Apr 4 10:12:10 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 4 Apr 2024 10:12:10 GMT Subject: RFR: 8329655: Cleanup KlassObj and klassOop names after the PermGen removal In-Reply-To: References: Message-ID: <3E71V_pKhViBwx-i4vc5T6hk8DZnTOImwI4h5fQzgfE=.8e8e4a69-4f56-463b-b508-bdd5672862a6@github.com> On Thu, 4 Apr 2024 09:45:58 GMT, Stefan Karlsson wrote: > We have a few places that use the terms `KlassObj` and `klassOop` when referring to Klasses. This is old code from before the PermGen removal, when Klasses were also Java objects. > > These names tripped me up when I was reading heapInspection.cpp and at first thought we were mixing the klass *mirror* objects and klass pointers in the hash code calculation: > > // An aligned reference address (typically the least > // address in the perm gen) used for hashing klass > // objects. > HeapWord* _ref; > ... > _ref = (HeapWord*) Universe::boolArrayKlassObj(); > ... > uint KlassInfoTable::hash(const Klass* p) { > return (uint)(((uintptr_t)p - (uintptr_t)_ref) >> 2); > } > > > I propose that we rename these functions (and stop casting the Klass* to a (HeapWord*)). > > Tested with serviceability/dcmd/gc/ClassHistogramTest.java but will run this through our lower tiers. This is a useful change; it has tripped me up a couple of times, too. Change mostly looks good, just a few minor suggestions.
src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 1202: > 1200: ldrw(scan_temp, Address(recv_klass, Klass::vtable_length_offset())); > 1201: > 1202: // %%% Could store the aligned, prescaled offset in the klassoop. Unrelated, but what's the point of the %%% in all those comments? Might want to remove that, while you're there. src/hotspot/share/memory/heapInspection.cpp line 173: > 171: KlassInfoTable::KlassInfoTable(bool add_all_classes) { > 172: _size_of_instances_in_words = 0; > 173: _ref = (uintptr_t) Universe::boolArrayKlass(); It seems weird (non-obvious) to cast to uintptr_t here. I see it is only used in KlassInfoTable::hash(), which is weird, too. I am not sure that this even does a useful hashing. Might be worth to get rid of the whole thing and use the [fastHash](https://github.com/rkennke/jdk/blob/JDK-8305896/src/hotspot/share/utilities/fastHash.hpp) stuff that @rose00 proposed for Lilliput. Perhaps in a follow-up. I'd probably either cast to void* or Klass*, or cast to uintptr_t as you did and remove the unnecessary cast in ::hash(). ------------- Changes requested by rkennke (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18618#pullrequestreview-1979371014 PR Review Comment: https://git.openjdk.org/jdk/pull/18618#discussion_r1551366745 PR Review Comment: https://git.openjdk.org/jdk/pull/18618#discussion_r1551384583 From gli at openjdk.org Thu Apr 4 10:37:26 2024 From: gli at openjdk.org (Guoxiong Li) Date: Thu, 4 Apr 2024 10:37:26 GMT Subject: RFR: 8329521: Serial: Rename MarkSweep to SerialFullGC Message-ID: <-wPqqvYOVnL3i0eltWuK9_x7WiGu8OmPCPtkz0Fm0h8=.08b61e62-d319-45cd-a752-d31005c23035@github.com> Hi all, This patch renames the `MarkSweep` to `SerialFullGC` and fixes some comments related to `MarkSweep`. The tests `make test-tier1_gc` passed locally. Thanks for taking the time to review. 
Best Regards, -- Guoxiong ------------- Commit messages: - JDK-8329521 Changes: https://git.openjdk.org/jdk/pull/18619/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18619&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8329521 Stats: 1631 lines in 12 files changed: 806 ins; 807 del; 18 mod Patch: https://git.openjdk.org/jdk/pull/18619.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18619/head:pull/18619 PR: https://git.openjdk.org/jdk/pull/18619 From ayang at openjdk.org Thu Apr 4 11:01:09 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 4 Apr 2024 11:01:09 GMT Subject: RFR: 8328698: oopDesc::klass_raw() decodes without a null check In-Reply-To: References: Message-ID: On Wed, 3 Apr 2024 09:27:16 GMT, Stefan Karlsson wrote: > The oopDesc::klass_raw() function is used when the caller wants to skip asserts. Unfortunately, it skips the check to see if the narrow klass is zero, which could lead to an incorrect Klass* being returned. This patch fixes this. > > In addition to this, I'm trying to make the code a bit clearer, so the patch also contains changes for the following: > > * The word raw has various different meanings in the context of oops and klasses. So, what does it mean in this context? Does it mean "read the klass pointer value without decoding it"? Or does it mean "decode the klass pointer value without any asserts"? I would like to propose that we use a name that describes that this function is used to skip performing various asserts. > > * I replaced the one usage of load_klass_raw with a call to klass_raw() instead. > > * I restructured `is_oop_safe` so that we perform the null-check first. Note that `oopDesc::is_oop` performs its own verification of the klass pointer, so if we want extra klass verification in `is_oop_safe` we need to do it before calling the `is_oop` check. > > * I also renamed the _raw functions inside the CompressedKlassPointers klass and moved private functions.
> > Tell me if you think some of these should be split up into separate RFEs. > > Tested with tier1-3. Marked as reviewed by ayang (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18597#pullrequestreview-1979510778 From ayang at openjdk.org Thu Apr 4 11:05:12 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 4 Apr 2024 11:05:12 GMT Subject: RFR: 8329521: Serial: Rename MarkSweep to SerialFullGC In-Reply-To: <-wPqqvYOVnL3i0eltWuK9_x7WiGu8OmPCPtkz0Fm0h8=.08b61e62-d319-45cd-a752-d31005c23035@github.com> References: <-wPqqvYOVnL3i0eltWuK9_x7WiGu8OmPCPtkz0Fm0h8=.08b61e62-d319-45cd-a752-d31005c23035@github.com> Message-ID: <8H7hT2UWFKAMOaP9G3iuvA5SmsWWSAYJzbNjW3ajpUk=.99ea3d0f-3134-42b5-947a-d92bc76e738e@github.com> On Thu, 4 Apr 2024 10:32:43 GMT, Guoxiong Li wrote: > Hi all, > > This patch renames the `MarkSweep` to `SerialFullGC` and fixes some comments related to `MarkSweep`. > > The tests `make test-tier1_gc` passed locally. Thanks for taking the time to review. > > Best Regards, > -- Guoxiong Marked as reviewed by ayang (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18619#pullrequestreview-1979519257 From iwalulya at openjdk.org Thu Apr 4 11:26:09 2024 From: iwalulya at openjdk.org (Ivan Walulya) Date: Thu, 4 Apr 2024 11:26:09 GMT Subject: RFR: 8329521: Serial: Rename MarkSweep to SerialFullGC In-Reply-To: <-wPqqvYOVnL3i0eltWuK9_x7WiGu8OmPCPtkz0Fm0h8=.08b61e62-d319-45cd-a752-d31005c23035@github.com> References: <-wPqqvYOVnL3i0eltWuK9_x7WiGu8OmPCPtkz0Fm0h8=.08b61e62-d319-45cd-a752-d31005c23035@github.com> Message-ID: On Thu, 4 Apr 2024 10:32:43 GMT, Guoxiong Li wrote: > Hi all, > > This patch renames the `MarkSweep` to `SerialFullGC` and fixes some comments related to `MarkSweep`. > > The tests `make test-tier1_gc` passed locally. Thanks for taking the time to review. > > Best Regards, > -- Guoxiong LGTM! Minor nit. 
src/hotspot/share/gc/serial/tenuredGeneration.cpp line 28: > 26: #include "gc/serial/cardTableRS.hpp" > 27: #include "gc/serial/serialFullGC.hpp" > 28: #include "gc/serial/serialBlockOffsetTable.inline.hpp" Ordering of includes ------------- Marked as reviewed by iwalulya (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18619#pullrequestreview-1979565447 PR Review Comment: https://git.openjdk.org/jdk/pull/18619#discussion_r1551487819 From gli at openjdk.org Thu Apr 4 11:41:28 2024 From: gli at openjdk.org (Guoxiong Li) Date: Thu, 4 Apr 2024 11:41:28 GMT Subject: RFR: 8329521: Serial: Rename MarkSweep to SerialFullGC [v2] In-Reply-To: <-wPqqvYOVnL3i0eltWuK9_x7WiGu8OmPCPtkz0Fm0h8=.08b61e62-d319-45cd-a752-d31005c23035@github.com> References: <-wPqqvYOVnL3i0eltWuK9_x7WiGu8OmPCPtkz0Fm0h8=.08b61e62-d319-45cd-a752-d31005c23035@github.com> Message-ID: > Hi all, > > This patch renames the `MarkSweep` to `SerialFullGC` and fixes some comments related to `MarkSweep`. > > The tests `make test-tier1_gc` passed locally. Thanks for taking the time to review. > > Best Regards, > -- Guoxiong Guoxiong Li has updated the pull request incrementally with one additional commit since the last revision: Fix order of included files. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/18619/files - new: https://git.openjdk.org/jdk/pull/18619/files/29615768..350edb3b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18619&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18619&range=00-01 Stats: 2 lines in 1 file changed: 1 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18619.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18619/head:pull/18619 PR: https://git.openjdk.org/jdk/pull/18619 From gli at openjdk.org Thu Apr 4 11:41:28 2024 From: gli at openjdk.org (Guoxiong Li) Date: Thu, 4 Apr 2024 11:41:28 GMT Subject: RFR: 8329521: Serial: Rename MarkSweep to SerialFullGC [v2] In-Reply-To: References: <-wPqqvYOVnL3i0eltWuK9_x7WiGu8OmPCPtkz0Fm0h8=.08b61e62-d319-45cd-a752-d31005c23035@github.com> Message-ID: On Thu, 4 Apr 2024 11:22:58 GMT, Ivan Walulya wrote: >> Guoxiong Li has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix order of included files. > > src/hotspot/share/gc/serial/tenuredGeneration.cpp line 28: > >> 26: #include "gc/serial/cardTableRS.hpp" >> 27: #include "gc/serial/serialFullGC.hpp" >> 28: #include "gc/serial/serialBlockOffsetTable.inline.hpp" > > Ordering of includes Fixed. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18619#discussion_r1551507895 From sspitsyn at openjdk.org Thu Apr 4 11:47:59 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 4 Apr 2024 11:47:59 GMT Subject: RFR: 8329432: PopFrame and ForceEarlyReturn functions should use JvmtiHandshake In-Reply-To: <5tcPHZX0nNTHbQqZfHRl2riTpJglQyGJ2hRJXyIMZPY=.4de7ac6d-dd84-4943-bab1-5dba67bf5cf0@github.com> References: <5tcPHZX0nNTHbQqZfHRl2riTpJglQyGJ2hRJXyIMZPY=.4de7ac6d-dd84-4943-bab1-5dba67bf5cf0@github.com> Message-ID: On Tue, 2 Apr 2024 00:22:28 GMT, Serguei Spitsyn wrote: > The internal JVM TI `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes were introduced in JDK 22 to unify/simplify the JVM TI functions supporting the implementation of virtual threads. This enhancement refactors the JVM TI functions `PopFrame` and `ForceEarlyReturn` on top of the `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes. > > Testing: > > Ran mach5 tiers 1-6 Patricio, thank you for the review! ------------- PR Comment: https://git.openjdk.org/jdk/pull/18570#issuecomment-2036948300 From stefank at openjdk.org Thu Apr 4 12:11:10 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 4 Apr 2024 12:11:10 GMT Subject: RFR: 8329655: Cleanup KlassObj and klassOop names after the PermGen removal In-Reply-To: <3E71V_pKhViBwx-i4vc5T6hk8DZnTOImwI4h5fQzgfE=.8e8e4a69-4f56-463b-b508-bdd5672862a6@github.com> References: <3E71V_pKhViBwx-i4vc5T6hk8DZnTOImwI4h5fQzgfE=.8e8e4a69-4f56-463b-b508-bdd5672862a6@github.com> Message-ID: On Thu, 4 Apr 2024 09:55:38 GMT, Roman Kennke wrote: >> We have a few places that use the terms `KlassObj` and `klassOop` when referring to Klasses. This is old code from before the PermGen removal, when Klasses were also Java objects.
>> >> These names tripped me up when I was reading heapInspection.cpp and at first thought we were mixing the klass *mirror* objects and klass pointers in the hash code calculation: >> >> // An aligned reference address (typically the least >> // address in the perm gen) used for hashing klass >> // objects. >> HeapWord* _ref; >> ... >> _ref = (HeapWord*) Universe::boolArrayKlassObj(); >> ... >> uint KlassInfoTable::hash(const Klass* p) { >> return (uint)(((uintptr_t)p - (uintptr_t)_ref) >> 2); >> } >> >> >> I propose that we rename these functions (and stop casting the Klass* to a (HeapWord*)). >> >> Tested with serviceability/dcmd/gc/ClassHistogramTest.java but will run this through our lower tiers. > > src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 1202: > >> 1200: ldrw(scan_temp, Address(recv_klass, Klass::vtable_length_offset())); >> 1201: >> 1202: // %%% Could store the aligned, prescaled offset in the klassoop. > > Unrelated, but what's the point of the %%% in all those comments? Might want to remove that, while you're there. I think it is an old-style TODO. I'm considering if we shouldn't just remove these comments. What do people think about that? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18618#discussion_r1551548890 From stefank at openjdk.org Thu Apr 4 12:18:24 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 4 Apr 2024 12:18:24 GMT Subject: RFR: 8329655: Cleanup KlassObj and klassOop names after the PermGen removal [v2] In-Reply-To: References: Message-ID: > We have a few places that use the terms `KlassObj` and `klassOop` when referring to Klasses. This is old code from before the PermGen removal, when Klasses were also Java objects.
> > These names tripped me up when I was reading the heap heapInspection.cpp and first though we were mixing the klass *mirror* objects and klass pointers in the hash code calculation: > > // An aligned reference address (typically the least > // address in the perm gen) used for hashing klass > // objects. > HeapWord* _ref; > ... > _ref = (HeapWord*) Universe::boolArrayKlassObj(); > ... > uint KlassInfoTable::hash(const Klass* p) { > return (uint)(((uintptr_t)p - (uintptr_t)_ref) >> 2); > } > > > I propose that we rename these functions (and stop casting the Klass* to a (HeapWord*)). > > Tested with serviceability/dcmd/gc/ClassHistogramTest.java but will run this through our lower tiers. Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: Review Roman ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18618/files - new: https://git.openjdk.org/jdk/pull/18618/files/02bcbd89..85f6bbe6 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18618&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18618&range=00-01 Stats: 5 lines in 5 files changed: 0 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/18618.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18618/head:pull/18618 PR: https://git.openjdk.org/jdk/pull/18618 From rkennke at openjdk.org Thu Apr 4 12:18:24 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 4 Apr 2024 12:18:24 GMT Subject: RFR: 8329655: Cleanup KlassObj and klassOop names after the PermGen removal [v2] In-Reply-To: References: <3E71V_pKhViBwx-i4vc5T6hk8DZnTOImwI4h5fQzgfE=.8e8e4a69-4f56-463b-b508-bdd5672862a6@github.com> Message-ID: On Thu, 4 Apr 2024 12:08:23 GMT, Stefan Karlsson wrote: >> src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 1202: >> >>> 1200: ldrw(scan_temp, Address(recv_klass, Klass::vtable_length_offset())); >>> 1201: >>> 1202: // %%% Could store the aligned, prescaled offset in the klassoop. 
>> >> Unrelated, but what's the point of the %%% in all those comments? Might want to remove that, while you're there. > > I think it is an old-style TODO. I'm considering if we shouldn't just remove these comments. What do people think about that? I'm not even sure what they want to say, really. Should be good to remove, and if anybody can make sense of it, record an issue in the bug-tracker? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18618#discussion_r1551557034 From stefank at openjdk.org Thu Apr 4 12:18:25 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 4 Apr 2024 12:18:25 GMT Subject: RFR: 8329655: Cleanup KlassObj and klassOop names after the PermGen removal [v2] In-Reply-To: <3E71V_pKhViBwx-i4vc5T6hk8DZnTOImwI4h5fQzgfE=.8e8e4a69-4f56-463b-b508-bdd5672862a6@github.com> References: <3E71V_pKhViBwx-i4vc5T6hk8DZnTOImwI4h5fQzgfE=.8e8e4a69-4f56-463b-b508-bdd5672862a6@github.com> Message-ID: On Thu, 4 Apr 2024 10:07:11 GMT, Roman Kennke wrote: >> Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: >> >> Review Roman > > src/hotspot/share/memory/heapInspection.cpp line 173: > >> 171: KlassInfoTable::KlassInfoTable(bool add_all_classes) { >> 172: _size_of_instances_in_words = 0; >> 173: _ref = (uintptr_t) Universe::boolArrayKlass(); > > It seems weird (non-obvious) to cast to uintptr_t here. I see it is only used in KlassInfoTable::hash(), which is weird, too. I am not sure that this even does a useful hashing. Might be worth to get rid of the whole thing and use the [fastHash](https://github.com/rkennke/jdk/blob/JDK-8305896/src/hotspot/share/utilities/fastHash.hpp) stuff that @rose00 proposed for Lilliput. Perhaps in a follow-up. I'd probably either cast to void* or Klass*, or cast to uintptr_t as you did and remove the unnecessary cast in ::hash(). I agree. I'll start by removing the redundant cast in `::hash()`. 
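For illustration, a minimal sketch of the direction discussed above, assuming the reference address is stored once as an integer so that the hash function no longer has to cast it. The names `KlassSketch` and `KlassInfoTableSketch` are hypothetical stand-ins and are not the HotSpot classes; only the subtract-and-shift arithmetic is taken from the quoted `KlassInfoTable::hash()`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for HotSpot's Klass: 8 bytes wide, so adjacent
// array elements are exactly 8 bytes apart.
struct alignas(8) KlassSketch { char pad[8]; };

class KlassInfoTableSketch {
  std::uintptr_t _ref;  // reference address, converted to an integer once
public:
  explicit KlassInfoTableSketch(const KlassSketch* ref)
    : _ref(reinterpret_cast<std::uintptr_t>(ref)) {}

  // Same arithmetic as the quoted hash(): distance from the reference
  // address, shifted right by 2. Only the pointer argument needs a cast.
  unsigned hash(const KlassSketch* p) const {
    return static_cast<unsigned>((reinterpret_cast<std::uintptr_t>(p) - _ref) >> 2);
  }
};
```

With two adjacent `KlassSketch` objects and the first one used as the reference, the first hashes to 0 and the second to 8 >> 2 = 2. Whether this is a useful hash at all is the separate question raised in the review.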
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18618#discussion_r1551554904 From stefank at openjdk.org Thu Apr 4 12:22:12 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 4 Apr 2024 12:22:12 GMT Subject: RFR: 8329655: Cleanup KlassObj and klassOop names after the PermGen removal [v2] In-Reply-To: References: <3E71V_pKhViBwx-i4vc5T6hk8DZnTOImwI4h5fQzgfE=.8e8e4a69-4f56-463b-b508-bdd5672862a6@github.com> Message-ID: On Thu, 4 Apr 2024 12:13:21 GMT, Roman Kennke wrote: >> I think it is an old-style TODO. I'm considering if we shouldn't just remove these comments. What do people think about that? > > I'm not even sure what they want to say, really. Should be good to remove, and if anybody can make sense of it, record an issue in the bug-tracker? OK. I removed the %%%. I'll wait a little bit to see if someone else wants to keep them for some reason, if not, I'll remove them. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18618#discussion_r1551568127 From mli at openjdk.org Thu Apr 4 12:37:00 2024 From: mli at openjdk.org (Hamlin Li) Date: Thu, 4 Apr 2024 12:37:00 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF In-Reply-To: References: <0cUurmXlMJ_B66Wy1umd2n4r9ve7_Q4WOU0ffMd8s5Y=.bbc93b65-382c-4139-aaec-cb835d94a06e@github.com> Message-ID: On Wed, 3 Apr 2024 19:23:01 GMT, Magnus Ihse Bursie wrote: > Just a quick question after giving this a glance: My understanding was that the normal libsleef build set a lot of compiler options, e.g. disabling built-in maths etc. You don't seem to set any of these. Have you determined that they were not needed? Thanks for having a look and quick response. Good question. Per `disabling built-in maths`, my understanding is that maybe we don't need to care about it, as this built-in math functions in compilers are only for scalar version, but we're using sleef's simd versions only which use vector intrinsics I think. e.g. 
in `src/libm/sleefdp.c` there is `ENABLE_BUILTIN_MATH` check, but in `src/libm/sleefsimdsp.c` there is no such check, so when generating inline header files, I assume its value (whether enable/disable built-in math) does not impact the generated simd functions. Please correct me if I'm understanding it wrongly. For other compiler options, I tend to agree with you, but I'm not sure which might need, can you supply more information or point to some reference about `normal libsleef build`? BTW, what I refered to before was from sleef.org and sleef on github (including its github workflow). ------------- PR Comment: https://git.openjdk.org/jdk/pull/18605#issuecomment-2037072600 From mbaesken at openjdk.org Thu Apr 4 12:39:16 2024 From: mbaesken at openjdk.org (Matthias Baesken) Date: Thu, 4 Apr 2024 12:39:16 GMT Subject: RFR: JDK-8329605: hs errfile generic events - introduce sections for Frequent/NotFrequent Events Message-ID: <5GN6AKI0ud3DgU7-RX2-12eu87Me8jhzKXA-L8BwR04=.384ddd36-1a8f-40ac-9387-5d8d97c37fe3@github.com> Currently the 'generic' hs_errfile Events message log (filled by Events::log) is rather flooded by messages for memory protection operations. Those seem to occur quite often and move out other less frequent events, because the number of entries in the log is limited. It might be better to separate the frequent and less frequent events into 2 sections. The memory protection events would go into the frequent events section. 
The mentioned memory protection operations related entries look like this : Event: 0.178 Protecting memory [0x000000016ebf0000,0x000000016ebfc000] with protection modes 0 ------------- Commit messages: - JDK-8329605 Changes: https://git.openjdk.org/jdk/pull/18626/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18626&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8329605 Stats: 24 lines in 6 files changed: 16 ins; 0 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/18626.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18626/head:pull/18626 PR: https://git.openjdk.org/jdk/pull/18626 From rkennke at openjdk.org Thu Apr 4 12:42:01 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 4 Apr 2024 12:42:01 GMT Subject: RFR: 8329655: Cleanup KlassObj and klassOop names after the PermGen removal [v2] In-Reply-To: References: Message-ID: On Thu, 4 Apr 2024 12:18:24 GMT, Stefan Karlsson wrote: >> We have a few places that uses the terms `KlassObj` and `klassOop` when referring to Klasses. This is old code from before the PermGen removal, when Klasses also were Java objects. >> >> These names tripped me up when I was reading the heap heapInspection.cpp and first though we were mixing the klass *mirror* objects and klass pointers in the hash code calculation: >> >> // An aligned reference address (typically the least >> // address in the perm gen) used for hashing klass >> // objects. >> HeapWord* _ref; >> ... >> _ref = (HeapWord*) Universe::boolArrayKlassObj(); >> ... >> uint KlassInfoTable::hash(const Klass* p) { >> return (uint)(((uintptr_t)p - (uintptr_t)_ref) >> 2); >> } >> >> >> I propose that we rename these functions (and stop casting the Klass* to a (HeapWord*)). >> >> Tested with serviceability/dcmd/gc/ClassHistogramTest.java but will run this through our lower tiers. > > Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: > > Review Roman Thanks! Looks good to me, now. 
Roman ------------- Marked as reviewed by rkennke (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18618#pullrequestreview-1979767060 From stefank at openjdk.org Thu Apr 4 13:05:09 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 4 Apr 2024 13:05:09 GMT Subject: RFR: JDK-8329605: hs errfile generic events - introduce sections for Frequent/NotFrequent Events In-Reply-To: <5GN6AKI0ud3DgU7-RX2-12eu87Me8jhzKXA-L8BwR04=.384ddd36-1a8f-40ac-9387-5d8d97c37fe3@github.com> References: <5GN6AKI0ud3DgU7-RX2-12eu87Me8jhzKXA-L8BwR04=.384ddd36-1a8f-40ac-9387-5d8d97c37fe3@github.com> Message-ID: <6slsaND3GbbRLB78XSC2T8FcTEDpw3y3MQ8QZWRVYC8=.b1a36386-0aff-40d0-b1a5-7f8315122dfb@github.com> On Thu, 4 Apr 2024 12:34:19 GMT, Matthias Baesken wrote: > Currently the 'generic' hs_errfile Events message log (filled by Events::log) is rather flooded by messages for memory protection operations. Those seem to occur quite often and move out other less frequent events, because the number of entries in the log is limited. > It might be better to separate the frequent and less frequent events into 2 sections. The memory protection events would go into the frequent events section. > The mentioned memory protection operations related entries look like this : > Event: 0.178 Protecting memory [0x000000016ebf0000,0x000000016ebfc000] with protection modes 0 We still have flooding in the frequent events. Instead of creating a common section for these events, did you consider creating two new separate, specific sections for memory protection and nmethod flushing?
------------- PR Comment: https://git.openjdk.org/jdk/pull/18626#issuecomment-2037157291 From mbaesken at openjdk.org Thu Apr 4 13:17:11 2024 From: mbaesken at openjdk.org (Matthias Baesken) Date: Thu, 4 Apr 2024 13:17:11 GMT Subject: RFR: JDK-8329605: hs errfile generic events - introduce sections for Frequent/NotFrequent Events In-Reply-To: <6slsaND3GbbRLB78XSC2T8FcTEDpw3y3MQ8QZWRVYC8=.b1a36386-0aff-40d0-b1a5-7f8315122dfb@github.com> References: <5GN6AKI0ud3DgU7-RX2-12eu87Me8jhzKXA-L8BwR04=.384ddd36-1a8f-40ac-9387-5d8d97c37fe3@github.com> <6slsaND3GbbRLB78XSC2T8FcTEDpw3y3MQ8QZWRVYC8=.b1a36386-0aff-40d0-b1a5-7f8315122dfb@github.com> Message-ID: On Thu, 4 Apr 2024 13:02:43 GMT, Stefan Karlsson wrote: > We still have flooding in the frequent events. Instead of creating a common section for these events, did you consider creating two new separate, specific sections for memory protection and nmethod flushing? I thought about creating new specific sections. Regarding those 2, maybe they are a bit too specific? But on the other hand, if others like this idea, I am fine with it (creating sections for memory protection operations and for nmethod flushing). ------------- PR Comment: https://git.openjdk.org/jdk/pull/18626#issuecomment-2037187521 From coleenp at openjdk.org Thu Apr 4 13:19:14 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 4 Apr 2024 13:19:14 GMT Subject: RFR: 8313332: Simplify lazy jmethodID cache in InstanceKlass [v4] In-Reply-To: References: Message-ID: On Thu, 4 Apr 2024 00:07:34 GMT, Coleen Phillimore wrote: >> This change simplifies the code that grows the jmethodID cache in InstanceKlass. Instead of lazily, when there's a rare request for a jmethodID for an obsolete method, the jmethodID cache is grown during the RedefineClasses safepoint. The InstanceKlass's jmethodID cache is lazily allocated when there's a jmethodID allocated, so not every InstanceKlass has a cache, but the growth now only happens in a safepoint.
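To make the flooding concern concrete, here is a small sketch of a bounded log in which a burst of frequent entries evicts a single rare entry, and of how giving the frequent entries their own log avoids that. `BoundedLogSketch` and both helper functions are hypothetical and written only for this illustration; they are not the HotSpot `Events` code, and the capacity is assumed to be greater than zero.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical bounded log that keeps only the most recent `capacity`
// entries (capacity must be > 0). Not HotSpot's event log implementation.
class BoundedLogSketch {
  std::vector<std::string> _entries;
  std::size_t _capacity;
  std::size_t _next = 0;   // slot that the next entry overwrites
  std::size_t _count = 0;  // number of valid entries, at most _capacity
public:
  explicit BoundedLogSketch(std::size_t capacity)
    : _entries(capacity), _capacity(capacity) {}

  void log(const std::string& msg) {
    _entries[_next] = msg;
    _next = (_next + 1) % _capacity;
    if (_count < _capacity) _count++;
  }

  bool contains(const std::string& msg) const {
    for (std::size_t i = 0; i < _count; i++) {
      if (_entries[i] == msg) return true;
    }
    return false;
  }
};

// One shared log: the rare entry competes with the frequent ones for slots.
inline bool rare_event_survives_shared(std::size_t capacity, std::size_t frequent) {
  BoundedLogSketch shared(capacity);
  shared.log("rare event");
  for (std::size_t i = 0; i < frequent; i++) shared.log("Protecting memory");
  return shared.contains("rare event");
}

// Separate logs: frequent entries can only evict each other.
inline bool rare_event_survives_separate(std::size_t capacity, std::size_t frequent) {
  BoundedLogSketch generic(capacity);
  BoundedLogSketch protection(capacity);
  generic.log("rare event");
  for (std::size_t i = 0; i < frequent; i++) protection.log("Protecting memory");
  return generic.contains("rare event");
}
```

With a capacity of 4 and 10 frequent entries, the rare entry is lost from the shared log but kept when the logs are separate. The same reasoning applies to any two frequent event kinds that share one section, which is the point of the question above.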
This code will become racy with the potential change for deallocating jmethodIDs. >> >> Tested with tier1-4, vmTestbase/nsk/jvmti java/lang/instrument tests (in case they're not in tier1-4). > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Two more. I've run through tiers 5-7 with the patch now. Thank you for the reviews Alex, Serguei and Dan. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18549#issuecomment-2037185846 From coleenp at openjdk.org Thu Apr 4 13:19:14 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 4 Apr 2024 13:19:14 GMT Subject: Integrated: 8313332: Simplify lazy jmethodID cache in InstanceKlass In-Reply-To: References: Message-ID: On Fri, 29 Mar 2024 15:25:48 GMT, Coleen Phillimore wrote: > This change simplifies the code that grows the jmethodID cache in InstanceKlass. Instead of lazily, when there's a rare request for a jmethodID for an obsolete method, the jmethodID cache is grown during the RedefineClasses safepoint. The InstanceKlass's jmethodID cache is lazily allocated when there's a jmethodID allocated, so not every InstanceKlass has a cache, but the growth now only happens in a safepoint. This code will become racy with the potential change for deallocating jmethodIDs. > > Tested with tier1-4, vmTestbase/nsk/jvmti java/lang/instrument tests (in case they're not in tier1-4). This pull request has now been integrated. 
Changeset: 21867c92 Author: Coleen Phillimore URL: https://git.openjdk.org/jdk/commit/21867c929a2f2c961148f2cd1e79d672ac278d27 Stats: 229 lines in 7 files changed: 44 ins; 151 del; 34 mod 8313332: Simplify lazy jmethodID cache in InstanceKlass Reviewed-by: amenkov, sspitsyn, dcubed ------------- PR: https://git.openjdk.org/jdk/pull/18549 From coleenp at openjdk.org Thu Apr 4 13:34:03 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 4 Apr 2024 13:34:03 GMT Subject: RFR: 8329655: Cleanup KlassObj and klassOop names after the PermGen removal [v2] In-Reply-To: References: Message-ID: On Thu, 4 Apr 2024 12:18:24 GMT, Stefan Karlsson wrote: >> We have a few places that uses the terms `KlassObj` and `klassOop` when referring to Klasses. This is old code from before the PermGen removal, when Klasses also were Java objects. >> >> These names tripped me up when I was reading the heap heapInspection.cpp and first though we were mixing the klass *mirror* objects and klass pointers in the hash code calculation: >> >> // An aligned reference address (typically the least >> // address in the perm gen) used for hashing klass >> // objects. >> HeapWord* _ref; >> ... >> _ref = (HeapWord*) Universe::boolArrayKlassObj(); >> ... >> uint KlassInfoTable::hash(const Klass* p) { >> return (uint)(((uintptr_t)p - (uintptr_t)_ref) >> 2); >> } >> >> >> I propose that we rename these functions (and stop casting the Klass* to a (HeapWord*)). >> >> Tested with serviceability/dcmd/gc/ClassHistogramTest.java but will run this through our lower tiers. > > Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: > > Review Roman This is good. The Obj was confusing. src/hotspot/share/memory/heapInspection.hpp line 111: > 109: > 110: // An aligned reference address (typically the least > 111: // address in the perm gen) used for hashing klass Rats I missed this. ------------- Marked as reviewed by coleenp (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18618#pullrequestreview-1979901207 PR Review Comment: https://git.openjdk.org/jdk/pull/18618#discussion_r1551690175 From coleenp at openjdk.org Thu Apr 4 13:34:03 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 4 Apr 2024 13:34:03 GMT Subject: RFR: 8329655: Cleanup KlassObj and klassOop names after the PermGen removal [v2] In-Reply-To: References: <3E71V_pKhViBwx-i4vc5T6hk8DZnTOImwI4h5fQzgfE=.8e8e4a69-4f56-463b-b508-bdd5672862a6@github.com> Message-ID: On Thu, 4 Apr 2024 12:19:03 GMT, Stefan Karlsson wrote: >> I'm not even sure what they want to say, really. Should be good to remove, and if anybody can make sense of it, record an issue in the bug-tracker? > > OK. I removed the %%%. I'll wait a little bit to see if someone else wants to keep them for some reason, if not, I'll remove them. I think leaving these comments without the %%% seems fine. Describing this idea in a CR is a lot more difficult than seeing it in context as commentary, and unless the enhancement has other motivation, it won't be picked up. Leaving the comment as a clue seems useful. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18618#discussion_r1551686232 From mli at openjdk.org Thu Apr 4 13:53:22 2024 From: mli at openjdk.org (Hamlin Li) Date: Thu, 4 Apr 2024 13:53:22 GMT Subject: RFR: 8329083: RISC-V: Update profiles supported on riscv [v2] In-Reply-To: References: Message-ID: <1zLJ4ekqbB9t_8o4SvCuEsHqpeF2oa0I9v1PCEs1bow=.33fe1276-eb19-4a41-a555-eef6369d4144@github.com> > Hi, > Can you help to review this patch to update vm flags related to riscv profile? > Thanks > > Currently there are vm options like -XX:+UseRVA20U64 and -XX:+UseRVA22U64 on riscv to indicate the supported riscv extension via profiles. > These profiles should be updated to reflect the full supported extensions and new profile like UseRVA23U64 should be added too. 
Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: remove optional extensions ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18599/files - new: https://git.openjdk.org/jdk/pull/18599/files/d55e7644..bd2481ed Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18599&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18599&range=00-01 Stats: 8 lines in 2 files changed: 2 ins; 6 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18599.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18599/head:pull/18599 PR: https://git.openjdk.org/jdk/pull/18599 From sspitsyn at openjdk.org Thu Apr 4 15:33:32 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 4 Apr 2024 15:33:32 GMT Subject: RFR: 8329674: JvmtiEnvThreadState::reset_current_location function should use JvmtiHandshake Message-ID: The internal JVM TI JvmtiHandshake and JvmtiUnitedHandshakeClosure classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. This enhancement is to refactor the JVM TI internal functions JvmtiEnvThreadState::reset_current_location on the base of JvmtiHandshake and JvmtiUnitedHandshakeClosure classes. 
Testing: - Ran mach5 tiers 1-6 ------------- Commit messages: - 8329674: JvmtiEnvThreadState::reset_current_location function should use JvmtiHandshake Changes: https://git.openjdk.org/jdk/pull/18630/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18630&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8329674 Stats: 108 lines in 2 files changed: 35 ins; 67 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/18630.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18630/head:pull/18630 PR: https://git.openjdk.org/jdk/pull/18630 From kbarrett at openjdk.org Thu Apr 4 15:41:09 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 4 Apr 2024 15:41:09 GMT Subject: RFR: 8329261: G1: interpreter post-barrier x86 code asserts index size of wrong buffer In-Reply-To: References: Message-ID: On Thu, 4 Apr 2024 08:36:37 GMT, Roberto Castañeda Lozano wrote: > This changeset updates an assert in G1's interpreter x86 post-barrier logic so that it refers to the right queue (`G1DirtyCardQueue` rather than pre-barrier's `SATBMarkQueue`) and moves the assert closer to the logic that exploits it. > > Thanks to Kim Barrett for reporting the issue and suggesting the fix. > > **Testing**: built on windows-x64, linux-x64, and macosx-x64. Looks good. > > Looks good. > > Thanks for reviewing, Axel! > > I agree, the whole `g1BarrierSetAssembler_x86.cpp` file would probably benefit from some refactoring to enforce consistency across different barrier implementations. But I think it would be best to postpone that cleanup to after [JDK-8322295](https://bugs.openjdk.org/browse/JDK-8322295) is implemented. Given that I'm actively prototyping another change on top of the late barrier work, I'd really appreciate not having the g1BarrierSetAssembler files unnecessarily refactored any more for now. ------------- Marked as reviewed by kbarrett (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/18616#pullrequestreview-1980330434 PR Comment: https://git.openjdk.org/jdk/pull/18616#issuecomment-2037546689 From kvn at openjdk.org Thu Apr 4 15:59:12 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 4 Apr 2024 15:59:12 GMT Subject: RFR: 8329332: Remove CompiledMethod and CodeBlobLayout classes [v3] In-Reply-To: References: <2sg6I-HBI12rc2LoWYX-A1S5vfMfDyj_5xoykANrZ8g=.6d0e5daa-30e4-45df-990e-c45b63477182@github.com> Message-ID: On Thu, 4 Apr 2024 07:26:21 GMT, Stefan Karlsson wrote: >> Vladimir Kozlov has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: >> >> - Address comments >> - Merge branch 'master' into 8329332 >> - Removed not_used state of nmethod >> - remove trailing whitespace >> - 8329332: Remove CompiledMethod and CodeBlobLayout classes > > src/hotspot/share/gc/shared/gcBehaviours.hpp line 31: > >> 29: #include "oops/oopsHierarchy.hpp" >> 30: >> 31: // This is the behaviour for checking if a nmethod is unloading > > Maybe this should be *an* nmethod? Quote: "an" goes before words that begin with vowels. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1551989120 From kvn at openjdk.org Thu Apr 4 16:02:12 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 4 Apr 2024 16:02:12 GMT Subject: RFR: 8329332: Remove CompiledMethod and CodeBlobLayout classes [v3] In-Reply-To: References: <2sg6I-HBI12rc2LoWYX-A1S5vfMfDyj_5xoykANrZ8g=.6d0e5daa-30e4-45df-990e-c45b63477182@github.com> Message-ID: On Thu, 4 Apr 2024 07:31:24 GMT, Stefan Karlsson wrote: >> Vladimir Kozlov has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains five additional commits since the last revision: >> >> - Address comments >> - Merge branch 'master' into 8329332 >> - Removed not_used state of nmethod >> - remove trailing whitespace >> - 8329332: Remove CompiledMethod and CodeBlobLayout classes > > src/hotspot/share/gc/x/xUnload.cpp line 78: > >> 76: class XIsUnloadingBehaviour : public IsUnloadingBehaviour { >> 77: public: >> 78: virtual bool has_dead_oop(nmethod* const nm) const { > > `nmethod* const nm` => `nmethod* nm`. (ZGC's style is to use const for local variables, but not for variables in the parameter list). The same applies to the rest of the changes to this file. Okay. I did not know that it is only used for locals. I will update code. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1551997411 From stefank at openjdk.org Thu Apr 4 16:06:14 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 4 Apr 2024 16:06:14 GMT Subject: RFR: 8329332: Remove CompiledMethod and CodeBlobLayout classes [v3] In-Reply-To: References: <2sg6I-HBI12rc2LoWYX-A1S5vfMfDyj_5xoykANrZ8g=.6d0e5daa-30e4-45df-990e-c45b63477182@github.com> Message-ID: On Thu, 4 Apr 2024 15:56:34 GMT, Vladimir Kozlov wrote: >> src/hotspot/share/gc/shared/gcBehaviours.hpp line 31: >> >>> 29: #include "oops/oopsHierarchy.hpp" >>> 30: >>> 31: // This is the behaviour for checking if a nmethod is unloading >> >> Maybe this should be *an* nmethod? > > Quote: "an" goes before words that begin with vowels. I don't think that holds if the 'n' is pronounced the way nmethod is pronounced. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1552005187 From kvn at openjdk.org Thu Apr 4 16:09:02 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 4 Apr 2024 16:09:02 GMT Subject: RFR: 8329332: Remove CompiledMethod and CodeBlobLayout classes [v3] In-Reply-To: References: <2sg6I-HBI12rc2LoWYX-A1S5vfMfDyj_5xoykANrZ8g=.6d0e5daa-30e4-45df-990e-c45b63477182@github.com> Message-ID: On Thu, 4 Apr 2024 07:51:47 GMT, Stefan Karlsson wrote: >> Vladimir Kozlov has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: >> >> - Address comments >> - Merge branch 'master' into 8329332 >> - Removed not_used state of nmethod >> - remove trailing whitespace >> - 8329332: Remove CompiledMethod and CodeBlobLayout classes > > src/hotspot/share/code/codeBlob.hpp line 168: > >> 166: bool is_vtable_blob() const { return _kind == CodeBlobKind::Blob_Vtable; } >> 167: bool is_method_handles_adapter_blob() const { return _kind == CodeBlobKind::Blob_MH_Adapter; } >> 168: bool is_upcall_stub() const { return _kind == CodeBlobKind::Blob_Upcall; } > > The `Blob_` prefix is now redundant since we always have to prefix with CodeBlobKind::. Just a suggestion if you want to shorten these. 
Good suggestion ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1552009581 From kvn at openjdk.org Thu Apr 4 16:20:10 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 4 Apr 2024 16:20:10 GMT Subject: RFR: 8329332: Remove CompiledMethod and CodeBlobLayout classes [v3] In-Reply-To: References: <2sg6I-HBI12rc2LoWYX-A1S5vfMfDyj_5xoykANrZ8g=.6d0e5daa-30e4-45df-990e-c45b63477182@github.com> Message-ID: <9Lk2-DK1nYNPIyXGbhqsr2DfsaR8mQsD9qEevogrW-U=.036ec57b-5fd4-4711-a781-6139f58d419f@github.com> On Thu, 4 Apr 2024 16:03:12 GMT, Stefan Karlsson wrote: >> Quote: "an" goes before words that begin with vowels. > > I don't think that holds if the 'n' is pronounced the way nmethod is pronounced. `grep` shows that we have both cases but `an nmethod` is used more. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1552024067 From kvn at openjdk.org Thu Apr 4 16:20:10 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 4 Apr 2024 16:20:10 GMT Subject: RFR: 8329332: Remove CompiledMethod and CodeBlobLayout classes [v3] In-Reply-To: <9Lk2-DK1nYNPIyXGbhqsr2DfsaR8mQsD9qEevogrW-U=.036ec57b-5fd4-4711-a781-6139f58d419f@github.com> References: <2sg6I-HBI12rc2LoWYX-A1S5vfMfDyj_5xoykANrZ8g=.6d0e5daa-30e4-45df-990e-c45b63477182@github.com> <9Lk2-DK1nYNPIyXGbhqsr2DfsaR8mQsD9qEevogrW-U=.036ec57b-5fd4-4711-a781-6139f58d419f@github.com> Message-ID: On Thu, 4 Apr 2024 16:16:41 GMT, Vladimir Kozlov wrote: >> I don't think that holds if the 'n' is pronounced the way nmethod is pronounced. > > `grep` shows that we have both cases but `an nmethod` is used more. I will fix it here as you suggested but I am not touching other places. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18554#discussion_r1552025626 From kbarrett at openjdk.org Thu Apr 4 16:48:12 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 4 Apr 2024 16:48:12 GMT Subject: RFR: 8329546: Assume sized integral types are available In-Reply-To: References: Message-ID: On Tue, 2 Apr 2024 20:32:51 GMT, Ioi Lam wrote: >> Please review this change that cleans up the inclusion of and >> when using gcc/clang as the compiler. >> >> Testing: mach5 tier1 > > Looks reasonable. Thanks for reviews @iklam and @TheShermanTanker . ------------- PR Comment: https://git.openjdk.org/jdk/pull/18586#issuecomment-2037698971 From kbarrett at openjdk.org Thu Apr 4 16:48:12 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 4 Apr 2024 16:48:12 GMT Subject: Integrated: 8329546: Assume sized integral types are available In-Reply-To: References: Message-ID: <2sgfVHxxRO4CqwS8UZ1rL7t6hNWbDGQ23vh7X27pgs0=.9e19d1fd-171c-4166-8417-f2e277950330@github.com> On Tue, 2 Apr 2024 20:09:51 GMT, Kim Barrett wrote: > Please review this change that cleans up the inclusion of and > when using gcc/clang as the compiler. > > Testing: mach5 tier1 This pull request has now been integrated. 
Changeset: d90e5b5b Author: Kim Barrett URL: https://git.openjdk.org/jdk/commit/d90e5b5b9f235cfcfc635d107e8d73cd2ce35f51 Stats: 26 lines in 1 file changed: 2 ins; 23 del; 1 mod 8329546: Assume sized integral types are available Reviewed-by: iklam, jwaters ------------- PR: https://git.openjdk.org/jdk/pull/18586 From ihse at openjdk.org Thu Apr 4 16:50:08 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Thu, 4 Apr 2024 16:50:08 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF In-Reply-To: <0cUurmXlMJ_B66Wy1umd2n4r9ve7_Q4WOU0ffMd8s5Y=.bbc93b65-382c-4139-aaec-cb835d94a06e@github.com> References: <0cUurmXlMJ_B66Wy1umd2n4r9ve7_Q4WOU0ffMd8s5Y=.bbc93b65-382c-4139-aaec-cb835d94a06e@github.com> Message-ID: On Wed, 3 Apr 2024 14:40:42 GMT, Hamlin Li wrote: > Hi, > Can you help to review the patch? > This pr is based on previous work and discussion in [pr 16234](https://github.com/openjdk/jdk/pull/16234), [pr 18294](https://github.com/openjdk/jdk/pull/18294). > > Compared with previous prs, the major change in this pr is to integrate the source of sleef (for the steps, please check `src/jdk.incubator.vector/linux/native/libvectormath/README`), rather than depends on external sleef things (header or lib) at build or run time. > Besides of this change, also modify the previous changes accordingly, e.g. remove some uncessary files or changes especially in make dir of jdk. > > Besides of the code changes, one important task is to handle the legal process. > > Thanks! Build libsleef using their cmake system and look at the compile command line. (You do this by `VERBOSE=1 cmake` IIRC). Then you can see what flags they are using. This is what I was referring to as "normal libsleef build". I noticed there were a lot of compiler flags. I can't say if they are needed or not. 
In most cases, if it compiles, it's fine, but in this case, I guess some flags can be crucial to really get the kind of performance you need, and it might not be easy to spot that something is wrong if you get them incorrect. I assume one way to make sure is to run microbenchmarks with an externally built libsleef and compare it with the one you build within the JDK. If there is no noticeable difference, then I guess it is fine. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18605#issuecomment-2037703774 From gli at openjdk.org Thu Apr 4 16:59:21 2024 From: gli at openjdk.org (Guoxiong Li) Date: Thu, 4 Apr 2024 16:59:21 GMT Subject: RFR: 8329603: G1: Merge G1BlockOffsetTablePart into G1BlockOffsetTable Message-ID: Hi all, This patch merges `G1BlockOffsetTablePart` into `G1BlockOffsetTable`. The previous fields `_reserved` and `_offset_base` of `G1BlockOffsetTable` are marked as `static` so that they can be shared by BOTs of all the heap regions. The tests `make test-tier1_gc` passed locally. Thanks for taking the time to review.
Best Regards, -- Guoxiong ------------- Commit messages: - JDK-8329603 Changes: https://git.openjdk.org/jdk/pull/18634/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18634&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8329603 Stats: 165 lines in 10 files changed: 31 ins; 56 del; 78 mod Patch: https://git.openjdk.org/jdk/pull/18634.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18634/head:pull/18634 PR: https://git.openjdk.org/jdk/pull/18634 From kvn at openjdk.org Thu Apr 4 17:04:30 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 4 Apr 2024 17:04:30 GMT Subject: RFR: 8329332: Remove CompiledMethod and CodeBlobLayout classes [v4] In-Reply-To: References: Message-ID: <2t_Et7WG-YB8Jvu9c3JIByOUM59BUo3DhSORCYFBZbY=.f808def0-15f1-4caa-aa3d-2b9b998b459f@github.com> > Revert [JDK-8152664](https://bugs.openjdk.org/browse/JDK-8152664) RFE [changes](https://github.com/openjdk/jdk/commit/b853eb7f5ca24eeeda18acbb14287f706499c365) which was used for AOT [JEP 295](https://openjdk.org/jeps/295) implementation in JDK 9. The code was left in HotSpot assuming it will help in a future. But during work on Leyden we decided to not use it. In Leyden cached compiled code will be restored in CodeCache as normal nmethods: no need to change VM's runtime and GC code to process them. > > I may work on optimizing `CodeBlob` and `nmethod` fields layout to reduce header size in separate changes. In these changes I did simple fields reordering to keep small (1 byte) fields together. > > I do not see (and not expected) performance difference with these changes. > > Tested tier1-5, xcomp, stress. Running performance testing. > > I need help with testing on platforms which Oracle does not support. 
Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: Address next round of comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18554/files - new: https://git.openjdk.org/jdk/pull/18554/files/33768fb2..0c18ff17 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18554&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18554&range=02-03 Stats: 80 lines in 13 files changed: 0 ins; 0 del; 80 mod Patch: https://git.openjdk.org/jdk/pull/18554.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18554/head:pull/18554 PR: https://git.openjdk.org/jdk/pull/18554 From kvn at openjdk.org Thu Apr 4 17:04:31 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 4 Apr 2024 17:04:31 GMT Subject: RFR: 8329332: Remove CompiledMethod and CodeBlobLayout classes [v3] In-Reply-To: References: <2sg6I-HBI12rc2LoWYX-A1S5vfMfDyj_5xoykANrZ8g=.6d0e5daa-30e4-45df-990e-c45b63477182@github.com> Message-ID: <3IuWX7uxBB3NeXD5nJiBZbh7jr4cxE6pkPr0TXemwag=.b4ad78d3-02eb-435e-a92e-13662779618e@github.com> On Thu, 4 Apr 2024 07:54:16 GMT, Stefan Karlsson wrote: >> Vladimir Kozlov has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: >> >> - Address comments >> - Merge branch 'master' into 8329332 >> - Removed not_used state of nmethod >> - remove trailing whitespace >> - 8329332: Remove CompiledMethod and CodeBlobLayout classes > > I took a second pass over the changes. I've given a few suggestions below. None of them should require respinning of tests (except for making sure that this still builds). Thank you, @stefank , for second round of review. I addressed all your comments. I also removed `virtual` from `virtual method() override;` as you suggested before since I touched these files again. 
I will wait for the results of the new GHA and tier1 runs in mach5 before integration. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18554#issuecomment-2037731476 From stefank at openjdk.org Thu Apr 4 17:34:10 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 4 Apr 2024 17:34:10 GMT Subject: RFR: 8329332: Remove CompiledMethod and CodeBlobLayout classes [v4] In-Reply-To: <2t_Et7WG-YB8Jvu9c3JIByOUM59BUo3DhSORCYFBZbY=.f808def0-15f1-4caa-aa3d-2b9b998b459f@github.com> References: <2t_Et7WG-YB8Jvu9c3JIByOUM59BUo3DhSORCYFBZbY=.f808def0-15f1-4caa-aa3d-2b9b998b459f@github.com> Message-ID: On Thu, 4 Apr 2024 17:04:30 GMT, Vladimir Kozlov wrote: >> Revert [JDK-8152664](https://bugs.openjdk.org/browse/JDK-8152664) RFE [changes](https://github.com/openjdk/jdk/commit/b853eb7f5ca24eeeda18acbb14287f706499c365) which was used for AOT [JEP 295](https://openjdk.org/jeps/295) implementation in JDK 9. The code was left in HotSpot assuming it will help in a future. But during work on Leyden we decided to not use it. In Leyden cached compiled code will be restored in CodeCache as normal nmethods: no need to change VM's runtime and GC code to process them. >> >> I may work on optimizing `CodeBlob` and `nmethod` fields layout to reduce header size in separate changes. In these changes I did simple fields reordering to keep small (1 byte) fields together. >> >> I do not see (and not expected) performance difference with these changes. >> >> Tested tier1-5, xcomp, stress. Running performance testing. >> >> I need help with testing on platforms which Oracle does not support. > > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Address next round of comments Looks good. ------------- Marked as reviewed by stefank (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/18554#pullrequestreview-1980693606 From ayang at openjdk.org Thu Apr 4 17:37:09 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 4 Apr 2024 17:37:09 GMT Subject: RFR: 8329603: G1: Merge G1BlockOffsetTablePart into G1BlockOffsetTable In-Reply-To: References: Message-ID: On Thu, 4 Apr 2024 16:55:49 GMT, Guoxiong Li wrote: > Hi all, > > This patch merges `G1BlockOffsetTablePart` into `G1BlockOffsetTable`. The previous fields `_reserved` and `_offset_base` of `G1BlockOffsetTable` are marked as `static` so that they can be shared by BOTs of all the heap regions. > > The tests `make test-tier1_gc` passed locally. Thanks for taking the time to review. > > Best Regards, > -- Guoxiong src/hotspot/share/gc/g1/g1BlockOffsetTable.hpp line 56: > 54: > 55: // The region that owns this BOT. > 56: HeapRegion* _hr; Is it possible to have a single bot in `G1CollectedHeap` and every region has a pointer to that? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18634#discussion_r1552135704 From sgibbons at openjdk.org Thu Apr 4 19:06:10 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Thu, 4 Apr 2024 19:06:10 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v2] In-Reply-To: <4ECS4yQ0YXQVSt352CQhkQ4dax4VBYv6ZXzK9eBIio0=.baabfef1-2b9e-4b2f-ba0f-e358f6a83af1@github.com> References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> <4ECS4yQ0YXQVSt352CQhkQ4dax4VBYv6ZXzK9eBIio0=.baabfef1-2b9e-4b2f-ba0f-e358f6a83af1@github.com> Message-ID: On Tue, 2 Apr 2024 08:09:44 GMT, Doug Simon wrote: > Wouldn't it be better to do this intrinsification directly in the JIT without calling out to a stub? I believe the code size is too large for a direct JIT intrinsic. A lot of registers are also used, which may be an issue. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18555#issuecomment-2037992868 From sgibbons at openjdk.org Thu Apr 4 19:06:11 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Thu, 4 Apr 2024 19:06:11 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v2] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: On Tue, 2 Apr 2024 02:16:07 GMT, David Holmes wrote: >> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: >> >> Use non-sse fill (old left in) > > This looks like it is still a Draft/work-in-progress. There is only code for x64 and it doesn't appear it will build on other platforms. Also there are still a bunch of `if 0` in the code that should not be there. @dholmes-ora Sorry for the dead code left in. It is gone now. Plus, this was only requested for x86, thus no implementation for other platforms. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18555#issuecomment-2037994898 From kvn at openjdk.org Thu Apr 4 19:26:11 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 4 Apr 2024 19:26:11 GMT Subject: RFR: 8329332: Remove CompiledMethod and CodeBlobLayout classes [v4] In-Reply-To: References: <2t_Et7WG-YB8Jvu9c3JIByOUM59BUo3DhSORCYFBZbY=.f808def0-15f1-4caa-aa3d-2b9b998b459f@github.com> Message-ID: On Thu, 4 Apr 2024 17:31:53 GMT, Stefan Karlsson wrote: >> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Address next round of comments > > Looks good. Thank you, @stefank, @iwanowww, @dean-long and @xmas92 for reviews and @RealFYang and @offamitkumar for testing. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18554#issuecomment-2038043339 From pchilanomate at openjdk.org Thu Apr 4 19:49:30 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Thu, 4 Apr 2024 19:49:30 GMT Subject: RFR: 8329665: fatal error: memory leak: allocating without ResourceMark Message-ID: There are two places in Loom code that call f.oops_interpreted_do() to process oops in the stackChunk. Although it is not obvious, this method seems to require a ResourceMark in scope, and there are several contexts in which these two call sites are reached without one. A ResourceMark is needed because OopMapCache::compute_one_oop_map() might allocate from the resource area if _mask_size is > 4 * BitsPerWord, which depends on the number of locals plus the expression stack size of the corresponding method. But ~InterpreterOopMap already checks whether the _bit_mask was allocated in the resource area and, in that case, frees it. So the ResourceMark is not strictly needed, except that in debug mode we will actually hit the assert if there is not one in scope when trying to allocate the _bit_mask.
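The pattern described in the message above can be sketched with a small, self-contained C++ toy. This is not HotSpot code: `ToyArena`, `ToyMark`, `toy_compute_mask` and `toy_walk_frame` are hypothetical names made up for illustration, and the only detail taken from the message is that small masks need no arena allocation while larger ones do. The arena counts the marks in scope and flags any allocation made without one, which loosely mirrors the debug-only assert.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a per-thread resource area (hypothetical, not HotSpot's).
struct ToyArena {
  std::vector<unsigned char> storage;  // bump-allocated bytes
  int active_marks = 0;                // number of ToyMark objects in scope
  bool leak_reported = false;          // set when allocating with no mark in scope

  std::size_t allocate(std::size_t bytes) {
    if (active_marks == 0) leak_reported = true;  // models the debug-only check
    std::size_t offset = storage.size();
    storage.resize(offset + bytes);
    return offset;
  }
};

// RAII mark: remembers the arena top and rolls back to it on scope exit.
class ToyMark {
  ToyArena& _arena;
  std::size_t _saved_top;
 public:
  explicit ToyMark(ToyArena& arena)
      : _arena(arena), _saved_top(arena.storage.size()) {
    ++_arena.active_marks;
  }
  ~ToyMark() {
    _arena.storage.resize(_saved_top);
    --_arena.active_marks;
  }
  ToyMark(const ToyMark&) = delete;
  ToyMark& operator=(const ToyMark&) = delete;
};

// Masks of up to kInlineWords words fit in an inline buffer (assumed threshold);
// larger masks fall back to the arena.
constexpr std::size_t kInlineWords = 4;

// Returns true if the mask had to be allocated from the arena.
bool toy_compute_mask(ToyArena& arena, std::size_t mask_words) {
  if (mask_words <= kInlineWords) return false;
  arena.allocate(mask_words * sizeof(unsigned long));
  return true;
}

// Returns true if the arena flagged an allocation made without a mark in scope.
bool toy_walk_frame(ToyArena& arena, std::size_t mask_words, bool use_mark) {
  arena.leak_reported = false;
  if (use_mark) {
    ToyMark mark(arena);
    toy_compute_mask(arena, mask_words);
  } else {
    toy_compute_mask(arena, mask_words);
  }
  return arena.leak_reported;
}
```

In this model a frame with a small mask never trips the check, which is consistent with the problem only showing up for methods with many locals or a deep expression stack.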
Thanks, Patricio ------------- Commit messages: - v1 Changes: https://git.openjdk.org/jdk/pull/18632/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18632&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8329665 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18632.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18632/head:pull/18632 PR: https://git.openjdk.org/jdk/pull/18632 From kvn at openjdk.org Thu Apr 4 19:52:16 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 4 Apr 2024 19:52:16 GMT Subject: Integrated: 8329332: Remove CompiledMethod and CodeBlobLayout classes In-Reply-To: References: Message-ID: On Fri, 29 Mar 2024 19:35:45 GMT, Vladimir Kozlov wrote: > Revert [JDK-8152664](https://bugs.openjdk.org/browse/JDK-8152664) RFE [changes](https://github.com/openjdk/jdk/commit/b853eb7f5ca24eeeda18acbb14287f706499c365) which was used for AOT [JEP 295](https://openjdk.org/jeps/295) implementation in JDK 9. The code was left in HotSpot assuming it will help in a future. But during work on Leyden we decided to not use it. In Leyden cached compiled code will be restored in CodeCache as normal nmethods: no need to change VM's runtime and GC code to process them. > > I may work on optimizing `CodeBlob` and `nmethod` fields layout to reduce header size in separate changes. In these changes I did simple fields reordering to keep small (1 byte) fields together. > > I do not see (and not expected) performance difference with these changes. > > Tested tier1-5, xcomp, stress. Running performance testing. > > I need help with testing on platforms which Oracle does not support. This pull request has now been integrated. 
Changeset: 83eba863 Author: Vladimir Kozlov URL: https://git.openjdk.org/jdk/commit/83eba863fec5ee7e30c4f9b11122ad1deed3d2ec Stats: 3941 lines in 119 files changed: 1287 ins; 1753 del; 901 mod 8329332: Remove CompiledMethod and CodeBlobLayout classes Reviewed-by: vlivanov, stefank ------------- PR: https://git.openjdk.org/jdk/pull/18554 From dcubed at openjdk.org Thu Apr 4 20:51:09 2024 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Thu, 4 Apr 2024 20:51:09 GMT Subject: RFR: 8325303: Replace markWord.is_neutral() with markWord.is_unlocked() In-Reply-To: References: Message-ID: On Wed, 7 Feb 2024 01:52:00 GMT, David Holmes wrote: > This is a small tidy up to try and remove confusion between checking `is_neutral` (a general state normally associated with a displaced markword in a "pristine" state) and `is_unlocked` (a specific state within the locking protocol). The underlying bit-pattern is the same and so these have been used somewhat synonymously/interchangeably. > > A few comment tweaks too. > > Testing: tiers 1-3 (sanity) > > Thanks. Thumbs up. I still found nine uses of `is_neutral()` in src/hotspot/share/runtime/synchronizer.cpp. I suspect you left these alone because they are all associated with displaced mark words, AKA: dmw/dmh/mark/temp/test. There are just too many freaking names for the same concept of the "displaced mark word". I thought I had cleaned those up years ago, but obviously not everywhere or they changed again. ------------- Marked as reviewed by dcubed (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/17741#pullrequestreview-1981124794 From pchilanomate at openjdk.org Thu Apr 4 21:10:22 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Thu, 4 Apr 2024 21:10:22 GMT Subject: RFR: 8325469: Freeze/Thaw code can crash in the presence of OSR frames Message-ID: Freeze/thaw code assumes that a compiled frame for a method where num_stack_arg_slots() > 0 will always have the arguments set up above the metadata at the bottom of the frame. But when converting an interpreter frame to a compiled frame during OSR we don't explicitly leave room for the stack arguments after popping the interpreter frame. All parameters needed will be read from the "buf" array and stored inside the frame before calling OSR_migration_end(). This mismatch in how the stack looks and what we assume can lead to different crashes. In particular the issue happens when the OSR conversion happens for the bottom-most frame in the stack. If the OSR frame has a caller in the stack then there is no issue on freezing/thawing. I added more details about this in the bug comments. When the OSR conversion happens for the bottom-most frame then a future freeze/thaw can lead to crashes for all cases: freeze_fast/thaw_fast, freeze_fast/thaw_slow, freeze_slow/thaw_slow. When freezing fast, either thawing fast or slow can lead to trying to read past the bottom of the stackChunk or writing below the allocated space in the stack. The freeze slow case is almost okay, except that it uncovered an invalid assert that is triggered if the size of the OSR frame plus all the other frames we freeze takes less space than the size of locals minus parameters of the interpreter frame that was OSR. I also added more details about these in the bug comments. I tested different fixes, but I think the most straightforward one is to add _num_stack_arg_slots in the nmethod class and initialize it accordingly depending on whether the nmethod is an OSR one or not.
The patch includes a new test that exercises all these possible combinations of OSR frame at bottom of stack or not, and then freezing fast/slow and thawing fast/slow. The bottom case where we freeze fast and thaw slow reproduces the originally reported crash. There are actually two different failure modes depending on whether this is a thaw top or return barrier case. The other bottom cases lead to the other crashes described in the bug comments. The new test uncovers another bug besides the OSR issues, but since it's a different one I filed a separate JBS issue (JDK-8329665) and I made this a dependent PR. I tested the current patch with the new test and also ran it through mach5 tiers1-6. Thanks, Patricio ------------- Depends on: https://git.openjdk.org/jdk/pull/18632 Commit messages: - v1 Changes: https://git.openjdk.org/jdk/pull/18637/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18637&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8325469 Stats: 243 lines in 16 files changed: 225 ins; 5 del; 13 mod Patch: https://git.openjdk.org/jdk/pull/18637.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18637/head:pull/18637 PR: https://git.openjdk.org/jdk/pull/18637 From mikael at openjdk.org Thu Apr 4 21:59:12 2024 From: mikael at openjdk.org (Mikael Vidstedt) Date: Thu, 4 Apr 2024 21:59:12 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF In-Reply-To: <0cUurmXlMJ_B66Wy1umd2n4r9ve7_Q4WOU0ffMd8s5Y=.bbc93b65-382c-4139-aaec-cb835d94a06e@github.com> References: <0cUurmXlMJ_B66Wy1umd2n4r9ve7_Q4WOU0ffMd8s5Y=.bbc93b65-382c-4139-aaec-cb835d94a06e@github.com> Message-ID: On Wed, 3 Apr 2024 14:40:42 GMT, Hamlin Li wrote: > Hi, > Can you help to review the patch? > This pr is based on previous work and discussion in [pr 16234](https://github.com/openjdk/jdk/pull/16234), [pr 18294](https://github.com/openjdk/jdk/pull/18294).
> > Compared with previous prs, the major change in this pr is to integrate the source of sleef (for the steps, please check `src/jdk.incubator.vector/linux/native/libvectormath/README`), rather than depends on external sleef things (header or lib) at build or run time. > Besides of this change, also modify the previous changes accordingly, e.g. remove some uncessary files or changes especially in make dir of jdk. > > Besides of the code changes, one important task is to handle the legal process. > > Thanks! make/modules/jdk.incubator.vector/Lib.gmk line 44: > 42: $(eval $(call SetupJdkLibrary, BUILD_LIBVECTORMATH, \ > 43: NAME := vectormath, \ > 44: CFLAGS := $(CFLAGS_JDKLIB) -Wno-error=unused-function, \ Should the unused-function be passed in using `DISABLE_WARNINGS_*` instead? src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 8601: > 8599: } > 8600: } else { > 8601: log_info(library)("Failed to load native vector math library!"); Include the `ebuf` message? The corresponding x86_64 code could also use a log message for the error case. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18605#discussion_r1552502695 PR Review Comment: https://git.openjdk.org/jdk/pull/18605#discussion_r1552499482 From cslucas at openjdk.org Thu Apr 4 23:27:13 2024 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Thu, 4 Apr 2024 23:27:13 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission [v11] In-Reply-To: References: Message-ID: On Wed, 27 Mar 2024 16:12:26 GMT, Boris Ulasevich wrote: >> Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 12 commits: >> >> - Merge remote-tracking branch 'origin/master' into reuse-macroasm >> - Fix AArch64 build & improve comment about InstructionMark >> - Catching up with changes in master >> - Catching up with origin/master >> - Catch up with origin/master >> - Merge with origin/master >> - Fix build, copyright dates, m4 files. >> - Fix merge >> - Catch up with master branch. >> >> Merge remote-tracking branch 'origin/master' into reuse-macroasm >> - Some inst_mark fixes; Catch up with master. >> - ... and 2 more: https://git.openjdk.org/jdk/compare/89e0889a...b4d73c98 > > FYI. Something goes wrong with the change on ARM32. > > # > # A fatal error has been detected by the Java Runtime Environment: > # > # Internal Error (/ws/workspace/jdk-dev/label/linux-arm/type/b11/jdk/src/hotspot/share/asm/codeBuffer.hpp:163), pid=10782, tid=10796 > # assert(_mark != nullptr) failed: not an offset > # > # JRE version: OpenJDK Runtime Environment (23.0) (fastdebug build 23-commit8fc9097b-adhoc.re.jdk) > # Java VM: OpenJDK Server VM (fastdebug 23-commit8fc9097b-adhoc.re.jdk, mixed mode, g1 gc, linux-arm) > # Problematic frame: > # V [libjvm.so+0x136ccc] emit_call_reloc(C2_MacroAssembler*, MachCallNode const*, MachOper*, RelocationHolder const&)+0x2ac > # > # Core dump will be written. 
Default location: Core dumps may be processed with "/usr/share/apport/apport %p %s %c %d %P %E" (or dumping to /ws/workspace/jdk-dev-jtreg/label/linux-arm/suite/jdk-tier1/type/t11/core.10782) > # > # If you would like to submit a bug report, please visit: > # https://bugreport.java.com/bugreport/crash.jsp > # > > --------------- S U M M A R Y ------------ > > Command Line: > > Host: vm-ubuntu-16v4-aarch64-1, ARM, 4 cores, 7G, Ubuntu 16.04.7 LTS > Time: Wed Mar 27 07:16:41 2024 UTC elapsed time: 0.097440 seconds (0d 0h 0m 0s) > > --------------- T H R E A D --------------- > > Current thread (0xb120cd10): JavaThread "C2 CompilerThread0" daemon [_thread_in_vm, id=10796, stack(0xb1090000,0xb1110000) (512K)] > > Stack: [0xb1090000,0xb1110000], sp=0xb110d200, free space=500k > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0x136ccc] emit_call_reloc(C2_MacroAssembler*, MachCallNode const*, MachOper*, RelocationHolder const&)+0x2ac (codeBuffer.hpp:163) > V [libjvm.so+0x14ca28] CallRuntimeDirectNode::emit(C2_MacroAssembler*, PhaseRegAlloc*) const+0x80 > V [libjvm.so+0x10ed850] PhaseOutput::scratch_emit_size(Node const*)+0x37c > V [libjvm.so+0x10e5f64] PhaseOutput::shorten_branches(unsigned int*)+0x274 > V [libjvm.so+0x10f6dcc] PhaseOutput::Output()+0x488 > V [libjvm.so+0x699b54] Compile::Code_Gen()+0x424 > V [libjvm.so+0x69a87c] Compile::Compile(ciEnv*, TypeFunc const* (*)(), unsigned char*, char const*, int, bool, bool, DirectiveSet*)+0xb3c > V [libjvm.so+0x122e73c] OptoRuntime::generate_stub(ciEnv*, TypeFunc const* (*)(), unsigned char*, char const*, int, bool, bool)+0xb4 > V [libjvm.so+0x122ec68] OptoRuntime::generate(ciEnv*)+0x50 > V [libjvm.so+0x4994cc] C2Compiler::init_c2_runtime()+0x104 > V [libjvm.so+0x4996dc] C2Compiler::initialize()+0x9c > V [libjvm.so+... @bulasevich - Is the test that failed one of JDK jtreg tests? Did you include any additional JVM parameter to run the test? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16484#issuecomment-2038435787 From dholmes at openjdk.org Fri Apr 5 01:12:13 2024 From: dholmes at openjdk.org (David Holmes) Date: Fri, 5 Apr 2024 01:12:13 GMT Subject: RFR: 8325303: Replace markWord.is_neutral() with markWord.is_unlocked() In-Reply-To: References: Message-ID: On Thu, 4 Apr 2024 08:26:44 GMT, Stefan Karlsson wrote: >> This is a small tidy up to try and remove confusion between checking `is_neutral` (a general state normally associated with a displaced markword in a "pristine" state) and `is_unlocked` (a specific state within the locking protocol). The underlying bit-pattern is the same and so these have been used somewhat synonymously/interchangeably. >> >> A few comment tweaks too. >> >> Testing: tiers 1-3 (sanity) >> >> Thanks. > > This seems reasonable to me. Are there any use-cases of `is_neutral()` left? Could you explain why we use `is_neutral()` there and not `is_locked()`? Thanks for the reviews @stefank and @dcubed-ojdk. > Are there any use-cases of is_neutral() left? Could you explain why we use is_neutral() there and not is_locked()? As Dan indicated (thanks Dan) yes there remain uses of `is_neutral` associated with inspection of the displaced markword. The displaced markword is (mostly) used when the associated object is locked, but the displaced markword itself contains the unlocked bit pattern. So I decided to keep the `is_neutral` terminology in those cases to avoid potential confusion. As this doesn't seem to be a sticking point I will proceed with integration. Thanks again.
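As a rough illustration of the point made in the reply above, the following self-contained C++ toy shows two predicates that test the same low bits, and a locked object whose displaced copy still carries the unlocked pattern. Everything here is an assumption for illustration: `ToyMarkWord`, `ToyLockedObject` and `toy_stack_lock` are hypothetical, and the two-bit encoding (binary 01 meaning unlocked, 00 meaning stack-locked) is recalled from memory rather than quoted from the HotSpot sources.

```cpp
#include <cassert>
#include <cstdint>

// Assumed lock-bit encoding in the low two bits of the header word.
constexpr std::uintptr_t kLockMask      = 0b11;
constexpr std::uintptr_t kLockedValue   = 0b00;
constexpr std::uintptr_t kUnlockedValue = 0b01;

// Toy header word (hypothetical, not HotSpot's markWord class).
struct ToyMarkWord {
  std::uintptr_t bits;

  // Both predicates inspect the same bits; only the intent behind the name differs.
  constexpr bool is_unlocked() const { return (bits & kLockMask) == kUnlockedValue; }
  constexpr bool is_neutral() const { return (bits & kLockMask) == kUnlockedValue; }
};

// A locked object in this toy: the header points at a lock record (low bits 00),
// and the original header is kept aside as the displaced copy.
struct ToyLockedObject {
  ToyMarkWord header;
  ToyMarkWord displaced;
};

ToyLockedObject toy_stack_lock(ToyMarkWord original, std::uintptr_t lock_record_addr) {
  ToyMarkWord locked{(lock_record_addr & ~kLockMask) | kLockedValue};
  return ToyLockedObject{locked, original};
}
```

Under these assumptions the object's header reads as locked while its displaced copy still answers true to `is_neutral()`, which matches the naming rationale given in the reply.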
------------- PR Comment: https://git.openjdk.org/jdk/pull/17741#issuecomment-2038569301 From dholmes at openjdk.org Fri Apr 5 01:12:13 2024 From: dholmes at openjdk.org (David Holmes) Date: Fri, 5 Apr 2024 01:12:13 GMT Subject: Integrated: 8325303: Replace markWord.is_neutral() with markWord.is_unlocked() In-Reply-To: References: Message-ID: <2y3duoF-jjuYzpiI4EsCv59K7mFqBduMjJY5CG-QWpo=.ebc4e64a-7364-4dfa-993e-92438db64e07@github.com> On Wed, 7 Feb 2024 01:52:00 GMT, David Holmes wrote: > This is a small tidy up to try and remove confusion between checking `is_neutral` (a general state normally associated with a displaced markword in a "pristine" state) and `is_unlocked` (a specific state within the locking protocol). The underlying bit-pattern is the same and so these have been used somewhat synonymously/interchangeably. > > A few comment tweaks too. > > Testing: tiers 1-3 (sanity) > > Thanks. This pull request has now been integrated. Changeset: 34f7974a Author: David Holmes URL: https://git.openjdk.org/jdk/commit/34f7974a40850f89b022a6254beab72f7811c85e Stats: 26 lines in 5 files changed: 2 ins; 2 del; 22 mod 8325303: Replace markWord.is_neutral() with markWord.is_unlocked() Reviewed-by: stefank, dcubed ------------- PR: https://git.openjdk.org/jdk/pull/17741 From bulasevich at openjdk.org Fri Apr 5 02:11:13 2024 From: bulasevich at openjdk.org (Boris Ulasevich) Date: Fri, 5 Apr 2024 02:11:13 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission [v11] In-Reply-To: References: Message-ID: On Wed, 27 Mar 2024 16:12:26 GMT, Boris Ulasevich wrote: >> Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 12 commits: >> >> - Merge remote-tracking branch 'origin/master' into reuse-macroasm >> - Fix AArch64 build & improve comment about InstructionMark >> - Catching up with changes in master >> - Catching up with origin/master >> - Catch up with origin/master >> - Merge with origin/master >> - Fix build, copyright dates, m4 files. >> - Fix merge >> - Catch up with master branch. >> >> Merge remote-tracking branch 'origin/master' into reuse-macroasm >> - Some inst_mark fixes; Catch up with master. >> - ... and 2 more: https://git.openjdk.org/jdk/compare/89e0889a...b4d73c98 > > FYI. Something goes wrong with the change on ARM32. > > # > # A fatal error has been detected by the Java Runtime Environment: > # > # Internal Error (/ws/workspace/jdk-dev/label/linux-arm/type/b11/jdk/src/hotspot/share/asm/codeBuffer.hpp:163), pid=10782, tid=10796 > # assert(_mark != nullptr) failed: not an offset > # > # JRE version: OpenJDK Runtime Environment (23.0) (fastdebug build 23-commit8fc9097b-adhoc.re.jdk) > # Java VM: OpenJDK Server VM (fastdebug 23-commit8fc9097b-adhoc.re.jdk, mixed mode, g1 gc, linux-arm) > # Problematic frame: > # V [libjvm.so+0x136ccc] emit_call_reloc(C2_MacroAssembler*, MachCallNode const*, MachOper*, RelocationHolder const&)+0x2ac > # > # Core dump will be written. 
Default location: Core dumps may be processed with "/usr/share/apport/apport %p %s %c %d %P %E" (or dumping to /ws/workspace/jdk-dev-jtreg/label/linux-arm/suite/jdk-tier1/type/t11/core.10782) > # > # If you would like to submit a bug report, please visit: > # https://bugreport.java.com/bugreport/crash.jsp > # > > --------------- S U M M A R Y ------------ > > Command Line: > > Host: vm-ubuntu-16v4-aarch64-1, ARM, 4 cores, 7G, Ubuntu 16.04.7 LTS > Time: Wed Mar 27 07:16:41 2024 UTC elapsed time: 0.097440 seconds (0d 0h 0m 0s) > > --------------- T H R E A D --------------- > > Current thread (0xb120cd10): JavaThread "C2 CompilerThread0" daemon [_thread_in_vm, id=10796, stack(0xb1090000,0xb1110000) (512K)] > > Stack: [0xb1090000,0xb1110000], sp=0xb110d200, free space=500k > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0x136ccc] emit_call_reloc(C2_MacroAssembler*, MachCallNode const*, MachOper*, RelocationHolder const&)+0x2ac (codeBuffer.hpp:163) > V [libjvm.so+0x14ca28] CallRuntimeDirectNode::emit(C2_MacroAssembler*, PhaseRegAlloc*) const+0x80 > V [libjvm.so+0x10ed850] PhaseOutput::scratch_emit_size(Node const*)+0x37c > V [libjvm.so+0x10e5f64] PhaseOutput::shorten_branches(unsigned int*)+0x274 > V [libjvm.so+0x10f6dcc] PhaseOutput::Output()+0x488 > V [libjvm.so+0x699b54] Compile::Code_Gen()+0x424 > V [libjvm.so+0x69a87c] Compile::Compile(ciEnv*, TypeFunc const* (*)(), unsigned char*, char const*, int, bool, bool, DirectiveSet*)+0xb3c > V [libjvm.so+0x122e73c] OptoRuntime::generate_stub(ciEnv*, TypeFunc const* (*)(), unsigned char*, char const*, int, bool, bool)+0xb4 > V [libjvm.so+0x122ec68] OptoRuntime::generate(ciEnv*)+0x50 > V [libjvm.so+0x4994cc] C2Compiler::init_c2_runtime()+0x104 > V [libjvm.so+0x4996dc] C2Compiler::initialize()+0x9c > V [libjvm.so+... > @bulasevich - Is the test that failed one of JDK jtreg tests? Did you include any additional JVM parameter to run the test? 
This happens on VM startup with empty params (no test). ------------- PR Comment: https://git.openjdk.org/jdk/pull/16484#issuecomment-2038631560 From dlong at openjdk.org Fri Apr 5 02:43:04 2024 From: dlong at openjdk.org (Dean Long) Date: Fri, 5 Apr 2024 02:43:04 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v4] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: On Wed, 3 Apr 2024 15:15:24 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. >> >> Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). >> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Fix Windows I think the right approach is to turn it into a loop in the IR, which I think is what Doug was implying. That way C2 can do all its usual optimizations, like unrolling, vectorization, and redundant store elimination (if it is an on-heap primitive array that was just allocated, then there is no need to zero the parts that are being "set"). ------------- Changes requested by dlong (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18555#pullrequestreview-1981533209 From amitkumar at openjdk.org Fri Apr 5 04:04:32 2024 From: amitkumar at openjdk.org (Amit Kumar) Date: Fri, 5 Apr 2024 04:04:32 GMT Subject: RFR: 8310513: [s390x] Intrinsify recursive ObjectMonitor locking Message-ID: s390 implementation of [JDK-8277180](https://bugs.openjdk.org/browse/JDK-8277180). 
PPC implementation for the same: https://github.com/openjdk/jdk/pull/7305 I had tested `tier1` on `fastdebug`, `release` vm. BenchMarking: ./build/linux-s390x-server-release/jdk/bin/java -Xms4g -Xmx4g -jar dacapo-9.12-MR1-bach.jar h2 -s huge -t 1 -n 1 without patch: ===== DaCapo 9.12-MR1 h2 PASSED in 223023 msec ===== ===== DaCapo 9.12-MR1 h2 PASSED in 225686 msec ===== ===== DaCapo 9.12-MR1 h2 PASSED in 219824 msec ===== ===== DaCapo 9.12-MR1 h2 PASSED in 226719 msec ===== with patch: ===== DaCapo 9.12-MR1 h2 PASSED in 167816 msec ===== ===== DaCapo 9.12-MR1 h2 PASSED in 174368 msec ===== ===== DaCapo 9.12-MR1 h2 PASSED in 170517 msec ===== ===== DaCapo 9.12-MR1 h2 PASSED in 169349 msec ===== ------------- Commit messages: - 8310513: [s390x] Intrinsify recursive ObjectMonitor locking Changes: https://git.openjdk.org/jdk/pull/17975/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17975&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8310513 Stats: 74 lines in 1 file changed: 28 ins; 14 del; 32 mod Patch: https://git.openjdk.org/jdk/pull/17975.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17975/head:pull/17975 PR: https://git.openjdk.org/jdk/pull/17975 From lucy at openjdk.org Fri Apr 5 04:04:32 2024 From: lucy at openjdk.org (Lutz Schmidt) Date: Fri, 5 Apr 2024 04:04:32 GMT Subject: RFR: 8310513: [s390x] Intrinsify recursive ObjectMonitor locking In-Reply-To: References: Message-ID: <2-LvRdoEDtvjFf1i2BXwPU5PHhuLyWW6R2ARACaOgmU=.96360670-507b-4681-9f84-fe00d484309e@github.com> On Fri, 23 Feb 2024 05:23:29 GMT, Amit Kumar wrote: > s390 implementation of [JDK-8277180](https://bugs.openjdk.org/browse/JDK-8277180). PPC implementation for the same: https://github.com/openjdk/jdk/pull/7305 > > I had tested `tier1` on `fastdebug`, `release` vm. 
> > BenchMarking: > > > ./build/linux-s390x-server-release/jdk/bin/java -Xms4g -Xmx4g -jar dacapo-9.12-MR1-bach.jar h2 -s huge -t 1 -n 1 > > without patch: > ===== DaCapo 9.12-MR1 h2 PASSED in 223023 msec ===== > ===== DaCapo 9.12-MR1 h2 PASSED in 225686 msec ===== > ===== DaCapo 9.12-MR1 h2 PASSED in 219824 msec ===== > ===== DaCapo 9.12-MR1 h2 PASSED in 226719 msec ===== > > > > with patch: > ===== DaCapo 9.12-MR1 h2 PASSED in 167816 msec ===== > ===== DaCapo 9.12-MR1 h2 PASSED in 174368 msec ===== > ===== DaCapo 9.12-MR1 h2 PASSED in 170517 msec ===== > ===== DaCapo 9.12-MR1 h2 PASSED in 169349 msec ===== Changes requested by lucy (Reviewer). src/hotspot/cpu/s390/macroAssembler_s390.cpp line 3280: > 3278: > 3279: // Current thread already owns the lock. Just increment recursion. > 3280: z_asi(Address(currentHeader, OM_OFFSET_NO_MONITOR_VALUE_TAG(recursions)), 1); This add will compromise your CC setting. Recursive locking was successful, so you need to maintain an "EQUAL" condition code. src/hotspot/cpu/s390/macroAssembler_s390.cpp line 3350: > 3348: // Recursive inflated unlock > 3349: z_asi(Address(currentHeader, OM_OFFSET_NO_MONITOR_VALUE_TAG(recursions)), -1); > 3350: z_cgr(currentHeader, currentHeader); // set the CC 1 Bad comment. This instruction sets the CC to 0b00 ("EQUAL"). That corresponds to a condition code mask value of 0x8. 
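To make the condition-code point above concrete, here is a minimal C++ sketch. It assumes a simplified model of the z/Architecture condition code (overflow, CC 3, is ignored), and every function name in it is hypothetical; none of them are HotSpot or assembler APIs. It only illustrates why an add that produces a non-zero result cannot leave the CC at "EQUAL", while comparing a register with itself can.

```cpp
#include <cassert>
#include <cstdint>

// Simplified model (assumption): signed add sets CC 0 = result zero,
// CC 1 = result negative, CC 2 = result positive. Overflow (CC 3) is ignored.
constexpr int cc_after_add(int64_t value, int64_t imm) {
  return (value + imm == 0) ? 0 : ((value + imm < 0) ? 1 : 2);
}

// Signed compare sets CC 0 = equal, CC 1 = first operand low, CC 2 = first operand high.
constexpr int cc_after_compare(int64_t a, int64_t b) {
  return (a == b) ? 0 : ((a < b) ? 1 : 2);
}

// A 4-bit branch mask selects CC 0, 1, 2, 3 with mask bits 8, 4, 2, 1 respectively.
constexpr bool branch_taken(unsigned mask, int cc) {
  return ((mask >> (3 - cc)) & 1u) != 0;
}

// Hypothetical walk-through of the recursive-lock path discussed in the review.
inline void demo_recursive_lock_cc() {
  const unsigned kBranchEqual = 0x8;   // mask that selects CC 0 only
  int cc = cc_after_add(0, 1);         // recursions 0 -> 1 leaves CC 2, not "EQUAL"
  assert(cc == 2 && !branch_taken(kBranchEqual, cc));
  cc = cc_after_compare(42, 42);       // comparing a value with itself yields CC 0
  assert(cc == 0 && branch_taken(kBranchEqual, cc));
}
```

In this model, the increment has to be followed by something that restores CC 0 (for example, a compare of a register with itself) before a caller can test for "EQUAL".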
src/hotspot/cpu/s390/macroAssembler_s390.cpp line 3351: > 3349: z_asi(Address(currentHeader, OM_OFFSET_NO_MONITOR_VALUE_TAG(recursions)), -1); > 3350: z_cgr(currentHeader, currentHeader); // set the CC 1 > 3351: z_bre(done); For clarity, I would prefer to use z_bru(done); ------------- PR Review: https://git.openjdk.org/jdk/pull/17975#pullrequestreview-1900834942 PR Review Comment: https://git.openjdk.org/jdk/pull/17975#discussion_r1502587029 PR Review Comment: https://git.openjdk.org/jdk/pull/17975#discussion_r1502592243 PR Review Comment: https://git.openjdk.org/jdk/pull/17975#discussion_r1502593386 From amitkumar at openjdk.org Fri Apr 5 04:04:32 2024 From: amitkumar at openjdk.org (Amit Kumar) Date: Fri, 5 Apr 2024 04:04:32 GMT Subject: RFR: 8310513: [s390x] Intrinsify recursive ObjectMonitor locking In-Reply-To: References: Message-ID: On Fri, 23 Feb 2024 05:23:29 GMT, Amit Kumar wrote: > s390 implementation of [JDK-8277180](https://bugs.openjdk.org/browse/JDK-8277180). PPC implementation for the same: https://github.com/openjdk/jdk/pull/7305 > > I had tested `tier1` on `fastdebug`, `release` vm. > > BenchMarking: > > > ./build/linux-s390x-server-release/jdk/bin/java -Xms4g -Xmx4g -jar dacapo-9.12-MR1-bach.jar h2 -s huge -t 1 -n 1 > > without patch: > ===== DaCapo 9.12-MR1 h2 PASSED in 223023 msec ===== > ===== DaCapo 9.12-MR1 h2 PASSED in 225686 msec ===== > ===== DaCapo 9.12-MR1 h2 PASSED in 219824 msec ===== > ===== DaCapo 9.12-MR1 h2 PASSED in 226719 msec ===== > > > > with patch: > ===== DaCapo 9.12-MR1 h2 PASSED in 167816 msec ===== > ===== DaCapo 9.12-MR1 h2 PASSED in 174368 msec ===== > ===== DaCapo 9.12-MR1 h2 PASSED in 170517 msec ===== > ===== DaCapo 9.12-MR1 h2 PASSED in 169349 msec ===== src/hotspot/cpu/s390/macroAssembler_s390.cpp line 3207: > 3205: > 3206: if (DiagnoseSyncOnValueBasedClasses != 0) { > 3207: load_klass(Z_R1_scratch, oop); @RealLucy if we use `temp` here instead of Z_R1, do you think there will be issues ? 
It seems temp is free at this point. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17975#discussion_r1513788397 From lucy at openjdk.org Fri Apr 5 04:04:32 2024 From: lucy at openjdk.org (Lutz Schmidt) Date: Fri, 5 Apr 2024 04:04:32 GMT Subject: RFR: 8310513: [s390x] Intrinsify recursive ObjectMonitor locking In-Reply-To: References: Message-ID: On Wed, 6 Mar 2024 04:02:42 GMT, Amit Kumar wrote: >> s390 implementation of [JDK-8277180](https://bugs.openjdk.org/browse/JDK-8277180). PPC implementation for the same: https://github.com/openjdk/jdk/pull/7305 >> >> I had tested `tier1` on `fastdebug`, `release` vm. >> >> BenchMarking: >> >> >> ./build/linux-s390x-server-release/jdk/bin/java -Xms4g -Xmx4g -jar dacapo-9.12-MR1-bach.jar h2 -s huge -t 1 -n 1 >> >> without patch: >> ===== DaCapo 9.12-MR1 h2 PASSED in 223023 msec ===== >> ===== DaCapo 9.12-MR1 h2 PASSED in 225686 msec ===== >> ===== DaCapo 9.12-MR1 h2 PASSED in 219824 msec ===== >> ===== DaCapo 9.12-MR1 h2 PASSED in 226719 msec ===== >> >> >> >> with patch: >> ===== DaCapo 9.12-MR1 h2 PASSED in 167816 msec ===== >> ===== DaCapo 9.12-MR1 h2 PASSED in 174368 msec ===== >> ===== DaCapo 9.12-MR1 h2 PASSED in 170517 msec ===== >> ===== DaCapo 9.12-MR1 h2 PASSED in 169349 msec ===== > > src/hotspot/cpu/s390/macroAssembler_s390.cpp line 3207: > >> 3205: >> 3206: if (DiagnoseSyncOnValueBasedClasses != 0) { >> 3207: load_klass(Z_R1_scratch, oop); > > @RealLucy if we use `temp` here instead of Z_R1, do you think there will be issues ? It seems temp is free at this point. Using temp here actually is a good idea. As you know, using the scratch registers (Z_R0 and Z_R1) across calls is risky. You need to know exactly if they are used as scratch further down the call hierarchy. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17975#discussion_r1514804098 From amitkumar at openjdk.org Fri Apr 5 04:04:32 2024 From: amitkumar at openjdk.org (Amit Kumar) Date: Fri, 5 Apr 2024 04:04:32 GMT Subject: RFR: 8310513: [s390x] Intrinsify recursive ObjectMonitor locking In-Reply-To: References: Message-ID: On Wed, 6 Mar 2024 16:34:26 GMT, Lutz Schmidt wrote: >> src/hotspot/cpu/s390/macroAssembler_s390.cpp line 3207: >> >>> 3205: >>> 3206: if (DiagnoseSyncOnValueBasedClasses != 0) { >>> 3207: load_klass(Z_R1_scratch, oop); >> >> @RealLucy if we use `temp` here instead of Z_R1, do you think there will be issues ? It seems temp is free at this point. > > Using temp here actually is a good idea. As you know, using the scratch registers (Z_R0 and Z_R1) across calls is risky. You need to know exactly if they are used as scratch further down the call hierarchy. done. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17975#discussion_r1515391375 From dholmes at openjdk.org Fri Apr 5 05:04:03 2024 From: dholmes at openjdk.org (David Holmes) Date: Fri, 5 Apr 2024 05:04:03 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v2] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: On Tue, 2 Apr 2024 02:16:07 GMT, David Holmes wrote: >> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: >> >> Use non-sse fill (old left in) > > This looks like it is still a Draft/work-in-progress. There is only code for x64 and it doesn't appear it will build on other platforms. Also there are still a bunch of `if 0` in the code that should not be there. > @dholmes-ora Sorry for the dead code left in. It is gone now. Plus, this was only requested for x86, thus no implementation for other platforms. Only requested by whom? The JBS issue says nothing about that. 
I'm not even sure how this avoids the `CheckIntrinsics` check for missing intrinsics ... I guess it must only look for some kind of declaration, not an actual implementation. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18555#issuecomment-2038941462 From dholmes at openjdk.org Fri Apr 5 05:26:11 2024 From: dholmes at openjdk.org (David Holmes) Date: Fri, 5 Apr 2024 05:26:11 GMT Subject: RFR: 8329665: fatal error: memory leak: allocating without ResourceMark In-Reply-To: References: Message-ID: On Thu, 4 Apr 2024 16:23:50 GMT, Patricio Chilano Mateo wrote: > There are two places in Loom code that call f.oops_interpreted_do() to process oops in the stackChunk. Although not obvious, this method seems to require a ResourceMark in scope, and there are several contexts where these two are called where we don't have one. A ResourceMark is needed because OopMapCache::compute_one_oop_map() might allocate from the resource area if _mask_size is > 4 * BitsPerWord, which depends on the number of locals + expression stack slots of the corresponding method. But ~InterpreterOopMap already checks if the _bit_mask was allocated in the resource area and in that case it will free it. So the ResourceMark is not strictly needed, except that in debug mode we will actually hit the assert if there is not one in scope when trying to allocate the _bit_mask. > > Thanks, > Patricio Okay. The memory management of that code "smells" a bit, but your fix addresses the observed issue. Thanks. ------------- Marked as reviewed by dholmes (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/18632#pullrequestreview-1981961395 From dlong at openjdk.org Fri Apr 5 05:49:10 2024 From: dlong at openjdk.org (Dean Long) Date: Fri, 5 Apr 2024 05:49:10 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v4] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: On Wed, 3 Apr 2024 15:15:24 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. >> >> Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). >> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Fix Windows As an experiment, couldn't you have the C2 intrinsic redirect to a Java helper that calls putByte() in a loop? ------------- PR Comment: https://git.openjdk.org/jdk/pull/18555#issuecomment-2038994043 From eosterlund at openjdk.org Fri Apr 5 06:11:32 2024 From: eosterlund at openjdk.org (Erik Österlund) Date: Fri, 5 Apr 2024 06:11:32 GMT Subject: RFR: 8329088: Stack chunk thawing races with concurrent GC stack iteration Message-ID: When we thaw the last frame from a stack chunk, we non-atomically set the stack pointer (sp), and set its argsize to 0. Unfortunately, GC threads may iterate over the frames of the stack chunk concurrently. When initializing their stack frame iterator, they read the sp and argsize racingly.
Since there is no synchronization between the threads, we may observe inconsistent pairs of sp and argsize, for example the updated sp with a stale argsize, or the updated argsize with a stale sp. At the core of the problem, the stack chunks define sp and argsize. The argsize is used to calculate where the bottom of the stack chunk is, which is required to determine if it is empty or not. This patch proposes to switch things around and store the bottom directly in the chunk, instead of argsize. Instead, argsize is calculated from the bottom. By changing the relationship of which property is stored and which property is calculated, we can simplify this code quite a bit. In the new model, is_empty() is true iff sp and bottom are exactly the same. Bottom is only set during freezing, never during thawing. The bottom is initialized whenever the bottom frame is frozen, and left untouched during thawing. Unlike thawing, the freeze operation does not race with the GC by design. Hence we have moved one of the racy mutations to the operation that doesn't race with the GC. The GC is now only exposed to changing sp(). It doesn't matter if it observes the old or new sp(), now that we have removed the only source of inconsistency describing said frame (racing argsize).
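The model described above can be sketched roughly as follows. This is a simplified, hypothetical C++ illustration and not the HotSpot `stackChunkOop` code: the type `ChunkModel`, its fields, and the formula `stack_size - bottom` for argsize are assumptions made for the example (the real layout may also account for frame metadata words). It shows that once bottom is stored and argsize is derived from it, thawing only has to write sp, so a concurrent reader cannot pair a new sp with a stale argsize.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical model of a stack chunk. Offsets are in words from the start of
// the chunk's stack area; frames grow toward lower offsets, so sp <= bottom.
struct ChunkModel {
  const int stack_size;    // total words in the stack area, fixed at allocation
  int bottom;              // written only while freezing
  std::atomic<int> sp;     // the only field that thawing mutates

  ChunkModel(int size, int bottom_offset, int sp_offset)
      : stack_size(size), bottom(bottom_offset), sp(sp_offset) {}

  // Empty exactly when sp has reached bottom.
  bool is_empty() const { return sp.load(std::memory_order_relaxed) == bottom; }

  // Derived from bottom instead of being stored (assumed formula, see lead-in).
  int argsize() const { return stack_size - bottom; }

  // Thawing the last frame touches sp only; bottom is left untouched.
  void thaw_all() { sp.store(bottom, std::memory_order_relaxed); }
};

// Hypothetical walk-through: frames occupy [40, 96), 4 argument words above them.
inline void demo_thaw_last_frame() {
  ChunkModel chunk(100, 96, 40);
  assert(!chunk.is_empty());
  assert(chunk.argsize() == 4);
  chunk.thaw_all();
  assert(chunk.is_empty());
  assert(chunk.argsize() == 4);  // unchanged by thawing in this model
}
```

A reader that loads sp before or after `thaw_all()` sees either a non-empty or an empty chunk, and in both cases the same bottom.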
Testing: tier1-5, manual testing of test/jdk/jdk/internal/vm/Continuation ------------- Commit messages: - 8329088: Stack chunk thawing races with concurrent GC stack iteration Changes: https://git.openjdk.org/jdk/pull/18643/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18643&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8329088 Stats: 120 lines in 12 files changed: 40 ins; 29 del; 51 mod Patch: https://git.openjdk.org/jdk/pull/18643.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18643/head:pull/18643 PR: https://git.openjdk.org/jdk/pull/18643 From stefank at openjdk.org Fri Apr 5 06:14:11 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 5 Apr 2024 06:14:11 GMT Subject: RFR: 8325303: Replace markWord.is_neutral() with markWord.is_unlocked() In-Reply-To: References: Message-ID: On Thu, 4 Apr 2024 08:26:44 GMT, Stefan Karlsson wrote: >> This is a small tidy up to try and remove confusion between checking `is_neutral` (a general state normally associated with a displaced markword in a "pristine" state) and `is_unlocked` (a specific state within the locking protocol). The underlying bit-pattern is the same and so these have been used somewhat synonymously/interchangeably. >> >> A few comment tweaks too. >> >> Testing: tiers 1-3 (sanity) >> >> Thanks. > > This seems reasonable to me. Are there any use-cases of `is_neutral()` left? Could you explain why we use `is_neutral()` there and not `is_locked()`? > Thanks for the reviews @stefank and @dcubed-ojdk . > > > Are there any use-cases of is_neutral() left? Could you explain why we use is_neutral() there and not is_locked()? > > As Dan indicated (thanks Dan) yes there remain uses of `is_neutral` associated with inspection of the displaced markword. The displaced markword is (mostly) used when the associated object is locked, but the displaced markword itself contains the unlocked bit pattern. 
So I decided to keep the `is_neutral` terminology in those cases to avoid potential confusion. > > As this doesn't seem to be a sticking point I will proceed with integration. > > Thanks again. > > /integrate In BasicLock::move_to you renamed `is_neutral` to `is_unlocked`. Should that have stayed as `is_neutral`? - if (displaced_header().is_neutral()) { + if (displaced_header().is_unlocked()) { ------------- PR Comment: https://git.openjdk.org/jdk/pull/17741#issuecomment-2039021509 From stefank at openjdk.org Fri Apr 5 06:24:00 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 5 Apr 2024 06:24:00 GMT Subject: RFR: 8329088: Stack chunk thawing races with concurrent GC stack iteration In-Reply-To: References: Message-ID: On Fri, 5 Apr 2024 05:54:11 GMT, Erik Österlund wrote: > When we thaw the last frame from a stack chunk, we non-atomically set the stack pointer (sp), and set its argsize to 0. Unfortunately, GC threads may iterate over the frames of the stack chunk concurrently. When initializing their stack frame iterator, they read the sp and argsize racingly. Since there is no synchronization between the threads, we may observe inconsistent pairs of sp and argsize, for example the updated sp with a stale argsize, or the updated argsize with a stale sp. > > At the core of the problem, the stack chunks define sp and argsize. The argsize is used to calculate where the bottom of the stack chunk is, which is required to determine if it is empty or not. This patch proposes to switch things around and store the bottom directly in the chunk, instead of argsize. Instead, argsize is calculated from the bottom. By changing the relationship of which property is stored and which property is calculated, we can simplify this code quite a bit. > > In the new model, is_empty() is true iff sp and bottom are exactly the same. Bottom is only set during freezing, never during thawing.
The bottom is initialized whenever the bottom frame is frozen, and left untouched during thawing. Unlike thawing, the freeze operation does not race with the GC by design. Hence we have moved one of the racy mutations to the operation that doesn't race with the GC. The GC is now only exposed to changing sp(). It doesn't matter if it observes the old or new sp(), now that we have removed the only source of inconsistency describing said frame (racing argsize). > > Testing: tier1-5, manual testing of test/jdk/jdk/internal/vm/Continuation Looks good. There are a few nits that could be worth considering. src/hotspot/share/oops/stackChunkOop.inline.hpp line 115: > 113: if (is_empty()) { > 114: return 0; > 115: } else { Should this be removed? src/hotspot/share/runtime/continuationFreezeThaw.cpp line 652: > 650: const int chunk_start_sp = cont_size() + frame::metadata_words; > 651: > 652: chunk->set_max_thawing_size(cont_size()); Should this move be reverted? src/hotspot/share/runtime/continuationJavaClasses.hpp line 110: > 108: static inline void set_size(HeapWord* chunk, int value); > 109: > 110: static inline void set_bottom(HeapWord* chunk, int value); Shouldn't this be moved down to the other set_bottom? The two set_sp functions are kept together. ------------- Marked as reviewed by stefank (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/18643#pullrequestreview-1982024387 PR Review Comment: https://git.openjdk.org/jdk/pull/18643#discussion_r1552982540 PR Review Comment: https://git.openjdk.org/jdk/pull/18643#discussion_r1552983809 PR Review Comment: https://git.openjdk.org/jdk/pull/18643#discussion_r1552985724 From shade at openjdk.org Fri Apr 5 07:13:09 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 5 Apr 2024 07:13:09 GMT Subject: RFR: 8329665: fatal error: memory leak: allocating without ResourceMark In-Reply-To: References: Message-ID: On Thu, 4 Apr 2024 16:23:50 GMT, Patricio Chilano Mateo wrote: > There are two places in Loom code that call f.oops_interpreted_do() to process oops in the stackChunk. Although not obvious, this method seems to require a ResourceMark in scope, and there are several contexts where these two are called where we don't have one. A ResourceMark is needed because OopMapCache::compute_one_oop_map() might allocate from the resource area if _mask_size is > 4 * BitsPerWord, which depends on the number of locals + expression stack slots of the corresponding method. But ~InterpreterOopMap already checks if the _bit_mask was allocated in the resource area and in that case it will free it. So the ResourceMark is not strictly needed, except that in debug mode we will actually hit the assert if there is not one in scope when trying to allocate the _bit_mask. > > Thanks, > Patricio TBH, it seems rather odd to do this in debug mode only, and this far out. I see other places where we do `ResourceMark rm` near `OopMapCache::compute_one_oop_map`. Should we instead do: // process locals & expression stack ResourceMark rm; InterpreterOopMap mask; if (query_oop_map_cache) { m->mask_for(bci, &mask); } else { OopMapCache::compute_one_oop_map(m, bci, &mask); } mask.iterate_oop(&blk); ?
------------- PR Review: https://git.openjdk.org/jdk/pull/18632#pullrequestreview-1982146589 From gli at openjdk.org Fri Apr 5 07:14:33 2024 From: gli at openjdk.org (Guoxiong Li) Date: Fri, 5 Apr 2024 07:14:33 GMT Subject: RFR: 8329603: G1: Merge G1BlockOffsetTablePart into G1BlockOffsetTable [v2] In-Reply-To: References: Message-ID: > Hi all, > > This patch merges `G1BlockOffsetTablePart` into `G1BlockOffsetTable`. The previous fields `_reserved` and `_offset_base` of `G1BlockOffsetTable` are marked as `static` so that they can be shared by BOTs of all the heap regions. > > The tests `make test-tier1_gc` passed locally. Thanks for taking the time to review. > > Best Regards, > -- Guoxiong Guoxiong Li has updated the pull request incrementally with one additional commit since the last revision: Use a simple/unified BOT. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18634/files - new: https://git.openjdk.org/jdk/pull/18634/files/e167a24f..ccafb2f7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18634&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18634&range=00-01 Stats: 89 lines in 8 files changed: 10 ins; 41 del; 38 mod Patch: https://git.openjdk.org/jdk/pull/18634.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18634/head:pull/18634 PR: https://git.openjdk.org/jdk/pull/18634 From gli at openjdk.org Fri Apr 5 07:20:59 2024 From: gli at openjdk.org (Guoxiong Li) Date: Fri, 5 Apr 2024 07:20:59 GMT Subject: RFR: 8329603: G1: Merge G1BlockOffsetTablePart into G1BlockOffsetTable [v2] In-Reply-To: References: Message-ID: On Thu, 4 Apr 2024 17:34:08 GMT, Albert Mingkun Yang wrote: >> Guoxiong Li has updated the pull request incrementally with one additional commit since the last revision: >> >> Use a simple/unified BOT. > > src/hotspot/share/gc/g1/g1BlockOffsetTable.hpp line 56: > >> 54: >> 55: // The region that owns this BOT. 
>> 56: HeapRegion* _hr; > > Is it possible to have a single BOT in `G1CollectedHeap`, with every region holding a pointer to it? I marked the fields `_offset_base` and `_reserved` as `non-static`, removed the field `_hr` and shared a simple/unified BOT for all the heap regions. Since only a few methods need the heap region, I pass the heap region as a parameter to those methods. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18634#discussion_r1553054067 From rcastanedalo at openjdk.org Fri Apr 5 07:36:13 2024 From: rcastanedalo at openjdk.org (Roberto Castañeda Lozano) Date: Fri, 5 Apr 2024 07:36:13 GMT Subject: RFR: 8329261: G1: interpreter post-barrier x86 code asserts index size of wrong buffer In-Reply-To: References: Message-ID: On Thu, 4 Apr 2024 15:38:25 GMT, Kim Barrett wrote: > Looks good. Thanks for reviewing, Kim! ------------- PR Comment: https://git.openjdk.org/jdk/pull/18616#issuecomment-2039144768 From rcastanedalo at openjdk.org Fri Apr 5 07:36:13 2024 From: rcastanedalo at openjdk.org (Roberto Castañeda Lozano) Date: Fri, 5 Apr 2024 07:36:13 GMT Subject: Integrated: 8329261: G1: interpreter post-barrier x86 code asserts index size of wrong buffer In-Reply-To: References: Message-ID: <-qGlrs6aCKfiJu8j5HhsjQI28JEV7t8y97UPl105eFM=.1cc38ff7-8016-4bd9-8fbd-022b5e8d0171@github.com> On Thu, 4 Apr 2024 08:36:37 GMT, Roberto Castañeda Lozano wrote: > This changeset updates an assert in G1's interpreter x86 post-barrier logic so that it refers to the right queue (`G1DirtyCardQueue` rather than pre-barrier's `SATBMarkQueue`) and moves the assert closer to the logic that exploits it. > > Thanks to Kim Barrett for reporting the issue and suggesting the fix. > > **Testing**: built on windows-x64, linux-x64, and macosx-x64. This pull request has now been integrated.
Changeset: 1131bb77 Author: Roberto Castañeda Lozano URL: https://git.openjdk.org/jdk/commit/1131bb77ec94dd131a10df4ba0f3fab32c65c0f2 Stats: 5 lines in 1 file changed: 3 ins; 2 del; 0 mod 8329261: G1: interpreter post-barrier x86 code asserts index size of wrong buffer Reviewed-by: aboldtch, kbarrett ------------- PR: https://git.openjdk.org/jdk/pull/18616 From stefank at openjdk.org Fri Apr 5 07:42:06 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 5 Apr 2024 07:42:06 GMT Subject: RFR: 8329655: Cleanup KlassObj and klassOop names after the PermGen removal [v2] In-Reply-To: References: Message-ID: <64fPE1tLpkAnzHk9HkEF7UoKAW5uejx4cpWsCiCPMDE=.c36728ad-f1d4-4ff3-875f-4abd07a200c0@github.com> On Thu, 4 Apr 2024 12:18:24 GMT, Stefan Karlsson wrote: >> We have a few places that use the terms `KlassObj` and `klassOop` when referring to Klasses. This is old code from before the PermGen removal, when Klasses also were Java objects. >> >> These names tripped me up when I was reading heapInspection.cpp and first thought we were mixing the klass *mirror* objects and klass pointers in the hash code calculation: >> >> // An aligned reference address (typically the least >> // address in the perm gen) used for hashing klass >> // objects. >> HeapWord* _ref; >> ... >> _ref = (HeapWord*) Universe::boolArrayKlassObj(); >> ... >> uint KlassInfoTable::hash(const Klass* p) { >> return (uint)(((uintptr_t)p - (uintptr_t)_ref) >> 2); >> } >> >> >> I propose that we rename these functions (and stop casting the Klass* to a (HeapWord*)). >> >> Tested with serviceability/dcmd/gc/ClassHistogramTest.java but will run this through our lower tiers. > > Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: > > Review Roman Thanks for the reviews!
------------- PR Comment: https://git.openjdk.org/jdk/pull/18618#issuecomment-2039156773 From stefank at openjdk.org Fri Apr 5 07:42:07 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 5 Apr 2024 07:42:07 GMT Subject: Integrated: 8329655: Cleanup KlassObj and klassOop names after the PermGen removal In-Reply-To: References: Message-ID: On Thu, 4 Apr 2024 09:45:58 GMT, Stefan Karlsson wrote: > We have a few places that use the terms `KlassObj` and `klassOop` when referring to Klasses. This is old code from before the PermGen removal, when Klasses also were Java objects. > > These names tripped me up when I was reading heapInspection.cpp and first thought we were mixing the klass *mirror* objects and klass pointers in the hash code calculation: > > // An aligned reference address (typically the least > // address in the perm gen) used for hashing klass > // objects. > HeapWord* _ref; > ... > _ref = (HeapWord*) Universe::boolArrayKlassObj(); > ... > uint KlassInfoTable::hash(const Klass* p) { > return (uint)(((uintptr_t)p - (uintptr_t)_ref) >> 2); > } > > > I propose that we rename these functions (and stop casting the Klass* to a (HeapWord*)). > > Tested with serviceability/dcmd/gc/ClassHistogramTest.java but will run this through our lower tiers. This pull request has now been integrated.
Changeset: 71d48bcc Author: Stefan Karlsson URL: https://git.openjdk.org/jdk/commit/71d48bcc3d6313ab4bd031b5e50ae3a16338abc8 Stats: 126 lines in 29 files changed: 0 ins; 2 del; 124 mod 8329655: Cleanup KlassObj and klassOop names after the PermGen removal Reviewed-by: rkennke, coleenp ------------- PR: https://git.openjdk.org/jdk/pull/18618 From ayang at openjdk.org Fri Apr 5 08:24:21 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 5 Apr 2024 08:24:21 GMT Subject: RFR: 8329603: G1: Merge G1BlockOffsetTablePart into G1BlockOffsetTable [v2] In-Reply-To: References: Message-ID: On Fri, 5 Apr 2024 07:14:33 GMT, Guoxiong Li wrote: >> Hi all, >> >> This patch merges `G1BlockOffsetTablePart` into `G1BlockOffsetTable`. The previous fields `_reserved` and `_offset_base` of `G1BlockOffsetTable` are marked as `static` so that they can be shared by BOTs of all the heap regions. >> >> The tests `make test-tier1_gc` passed locally. Thanks for taking the time to review. >> >> Best Regards, >> -- Guoxiong > > Guoxiong Li has updated the pull request incrementally with one additional commit since the last revision: > > Use a simple/unified BOT. src/hotspot/share/gc/g1/g1BlockOffsetTable.cpp line 166: > 164: // blk_start > 165: // > 166: void G1BlockOffsetTable::update_for_block_work(HeapWord* blk_start, Some indentation issues. src/hotspot/share/gc/g1/g1BlockOffsetTable.hpp line 43: > 41: // start of the chunk that includes the first word of the subregion. > 42: // > 43: // Each G1BlockOffsetTable is owned by a HeapRegion. Need revision. src/hotspot/share/gc/g1/g1BlockOffsetTable.hpp line 134: > 132: } > 133: > 134: void set_for_starts_humongous(HeapRegion* hr, HeapWord* obj_top, size_t fill_size); I feel this doesn't belong to BOT. Can probably be dealt with in another ticket. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18634#discussion_r1553171210 PR Review Comment: https://git.openjdk.org/jdk/pull/18634#discussion_r1553156558 PR Review Comment: https://git.openjdk.org/jdk/pull/18634#discussion_r1553170536 From gli at openjdk.org Fri Apr 5 08:57:59 2024 From: gli at openjdk.org (Guoxiong Li) Date: Fri, 5 Apr 2024 08:57:59 GMT Subject: RFR: 8329603: G1: Merge G1BlockOffsetTablePart into G1BlockOffsetTable [v2] In-Reply-To: References: Message-ID: On Fri, 5 Apr 2024 08:08:15 GMT, Albert Mingkun Yang wrote: >> Guoxiong Li has updated the pull request incrementally with one additional commit since the last revision: >> >> Use a simple/unified BOT. > > src/hotspot/share/gc/g1/g1BlockOffsetTable.hpp line 43: > >> 41: // start of the chunk that includes the first word of the subregion. >> 42: // >> 43: // Each G1BlockOffsetTable is owned by a HeapRegion. > > Need revision. I think it is good to delete this line and the blank line. What do you think about it? > I feel this doesn't belong to BOT. Can probably be dealt with in another ticket. OK. What about the method `G1BlockOffsetTable::verify`? Is it good to be moved to `HeapRegion` and change the name as `verify_bot`, `verify_BOT` or `verify_block_offset_table`? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18634#discussion_r1553187744 PR Review Comment: https://git.openjdk.org/jdk/pull/18634#discussion_r1553197736 From ayang at openjdk.org Fri Apr 5 09:11:59 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 5 Apr 2024 09:11:59 GMT Subject: RFR: 8329603: G1: Merge G1BlockOffsetTablePart into G1BlockOffsetTable [v2] In-Reply-To: References: Message-ID: On Fri, 5 Apr 2024 08:33:25 GMT, Guoxiong Li wrote: >> src/hotspot/share/gc/g1/g1BlockOffsetTable.hpp line 43: >> >>> 41: // start of the chunk that includes the first word of the subregion. >>> 42: // >>> 43: // Each G1BlockOffsetTable is owned by a HeapRegion. >> >> Need revision. 
> > I think it is good to delete this line and the blank line. What do you think about it? Removing it is fine, IMO. >> src/hotspot/share/gc/g1/g1BlockOffsetTable.hpp line 134: >> >>> 132: } >>> 133: >>> 134: void set_for_starts_humongous(HeapRegion* hr, HeapWord* obj_top, size_t fill_size); >> >> I feel this doesn't belong to BOT. Can probably be dealt with in another ticket. > >> I feel this doesn't belong to BOT. Can probably be dealt with in another ticket. > > OK. > > What about the method `G1BlockOffsetTable::verify`? Would it be good to move it to `HeapRegion` and rename it to `verify_bot`, `verify_BOT` or `verify_block_offset_table`? That sounds reasonable. (Should not be done in this PR though.) (My experience with BOTs is that they are almost never corrupted, so checking only after each write is enough, something like `ObjectStartArray::verify_for_block` -- there is possibly little value in verifying BOT in `HeapRegion::verify`. I wonder what others' opinions are.) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18634#discussion_r1553227172 PR Review Comment: https://git.openjdk.org/jdk/pull/18634#discussion_r1553227054 From jbhateja at openjdk.org Fri Apr 5 09:20:10 2024 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Fri, 5 Apr 2024 09:20:10 GMT Subject: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v2] In-Reply-To: References: Message-ID: On Tue, 2 Apr 2024 19:19:59 GMT, Volodymyr Paprotski wrote: >> Performance. Before: >> >> Benchmark (algorithm) (dataSize) (keyLength) (provider) Mode Cnt Score Error Units >> SignatureBench.ECDSA.sign SHA256withECDSA 1024 256 thrpt 3 6443.934 ± 6.491 ops/s >> SignatureBench.ECDSA.sign SHA256withECDSA 16384 256 thrpt 3 6152.979 ± 4.954 ops/s >> SignatureBench.ECDSA.verify SHA256withECDSA 1024 256 thrpt 3 1895.410 ± 36.979 ops/s >> SignatureBench.ECDSA.verify SHA256withECDSA 16384 256 thrpt 3 1878.955 ±
45.487 ops/s >> Benchmark (algorithm) (keyLength) (kpgAlgorithm) (provider) Mode Cnt Score Error Units >> o.o.b.j.c.full.KeyAgreementBench.EC.generateSecret ECDH 256 EC thrpt 3 1357.810 ± 26.584 ops/s >> o.o.b.j.c.small.KeyAgreementBench.EC.generateSecret ECDH 256 EC thrpt 3 1352.119 ± 23.547 ops/s >> Benchmark (isMontBench) Mode Cnt Score Error Units >> PolynomialP256Bench.benchMultiply false thrpt 3 1746.126 ± 10.970 ops/s >> >> Performance, no intrinsic: >> >> Benchmark (algorithm) (dataSize) (keyLength) (provider) Mode Cnt Score Error Units >> SignatureBench.ECDSA.sign SHA256withECDSA 1024 256 thrpt 3 6529.839 ± 42.420 ops/s >> SignatureBench.ECDSA.sign SHA256withECDSA 16384 256 thrpt 3 6199.747 ± 133.566 ops/s >> SignatureBench.ECDSA.verify SHA256withECDSA 1024 256 thrpt 3 1973.676 ± 54.071 ops/s >> SignatureBench.ECDSA.verify SHA256withECDSA 16384 256 thrpt 3 1932.127 ± 35.920 ops/s >> Benchmark (algorithm) (keyLength) (kpgAlgorithm) (provider) Mode Cnt Score Error Units >> o.o.b.j.c.full.KeyAgreementBench.EC.generateSecret ECDH 256 EC thrpt 3 1355.788 ± 29.858 ops/s >> o.o.b.j.c.small.KeyAgreementBench.EC.generateSecret ECDH 256 EC thrpt 3 1346.523 ± 28.722 ops/s >> Benchmark (isMontBench) Mode Cnt Score Error Units >> PolynomialP256Bench.benchMultiply true thrpt 3 1919.57... > > Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision: > > remove use of jdk.crypto.ec Few early comments. Please update the copyright year of all the modified files. You can even consider splitting this into two patches, Java side changes in one and x86 optimized intrinsic in next one. src/hotspot/cpu/x86/stubGenerator_x86_64_poly_mont.cpp line 39: > 37: }; > 38: static address modulus_p256() { > 39: return (address)MODULUS_P256; Long constants should have UL suffix.
src/hotspot/cpu/x86/stubGenerator_x86_64_poly_mont.cpp line 386: > 384: __ jcc(Assembler::equal, L_Length19); > 385: > 386: // Default copy loop Please add appropriate loop entry alignment. src/hotspot/cpu/x86/stubGenerator_x86_64_poly_mont.cpp line 394: > 392: __ lea(aLimbs, Address(aLimbs,8)); > 393: __ lea(bLimbs, Address(bLimbs,8)); > 394: __ jmp(L_DefaultLoop); Both sub and cmp are flag affecting instructions and are macro-fusible. By doing a loop rotation i.e. moving the length <= 0 check outside the loop and pushing the loop exit check at bottom you can save additional compare checks. ------------- PR Review: https://git.openjdk.org/jdk/pull/18583#pullrequestreview-1981555803 PR Review Comment: https://git.openjdk.org/jdk/pull/18583#discussion_r1553056633 PR Review Comment: https://git.openjdk.org/jdk/pull/18583#discussion_r1552710600 PR Review Comment: https://git.openjdk.org/jdk/pull/18583#discussion_r1553110376 From eosterlund at openjdk.org Fri Apr 5 09:35:22 2024 From: eosterlund at openjdk.org (Erik Österlund) Date: Fri, 5 Apr 2024 09:35:22 GMT Subject: RFR: 8329088: Stack chunk thawing races with concurrent GC stack iteration [v2] In-Reply-To: References: Message-ID: > When we thaw the last frame from a stack chunk, we non-atomically set the stack pointer (sp), and set its argsize to 0. Unfortunately, GC threads may iterate over the frames of the stack chunk concurrently. When initializing their stack frame iterator, they read the sp and argsize racingly. Since there is no synchronization between the threads, we may observe inconsistent pairs of sp and argsize, for example the updated sp with a stale argsize, or the updated argsize with a stale sp. > > At the core of the problem, the stack chunks define sp and argsize. The argsize is used to calculate where the bottom of the stack chunk is, which is required to determine if it is empty or not.
This patch proposes to switch things around and store the bottom directly in the chunk, instead of argsize. Instead, argsize is calculated from the bottom. By changing the relationship of which property is stored and which property is calculated, we can simplify this code quite a bit. > > In the new model, is_empty() is true iff sp and bottom are exactly the same. Bottom is only set during freezing, never during thawing. The bottom is initialized whenever the bottom frame is frozen, and left untouched during thawing. Unlike thawing, the freeze operation does not race with the GC by design. Hence we have moved one of the racy mutations to the operation that doesn't race with the GC. The GC is now only exposed to changing sp(). It doesn't matter if it observes the old or new sp(), now that we have removed the only source of inconsistency describing said frame (racing argsize). > > Testing: tier1-5, manual testing of test/jdk/jdk/internal/vm/Continuation Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: Nits ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18643/files - new: https://git.openjdk.org/jdk/pull/18643/files/8b6cda97..40ea7943 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18643&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18643&range=00-01 Stats: 12 lines in 3 files changed: 3 ins; 8 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18643.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18643/head:pull/18643 PR: https://git.openjdk.org/jdk/pull/18643 From eosterlund at openjdk.org Fri Apr 5 09:35:23 2024 From: eosterlund at openjdk.org (Erik Österlund) Date: Fri, 5 Apr 2024 09:35:23 GMT Subject: RFR: 8329088: Stack chunk thawing races with concurrent GC stack iteration [v2] In-Reply-To: References: Message-ID: On Fri, 5 Apr 2024 06:21:28 GMT, Stefan Karlsson wrote: >> Erik Österlund has updated the pull request incrementally with
one additional commit since the last revision: >> >> Nits > > Looks good. There's a few nits that could be worth considering. Thanks for the review @stefank. I fixed the nits you found. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18643#issuecomment-2039329384 From stefank at openjdk.org Fri Apr 5 09:51:09 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 5 Apr 2024 09:51:09 GMT Subject: RFR: 8329088: Stack chunk thawing races with concurrent GC stack iteration [v2] In-Reply-To: References: Message-ID: On Fri, 5 Apr 2024 09:35:22 GMT, Erik Österlund wrote: >> When we thaw the last frame from a stack chunk, we non-atomically set the stack pointer (sp), and set its argsize to 0. Unfortunately, GC threads may iterate over the frames of the stack chunk concurrently. When initializing their stack frame iterator, they read the sp and argsize racingly. Since there is no synchronization between the threads, we may observe inconsistent pairs of sp and argsize, for example the updated sp with a stale argsize, or the updated argsize with a stale sp. >> >> At the core of the problem, the stack chunks define sp and argsize. The argsize is used to calculate where the bottom of the stack chunk is, which is required to determine if it is empty or not. This patch proposes to switch things around and store the bottom directly in the chunk, instead of argsize. Instead, argsize is calculated from the bottom. By changing the relationship of which property is stored and which property is calculated, we can simplify this code quite a bit. >> >> In the new model, is_empty() is true iff sp and bottom are exactly the same. Bottom is only set during freezing, never during thawing. The bottom is initialized whenever the bottom frame is frozen, and left untouched during thawing. Unlike thawing, the freeze operation does not race with the GC by design. Hence we have moved one of the racy mutations to the operation that doesn't race with the GC.
The GC is now only exposed to changing sp(). It doesn't matter if it observes the old or new sp(), now that we have removed the only source of inconsistency describing said frame (racing argsize). >> >> Testing: tier1-5, manual testing of test/jdk/jdk/internal/vm/Continuation > > Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: > > Nits Marked as reviewed by stefank (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18643#pullrequestreview-1982512749 From gli at openjdk.org Fri Apr 5 10:05:37 2024 From: gli at openjdk.org (Guoxiong Li) Date: Fri, 5 Apr 2024 10:05:37 GMT Subject: RFR: 8329603: G1: Merge G1BlockOffsetTablePart into G1BlockOffsetTable [v3] In-Reply-To: References: Message-ID: > Hi all, > > This patch merges `G1BlockOffsetTablePart` into `G1BlockOffsetTable`. The previous fields `_reserved` and `_offset_base` of `G1BlockOffsetTable` are marked as `static` so that they can be shared by BOTs of all the heap regions. > > The tests `make test-tier1_gc` passed locally. Thanks for taking the time to review. > > Best Regards, > -- Guoxiong Guoxiong Li has updated the pull request incrementally with two additional commits since the last revision: - Remove unnecessary comments. - Fix indentation issue.
------------- Changes: - all: https://git.openjdk.org/jdk/pull/18634/files - new: https://git.openjdk.org/jdk/pull/18634/files/ccafb2f7..a8a121bf Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18634&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18634&range=01-02 Stats: 5 lines in 2 files changed: 0 ins; 4 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18634.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18634/head:pull/18634 PR: https://git.openjdk.org/jdk/pull/18634 From gli at openjdk.org Fri Apr 5 10:05:38 2024 From: gli at openjdk.org (Guoxiong Li) Date: Fri, 5 Apr 2024 10:05:38 GMT Subject: RFR: 8329603: G1: Merge G1BlockOffsetTablePart into G1BlockOffsetTable [v2] In-Reply-To: References: Message-ID: On Fri, 5 Apr 2024 08:19:33 GMT, Albert Mingkun Yang wrote: >> Guoxiong Li has updated the pull request incrementally with one additional commit since the last revision: >> >> Use a simple/unified BOT. > > src/hotspot/share/gc/g1/g1BlockOffsetTable.cpp line 166: > >> 164: // blk_start >> 165: // >> 166: void G1BlockOffsetTable::update_for_block_work(HeapWord* blk_start, > > Some indentation issues. Fixed. I tried to find other indentation issues except this one, but can't find now. Please point it out if you found. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18634#discussion_r1553309413 From gli at openjdk.org Fri Apr 5 10:05:38 2024 From: gli at openjdk.org (Guoxiong Li) Date: Fri, 5 Apr 2024 10:05:38 GMT Subject: RFR: 8329603: G1: Merge G1BlockOffsetTablePart into G1BlockOffsetTable [v2] In-Reply-To: References: Message-ID: On Fri, 5 Apr 2024 09:09:10 GMT, Albert Mingkun Yang wrote: >> I think it is good to delete this line and the blank line. What do you think about it? > > Removing it is fine, IMO. Removed. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18634#discussion_r1553310416 From ayang at openjdk.org Fri Apr 5 10:14:09 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 5 Apr 2024 10:14:09 GMT Subject: RFR: 8329603: G1: Merge G1BlockOffsetTablePart into G1BlockOffsetTable [v3] In-Reply-To: References: Message-ID: On Fri, 5 Apr 2024 10:05:37 GMT, Guoxiong Li wrote: >> Hi all, >> >> This patch merges `G1BlockOffsetTablePart` into `G1BlockOffsetTable`. The previous fields `_reserved` and `_offset_base` of `G1BlockOffsetTable` are marked as `static` so that they can be shared by BOTs of all the heap regions. >> >> The tests `make test-tier1_gc` passed locally. Thanks for taking the time to review. >> >> Best Regards, >> -- Guoxiong > > Guoxiong Li has updated the pull request incrementally with two additional commits since the last revision: > > - Remove unnecessary comments. > - Fix indentation issue. Marked as reviewed by ayang (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18634#pullrequestreview-1982586398 From tschatzl at openjdk.org Fri Apr 5 10:30:00 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Fri, 5 Apr 2024 10:30:00 GMT Subject: RFR: 8328698: oopDesc::klass_raw() decodes without a null check In-Reply-To: References: Message-ID: On Wed, 3 Apr 2024 09:27:16 GMT, Stefan Karlsson wrote: > The oopDesc::klass_raw() function is used when the caller wants to skip asserts. Unfortunately, it skips the check to see if the narrow klass is zero, which could lead to an incorrect Klass* being returned. This patch fixes this. > > In addition to this, I'm trying to make the code a bit clearer, so the patch also contains changes for the following: > > * The word raw has various different meanings in the context of oops and klasses. So, what does it mean in this context? Does it mean "read the klass pointer value without decoding it"? Or does it mean "decode the klass pointer value without any asserts"?
I would like to propose that we use a name that describes that this function is used to skip performing various asserts. > > * I replaced the one usage of load_klass_raw with a call to klass_raw() instead. > > * I restructured the `is_oop_safe` so that we perform the null-check first. Note that `oopDesc::is_oop` performs its own verification of the klass pointer, so if we want extra klass verification in `is_oop_safe` we need to do it before calling the `is_oop` check. > > * I also renamed the _raw functions inside the CompressedKlassPointers klass and moved private functions. > > Tell me if you think some of these should be split up into separate RFEs. > > Tested with tier1-3. Marked as reviewed by tschatzl (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18597#pullrequestreview-1982644482 From gli at openjdk.org Fri Apr 5 11:18:06 2024 From: gli at openjdk.org (Guoxiong Li) Date: Fri, 5 Apr 2024 11:18:06 GMT Subject: RFR: 8329521: Serial: Rename MarkSweep to SerialFullGC [v2] In-Reply-To: <8H7hT2UWFKAMOaP9G3iuvA5SmsWWSAYJzbNjW3ajpUk=.99ea3d0f-3134-42b5-947a-d92bc76e738e@github.com> References: <-wPqqvYOVnL3i0eltWuK9_x7WiGu8OmPCPtkz0Fm0h8=.08b61e62-d319-45cd-a752-d31005c23035@github.com> <8H7hT2UWFKAMOaP9G3iuvA5SmsWWSAYJzbNjW3ajpUk=.99ea3d0f-3134-42b5-947a-d92bc76e738e@github.com> Message-ID: On Thu, 4 Apr 2024 11:02:20 GMT, Albert Mingkun Yang wrote: >> Guoxiong Li has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix order of included files. > > Marked as reviewed by ayang (Reviewer). @albertnetymk @walulyai Thanks for the reviews. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18619#issuecomment-2039536448 From gli at openjdk.org Fri Apr 5 11:18:07 2024 From: gli at openjdk.org (Guoxiong Li) Date: Fri, 5 Apr 2024 11:18:07 GMT Subject: Integrated: 8329521: Serial: Rename MarkSweep to SerialFullGC In-Reply-To: <-wPqqvYOVnL3i0eltWuK9_x7WiGu8OmPCPtkz0Fm0h8=.08b61e62-d319-45cd-a752-d31005c23035@github.com> References: <-wPqqvYOVnL3i0eltWuK9_x7WiGu8OmPCPtkz0Fm0h8=.08b61e62-d319-45cd-a752-d31005c23035@github.com> Message-ID: On Thu, 4 Apr 2024 10:32:43 GMT, Guoxiong Li wrote: > Hi all, > > This patch renames the `MarkSweep` to `SerialFullGC` and fixes some comments related to `MarkSweep`. > > The tests `make test-tier1_gc` passed locally. Thanks for taking the time to review. > > Best Regards, > -- Guoxiong This pull request has now been integrated. Changeset: 27353ad3 Author: Guoxiong Li URL: https://git.openjdk.org/jdk/commit/27353ad367c2342086d8e56ee2412d796d44b664 Stats: 1632 lines in 12 files changed: 807 ins; 808 del; 17 mod 8329521: Serial: Rename MarkSweep to SerialFullGC Reviewed-by: ayang, iwalulya ------------- PR: https://git.openjdk.org/jdk/pull/18619 From shade at openjdk.org Fri Apr 5 11:43:10 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 5 Apr 2024 11:43:10 GMT Subject: RFR: 8329088: Stack chunk thawing races with concurrent GC stack iteration [v2] In-Reply-To: References: Message-ID: On Fri, 5 Apr 2024 09:35:22 GMT, Erik Österlund wrote: >> When we thaw the last frame from a stack chunk, we non-atomically set the stack pointer (sp), and set its argsize to 0. Unfortunately, GC threads may iterate over the frames of the stack chunk concurrently. When initializing their stack frame iterator, they read the sp and argsize racingly. Since there is no synchronization between the threads, we may observe inconsistent pairs of sp and argsize, for example the updated sp with a stale argsize, or the updated argsize with a stale sp.
>> >> At the core of the problem, the stack chunks define sp and argsize. The argsize is used to calculate where the bottom of the stack chunk is, which is required to determine if it is empty or not. This patch proposes to switch things around and store the bottom directly in the chunk, instead of argsize. Instead, argsize is calculated from the bottom. By changing the relationship of which property is stored and which property is calculated, we can simplify this code quite a bit. >> >> In the new model, is_empty() is true iff sp and bottom are exactly the same. Bottom is only set during freezing, never during thawing. The bottom is initialized whenever the bottom frame is frozen, and left untouched during thawing. Unlike thawing, the freeze operation does not race with the GC by design. Hence we have moved one of the racy mutations to the operation that doesn't race with the GC. The GC is now only exposed to changing sp(). It doesn't matter if it observes the old or new sp(), now that we have removed the only source of inconsistency describing said frame (racing argsize). >> >> Testing: tier1-5, manual testing of test/jdk/jdk/internal/vm/Continuation > > Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: > > Nits This looks reasonable, I have a question, though: src/hotspot/share/oops/oop.inline.hpp line 242: > 240: inline void oopDesc::int_field_put(int offset, jint value) { *field_addr(offset) = value; } > 241: inline jint oopDesc::int_field_relaxed(int offset) const { return Atomic::load(field_addr(offset)); } > 242: inline void oopDesc::int_field_put_relaxed(int offset, jint value) { Atomic::store(field_addr(offset), value); } I have a stylistic question/suggestion. These are basically Java heap accessors, shouldn't they go through `HeapAccess::{load,store}`? This would also match the style already used in this file.
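Editorial sketch, not part of the original thread: the model quoted above (store `bottom`, derive `argsize`, let thawing touch only `sp`) can be illustrated roughly as follows. Everything here is an assumption made for illustration: `ChunkSketch`, `kMetadataWordsAtTop` and its value, and the use of `std::atomic` with relaxed ordering as a stand-in for HotSpot's `Atomic::load`/`Atomic::store`. It is not the code from the patch.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical, simplified stand-in for a stack chunk (not HotSpot's stackChunkOopDesc).
struct ChunkSketch {
  static constexpr int kMetadataWordsAtTop = 2;  // assumed value, for illustration only

  int stack_size;        // total words in the chunk, fixed at allocation
  int bottom;            // written only on the freeze path, which does not race with the GC
  std::atomic<int> sp;   // the only field that changes while thawing

  explicit ChunkSketch(int size)
    : stack_size(size), bottom(size - kMetadataWordsAtTop), sp(bottom) {}

  // Freeze path: record where the bottom frame ends, then publish the new sp.
  void freeze_bottom_frame(int frame_words, int argsize) {
    bottom = stack_size - argsize - kMetadataWordsAtTop;
    sp.store(bottom - frame_words, std::memory_order_relaxed);
  }

  // Thaw path: only sp moves; bottom is left untouched.
  void thaw_last_frame() {
    sp.store(bottom, std::memory_order_relaxed);
  }

  // A concurrent reader sees either the old or the new sp; both are consistent with bottom.
  bool is_empty() const {
    return sp.load(std::memory_order_relaxed) == bottom;
  }

  // argsize is derived from bottom rather than stored.
  int argsize() const {
    return stack_size - bottom - kMetadataWordsAtTop;
  }
};
```

In this sketch a reader never needs a consistent (sp, argsize) pair, because only one racy field remains.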
------------- PR Review: https://git.openjdk.org/jdk/pull/18643#pullrequestreview-1982887846 PR Review Comment: https://git.openjdk.org/jdk/pull/18643#discussion_r1553461670 From gli at openjdk.org Fri Apr 5 11:58:10 2024 From: gli at openjdk.org (Guoxiong Li) Date: Fri, 5 Apr 2024 11:58:10 GMT Subject: RFR: 8329603: G1: Merge G1BlockOffsetTablePart into G1BlockOffsetTable [v2] In-Reply-To: References: Message-ID: <3_MpQYk9aE1HnkuA7dU85sboQ1S5FoY1Ym0vjDbHFPs=.b49996e3-e1c1-4a8b-a62e-9f1800b0b0d1@github.com> On Fri, 5 Apr 2024 09:09:04 GMT, Albert Mingkun Yang wrote: >>> I feel this doesn't belong to BOT. Can probably be dealt with in another ticket. >> >> OK. >> >> What about the method `G1BlockOffsetTable::verify`? Is it good to be moved to `HeapRegion` and change the name as `verify_bot`, `verify_BOT` or `verify_block_offset_table`? > > That sounds reasonable. (Should not be done in this PR though.) > > (My experience with BOT is that they are almost never corrupted, so doing only checking-after-each-write is enough, sth like `ObjectStartArray::verify_for_block` -- there is possibly little value in verifying BOT in `HeapRegion::verify`. I wonder what others' opinions are.) > I feel this doesn't belong to BOT. Can probably be dealt with in another ticket. Filed https://bugs.openjdk.org/browse/JDK-8329767 to follow up. > (My experience with BOT is that they are almost never corrupted, so doing only checking-after-each-write is enough, sth like `ObjectStartArray::verify_for_block` -- there is possibly little value in verifying BOT in `HeapRegion::verify`. I wonder what others' opinions are.) The verifications of Serial and Parallel BOTs are similar. I agree that G1 should be adjusted. Filed https://bugs.openjdk.org/browse/JDK-8329771 to follow up. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18634#discussion_r1553497806 From stefank at openjdk.org Fri Apr 5 12:01:17 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 5 Apr 2024 12:01:17 GMT Subject: RFR: 8329750: Change Universe functions to return more specific Klass* types Message-ID: We have various functions in Universe that returns Klass* where they could be returning TypeArrayKlass* and ObjArrayKlass* instead. If we change these functions we could get rid of some casts in the code. Does this seem like a reasonable change? ------------- Commit messages: - 8329750: Change Universe functions to return more specific Klass* types Changes: https://git.openjdk.org/jdk/pull/18652/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18652&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8329750 Stats: 43 lines in 6 files changed: 2 ins; 7 del; 34 mod Patch: https://git.openjdk.org/jdk/pull/18652.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18652/head:pull/18652 PR: https://git.openjdk.org/jdk/pull/18652 From mli at openjdk.org Fri Apr 5 12:17:17 2024 From: mli at openjdk.org (Hamlin Li) Date: Fri, 5 Apr 2024 12:17:17 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v2] In-Reply-To: <0cUurmXlMJ_B66Wy1umd2n4r9ve7_Q4WOU0ffMd8s5Y=.bbc93b65-382c-4139-aaec-cb835d94a06e@github.com> References: <0cUurmXlMJ_B66Wy1umd2n4r9ve7_Q4WOU0ffMd8s5Y=.bbc93b65-382c-4139-aaec-cb835d94a06e@github.com> Message-ID: > Hi, > Can you help to review the patch? > This pr is based on previous work and discussion in [pr 16234](https://github.com/openjdk/jdk/pull/16234), [pr 18294](https://github.com/openjdk/jdk/pull/18294). > > Compared with previous prs, the major change in this pr is to integrate the source of sleef (for the steps, please check `src/jdk.incubator.vector/linux/native/libvectormath/README`), rather than depends on external sleef things (header or lib) at build or run time. 
> Besides this change, also modify the previous changes accordingly, e.g. remove some unnecessary files or changes especially in make dir of jdk. > > Besides the code changes, one important task is to handle the legal process. > > Thanks! Hamlin Li has updated the pull request incrementally with two additional commits since the last revision: - disable unused-function warnings; add log msg - minor ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18605/files - new: https://git.openjdk.org/jdk/pull/18605/files/3ab4795d..34529ff1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18605&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18605&range=00-01 Stats: 8 lines in 4 files changed: 5 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/18605.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18605/head:pull/18605 PR: https://git.openjdk.org/jdk/pull/18605 From mli at openjdk.org Fri Apr 5 12:17:17 2024 From: mli at openjdk.org (Hamlin Li) Date: Fri, 5 Apr 2024 12:17:17 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF In-Reply-To: References: <0cUurmXlMJ_B66Wy1umd2n4r9ve7_Q4WOU0ffMd8s5Y=.bbc93b65-382c-4139-aaec-cb835d94a06e@github.com> Message-ID: On Thu, 4 Apr 2024 16:47:44 GMT, Magnus Ihse Bursie wrote: > Build libsleef using their cmake system and look at the compile command line. (You do this by `VERBOSE=1 cmake` IIRC). Then you can see what flags they are using. This is what I was referring to as "normal libsleef build". I noticed there were a lot of compiler flags. I can't say if they are needed or not. In most cases, if it compiles, it's fine, but in this case, I guess some flags can be crucial to really get the kind of performance you need, and it might not be easy to spot that something is wrong if you get them incorrect. I assume one way to make sure is to run microbenchmarks with an externally built libsleef and compare it with the one you build within the JDK.
If there is no noticeable difference, then I guess it is fine. Thanks for the clarification and good suggestion. I will verify it and update here later. Just right now I have some trouble to get an aarch64 linux, I tried to get a graviton instance on AWS, but I failed to connect it when I create it. Previously I run all the test for correctness via qemu, but seems qemu is not for performance test. So I will update later when I get the environment ready. If someone got the easy environment to verify the performance, it's very welcome. :) ------------- PR Comment: https://git.openjdk.org/jdk/pull/18605#issuecomment-2039645463 From mli at openjdk.org Fri Apr 5 12:17:18 2024 From: mli at openjdk.org (Hamlin Li) Date: Fri, 5 Apr 2024 12:17:18 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v2] In-Reply-To: References: <0cUurmXlMJ_B66Wy1umd2n4r9ve7_Q4WOU0ffMd8s5Y=.bbc93b65-382c-4139-aaec-cb835d94a06e@github.com> Message-ID: On Thu, 4 Apr 2024 21:55:54 GMT, Mikael Vidstedt wrote: >> Hamlin Li has updated the pull request incrementally with two additional commits since the last revision: >> >> - disable unused-function warnings; add log msg >> - minor > > make/modules/jdk.incubator.vector/Lib.gmk line 44: > >> 42: $(eval $(call SetupJdkLibrary, BUILD_LIBVECTORMATH, \ >> 43: NAME := vectormath, \ >> 44: CFLAGS := $(CFLAGS_JDKLIB) -Wno-error=unused-function, \ > > Should the unused-function be passed in using `DISABLE_WARNINGS_*` instead? Thanks! Good suggestion, it makes the output clean. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18605#discussion_r1553519958 From stefank at openjdk.org Fri Apr 5 12:18:09 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 5 Apr 2024 12:18:09 GMT Subject: RFR: 8329088: Stack chunk thawing races with concurrent GC stack iteration [v2] In-Reply-To: References: Message-ID: On Fri, 5 Apr 2024 11:28:15 GMT, Aleksey Shipilev wrote: >> Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: >> >> Nits > > src/hotspot/share/oops/oop.inline.hpp line 242: > >> 240: inline void oopDesc::int_field_put(int offset, jint value) { *field_addr(offset) = value; } >> 241: inline jint oopDesc::int_field_relaxed(int offset) const { return Atomic::load(field_addr(offset)); } >> 242: inline void oopDesc::int_field_put_relaxed(int offset, jint value) { Atomic::store(field_addr(offset), value); } > > I have a stylistic question/suggestion. These are basically Java heap accessors, shouldn't they go through `HeapAccess::{load,store}`? This would also match the style already used in this file. We don't use HeapAccess to access primitive values in objects. HeapAccess is only used when we access oops. We do use RawAccess in some of these functions though, but we do that because there's no support for MQ_SEQ_CST in the Atomic APIs. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18643#discussion_r1553521971 From stefank at openjdk.org Fri Apr 5 12:37:38 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 5 Apr 2024 12:37:38 GMT Subject: RFR: 8329629: GC interfaces should work directly against nmethod instead of CodeBlob Message-ID: The GCs scan and handle nmethods and ignore CodeBlobs of other kinds. I propose that we stop sending in CodeBlobs to the GCs and make sure to only give them nmethods. I removed `void CodeCache::blobs_do(CodeBlobClosure* f)` since there's no more usage of that function. Is this OK?
I also opted to skip calling the GC verification code from the iterator code: Universe::heap()->verify_nmethod((nmethod*)cb); IMHO, I think it is up to the GCs to decide if they want to perform extra nmethod verification. If someone wants to keep this verification in their favorite GC I can add calls to this function where we used to call CodeCache::blobs_do. I've only done limited testing and will run extensive testing concurrent with the review. ------------- Commit messages: - 8329629: GC interfaces should work directly against nmethod instead of CodeBlob Changes: https://git.openjdk.org/jdk/pull/18653/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18653&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8329629 Stats: 850 lines in 74 files changed: 238 ins; 318 del; 294 mod Patch: https://git.openjdk.org/jdk/pull/18653.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18653/head:pull/18653 PR: https://git.openjdk.org/jdk/pull/18653 From gli at openjdk.org Fri Apr 5 12:49:10 2024 From: gli at openjdk.org (Guoxiong Li) Date: Fri, 5 Apr 2024 12:49:10 GMT Subject: RFR: 8329603: G1: Merge G1BlockOffsetTablePart into G1BlockOffsetTable [v3] In-Reply-To: References: Message-ID: <08fBT8QlnekJwy3BkQWMGw4p2BXlNOdI4moWsg0AwNI=.ac3e939c-1e8f-494c-b5d9-293c4f32a95c@github.com> On Fri, 5 Apr 2024 10:11:18 GMT, Albert Mingkun Yang wrote: >> Guoxiong Li has updated the pull request incrementally with two additional commits since the last revision: >> >> - Remove unnecessary comments. >> - Fix indentation issue. > > Marked as reviewed by ayang (Reviewer). @albertnetymk Thanks for your review. Waiting for another review.
------------- PR Comment: https://git.openjdk.org/jdk/pull/18634#issuecomment-2039718360 From pchilanomate at openjdk.org Fri Apr 5 13:37:10 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Fri, 5 Apr 2024 13:37:10 GMT Subject: RFR: 8329665: fatal error: memory leak: allocating without ResourceMark In-Reply-To: References: Message-ID: <3C61cOAuQ9fo18HiRk_zP4F7-SCdZaZ7V_LCmesrZgg=.50e7b11b-e114-4847-9305-ed1a62187f34@github.com> On Fri, 5 Apr 2024 07:10:18 GMT, Aleksey Shipilev wrote: > TBH, seems rather odd to do this in debug mode only, and this far out. I see other places where we do `ResourceMark rm` near `OopMapCache::compute_one_oop_map`. Should we instead do: > > ``` > // process locals & expression stack > ResourceMark rm; > InterpreterOopMap mask; > if (query_oop_map_cache) { > m->mask_for(bci, &mask); > } else { > OopMapCache::compute_one_oop_map(m, bci, &mask); > } > mask.iterate_oop(&blk); > ``` > > ? > That was the other option, and to remove the manual freeing in ~InterpreterOopMap(). But it would add more overhead to this call and I didn't think it was worth it. But if we want to go that route I could run some set of benchmarks to make sure it doesn't change anything. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18632#issuecomment-2039820666 From coleenp at openjdk.org Fri Apr 5 13:43:11 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 5 Apr 2024 13:43:11 GMT Subject: RFR: 8329750: Change Universe functions to return more specific Klass* types In-Reply-To: References: Message-ID: <0Cc1rLl044hyL-b1Lw4DjkHKrmf6d4ZXBADoR61PXyg=.e54d972f-b7d0-45d7-8957-3fcc3dc199ad@github.com> On Fri, 5 Apr 2024 11:56:11 GMT, Stefan Karlsson wrote: > We have various functions in Universe that returns Klass* where they could be returning TypeArrayKlass* and ObjArrayKlass* instead. If we change these functions we could get rid of some casts in the code. Does this seem like a reasonable change? Yes, this looks really good! 
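Editorial sketch, not part of the original thread: the idea behind 8329750 (returning the more specific array klass type so callers can drop casts) can be illustrated with a small stand-alone example. All names below (`Klass`, `TypeArrayKlass`, `g_int_array_klass`, and the four functions) are simplified, hypothetical stand-ins and do not reproduce the actual `Universe` accessors or HotSpot class definitions.

```cpp
#include <cassert>

// Hypothetical, heavily simplified class hierarchy for illustration only.
struct Klass {
  virtual ~Klass() = default;
};

struct TypeArrayKlass : Klass {
  explicit TypeArrayKlass(int bytes) : element_bytes(bytes) {}
  int element_bytes;  // size of one array element in bytes
};

static TypeArrayKlass g_int_array_klass(4);

// Before: the accessor returns the base type, so callers must cast.
Klass* int_array_klass_generic() { return &g_int_array_klass; }

// After: the accessor returns the specific type it always held anyway.
TypeArrayKlass* int_array_klass_specific() { return &g_int_array_klass; }

int element_bytes_with_cast() {
  // The cast is only needed because the return type is too general.
  return static_cast<TypeArrayKlass*>(int_array_klass_generic())->element_bytes;
}

int element_bytes_without_cast() {
  // No cast: the compiler already knows the precise type.
  return int_array_klass_specific()->element_bytes;
}
```

Both helpers return the same value; the second simply lets the type system carry information that the first version recovered with a cast.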
------------- Marked as reviewed by coleenp (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18652#pullrequestreview-1983216200 From jsjolen at openjdk.org Fri Apr 5 13:47:18 2024 From: jsjolen at openjdk.org (Johan Sjölen) Date: Fri, 5 Apr 2024 13:47:18 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v18] In-Reply-To: References: Message-ID: > Hi, > > This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix. > > ## `MemoryFileTracker` > > The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: > > ```c++ > static MemoryFile* make_device(const char* descriptive_name); > static void free_device(MemoryFile* device); > > static void allocate_memory(MemoryFile* device, size_t offset, size_t size, > MEMFLAGS flag, const NativeCallStack& stack); > static void free_memory(MemoryFile* device, size_t offset, size_t size); > > > It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: > > ```c++ > void ZNMT::reserve(zaddress_unsafe start, size_t size) { > MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); > } > void ZNMT::commit(zoffset offset, size_t size) { > MemTracker::allocate_memory_in(ZNMT::_device, static_cast(offset), size, mtJavaHeap, CALLER_PC); > } > void ZNMT::uncommit(zoffset offset, size_t size) { > MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); > } > > void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { > // NMT doesn't track mappings at the moment.
> } > void ZNMT::unmap(zaddress_unsafe addr, size_t size) { > // NMT doesn't track mappings at the moment. > } > > > As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. > > This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: > > 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this... Johan Sjölen has updated the pull request incrementally with three additional commits since the last revision: - File not device - Rename to initialize - Fixes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18289/files - new: https://git.openjdk.org/jdk/pull/18289/files/56de7bb9..414d0f17 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=17 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=16-17 Stats: 6 lines in 6 files changed: 0 ins; 0 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/18289.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289 PR: https://git.openjdk.org/jdk/pull/18289 From jsjolen at openjdk.org Fri Apr 5 13:51:15 2024 From: jsjolen at openjdk.org (Johan Sjölen) Date: Fri, 5 Apr 2024 13:51:15 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v19] In-Reply-To: References: Message-ID: > Hi, > > This PR introduces a new abstraction to NMT, named `MemoryFileTracker`.
Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix. > > ## `MemoryFileTracker` > > The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: > > ```c++ > static MemoryFile* make_device(const char* descriptive_name); > static void free_device(MemoryFile* device); > > static void allocate_memory(MemoryFile* device, size_t offset, size_t size, > MEMFLAGS flag, const NativeCallStack& stack); > static void free_memory(MemoryFile* device, size_t offset, size_t size); > > > It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: > > ```c++ > void ZNMT::reserve(zaddress_unsafe start, size_t size) { > MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); > } > void ZNMT::commit(zoffset offset, size_t size) { > MemTracker::allocate_memory_in(ZNMT::_device, static_cast(offset), size, mtJavaHeap, CALLER_PC); > } > void ZNMT::uncommit(zoffset offset, size_t size) { > MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); > } > > void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { > // NMT doesn't track mappings at the moment. > } > void ZNMT::unmap(zaddress_unsafe addr, size_t size) { > // NMT doesn't track mappings at the moment. > } > > > As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. 
> > This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: > > 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this... Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: Remove qualifier ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18289/files - new: https://git.openjdk.org/jdk/pull/18289/files/414d0f17..e87cc076 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=18 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=17-18 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18289.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289 PR: https://git.openjdk.org/jdk/pull/18289 From jsjolen at openjdk.org Fri Apr 5 13:57:17 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Fri, 5 Apr 2024 13:57:17 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v20] In-Reply-To: References: Message-ID: > Hi, > > This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix. 
> > ## `MemoryFileTracker` > > The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: > > ```c++ > static MemoryFile* make_device(const char* descriptive_name); > static void free_device(MemoryFile* device); > > static void allocate_memory(MemoryFile* device, size_t offset, size_t size, > MEMFLAGS flag, const NativeCallStack& stack); > static void free_memory(MemoryFile* device, size_t offset, size_t size); > > > It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: > > ```c++ > void ZNMT::reserve(zaddress_unsafe start, size_t size) { > MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); > } > void ZNMT::commit(zoffset offset, size_t size) { > MemTracker::allocate_memory_in(ZNMT::_device, static_cast(offset), size, mtJavaHeap, CALLER_PC); > } > void ZNMT::uncommit(zoffset offset, size_t size) { > MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); > } > > void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { > // NMT doesn't track mappings at the moment. > } > void ZNMT::unmap(zaddress_unsafe addr, size_t size) { > // NMT doesn't track mappings at the moment. > } > > > As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. > > This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: > > 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. 
Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this... Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: Clean constness from val() and VTreap usage ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18289/files - new: https://git.openjdk.org/jdk/pull/18289/files/e87cc076..7ac1ae1e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=19 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=18-19 Stats: 3 lines in 2 files changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/18289.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289 PR: https://git.openjdk.org/jdk/pull/18289 From jsjolen at openjdk.org Fri Apr 5 14:03:24 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Fri, 5 Apr 2024 14:03:24 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v21] In-Reply-To: References: Message-ID: > Hi, > > This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix. 
> > ## `MemoryFileTracker` > > The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: > > ```c++ > static MemoryFile* make_device(const char* descriptive_name); > static void free_device(MemoryFile* device); > > static void allocate_memory(MemoryFile* device, size_t offset, size_t size, > MEMFLAGS flag, const NativeCallStack& stack); > static void free_memory(MemoryFile* device, size_t offset, size_t size); > > > It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: > > ```c++ > void ZNMT::reserve(zaddress_unsafe start, size_t size) { > MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); > } > void ZNMT::commit(zoffset offset, size_t size) { > MemTracker::allocate_memory_in(ZNMT::_device, static_cast(offset), size, mtJavaHeap, CALLER_PC); > } > void ZNMT::uncommit(zoffset offset, size_t size) { > MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); > } > > void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { > // NMT doesn't track mappings at the moment. > } > void ZNMT::unmap(zaddress_unsafe addr, size_t size) { > // NMT doesn't track mappings at the moment. > } > > > As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. > > This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: > > 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. 
Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this... Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: Use CMP ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18289/files - new: https://git.openjdk.org/jdk/pull/18289/files/7ac1ae1e..fb4b7d68 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=20 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=19-20 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/18289.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289 PR: https://git.openjdk.org/jdk/pull/18289 From coleenp at openjdk.org Fri Apr 5 14:12:17 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 5 Apr 2024 14:12:17 GMT Subject: RFR: 8329488: Move OopStorage code from safepoint cleanup and remove safepoint cleanup code Message-ID: This patch gives the ServiceThread a periodic wakeup (same as GuaranteedSafepointInterval) to check if it needs to clean out OopStorage blocks, and move the triggering of this cleaning out of the safepoint cleanup tasks. Since ICBuffer, StringTable and SymbolTable rehashing have moved, there's nothing that actually triggers the nop safepoint to do cleaning (except SafepointALot), so the OopStorage cleanup won't be triggered. With moving all of these out of the safepoint cleanup tasks, we can remove the code that sets up multiple threads to do safepoint cleanup. We can also remove the JFR events and logging that times safepoint cleanup, and a logging test. Tested with tier1-4. ------------- Commit messages: - Add some timing to control oopstorage cleanup. - Remove SafepointCleanup tasks, events and logging. - Move OopStorage work to periodic service thread intervals. 
Changes: https://git.openjdk.org/jdk/pull/18375/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18375&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8329488 Stats: 287 lines in 14 files changed: 10 ins; 234 del; 43 mod Patch: https://git.openjdk.org/jdk/pull/18375.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18375/head:pull/18375 PR: https://git.openjdk.org/jdk/pull/18375 From jsjolen at openjdk.org Fri Apr 5 14:32:18 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Fri, 5 Apr 2024 14:32:18 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v22] In-Reply-To: References: Message-ID: <7OWRVF0Sc00dvXm61EzjIzVYRMsAaKUL9AT4wtLn0uw=.e5857bee-b212-4570-b49f-303aa2902ca2@github.com> > Hi, > > This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix. 
> > ## `MemoryFileTracker` > > The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: > > ```c++ > static MemoryFile* make_device(const char* descriptive_name); > static void free_device(MemoryFile* device); > > static void allocate_memory(MemoryFile* device, size_t offset, size_t size, > MEMFLAGS flag, const NativeCallStack& stack); > static void free_memory(MemoryFile* device, size_t offset, size_t size); > > > It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: > > ```c++ > void ZNMT::reserve(zaddress_unsafe start, size_t size) { > MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); > } > void ZNMT::commit(zoffset offset, size_t size) { > MemTracker::allocate_memory_in(ZNMT::_device, static_cast(offset), size, mtJavaHeap, CALLER_PC); > } > void ZNMT::uncommit(zoffset offset, size_t size) { > MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); > } > > void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { > // NMT doesn't track mappings at the moment. > } > void ZNMT::unmap(zaddress_unsafe addr, size_t size) { > // NMT doesn't track mappings at the moment. > } > > > As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. > > This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: > > 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. 
Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this... Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: Fix visit ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18289/files - new: https://git.openjdk.org/jdk/pull/18289/files/fb4b7d68..294320cf Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=21 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=20-21 Stats: 8 lines in 1 file changed: 0 ins; 2 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/18289.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289 PR: https://git.openjdk.org/jdk/pull/18289 From jsjolen at openjdk.org Fri Apr 5 15:13:38 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Fri, 5 Apr 2024 15:13:38 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v23] In-Reply-To: References: Message-ID: > Hi, > > This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix. 
> > ## `MemoryFileTracker` > > The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: > > ```c++ > static MemoryFile* make_device(const char* descriptive_name); > static void free_device(MemoryFile* device); > > static void allocate_memory(MemoryFile* device, size_t offset, size_t size, > MEMFLAGS flag, const NativeCallStack& stack); > static void free_memory(MemoryFile* device, size_t offset, size_t size); > > > It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: > > ```c++ > void ZNMT::reserve(zaddress_unsafe start, size_t size) { > MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); > } > void ZNMT::commit(zoffset offset, size_t size) { > MemTracker::allocate_memory_in(ZNMT::_device, static_cast(offset), size, mtJavaHeap, CALLER_PC); > } > void ZNMT::uncommit(zoffset offset, size_t size) { > MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); > } > > void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { > // NMT doesn't track mappings at the moment. > } > void ZNMT::unmap(zaddress_unsafe addr, size_t size) { > // NMT doesn't track mappings at the moment. > } > > > As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. > > This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: > > 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. 
Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this... Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: Messed up the visit ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18289/files - new: https://git.openjdk.org/jdk/pull/18289/files/294320cf..1a3b8a22 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=22 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=21-22 Stats: 7 lines in 1 file changed: 6 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18289.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289 PR: https://git.openjdk.org/jdk/pull/18289 From jsjolen at openjdk.org Fri Apr 5 15:26:13 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Fri, 5 Apr 2024 15:26:13 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v23] In-Reply-To: References: Message-ID: On Fri, 5 Apr 2024 15:13:38 GMT, Johan Sj?len wrote: >> Hi, >> >> This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix. 
>> >> ## `MemoryFileTracker` >> >> The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: >> >> ```c++ >> static MemoryFile* make_device(const char* descriptive_name); >> static void free_device(MemoryFile* device); >> >> static void allocate_memory(MemoryFile* device, size_t offset, size_t size, >> MEMFLAGS flag, const NativeCallStack& stack); >> static void free_memory(MemoryFile* device, size_t offset, size_t size); >> >> >> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: >> >> ```c++ >> void ZNMT::reserve(zaddress_unsafe start, size_t size) { >> MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); >> } >> void ZNMT::commit(zoffset offset, size_t size) { >> MemTracker::allocate_memory_in(ZNMT::_device, static_cast(offset), size, mtJavaHeap, CALLER_PC); >> } >> void ZNMT::uncommit(zoffset offset, size_t size) { >> MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); >> } >> >> void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { >> // NMT doesn't track mappings at the moment. >> } >> void ZNMT::unmap(zaddress_unsafe addr, size_t size) { >> // NMT doesn't track mappings at the moment. >> } >> >> >> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. >> >> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: >> >> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. 
Our Treap-based approach in this patch gives a performance bo... > > Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: > > Messed up the visit Right, the refactoring to remove the `friend` declaration has completely fumbled the code. I'll probably force a revert on this to the state before that or do a git bisect to find the bugs. Right now the code is basically borked. Last good hash: 7445999ee296872320f91146e1004026ba1133c7 ------------- PR Comment: https://git.openjdk.org/jdk/pull/18289#issuecomment-2040080364 From shade at openjdk.org Fri Apr 5 15:27:08 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 5 Apr 2024 15:27:08 GMT Subject: RFR: 8329665: fatal error: memory leak: allocating without ResourceMark In-Reply-To: <3C61cOAuQ9fo18HiRk_zP4F7-SCdZaZ7V_LCmesrZgg=.50e7b11b-e114-4847-9305-ed1a62187f34@github.com> References: <3C61cOAuQ9fo18HiRk_zP4F7-SCdZaZ7V_LCmesrZgg=.50e7b11b-e114-4847-9305-ed1a62187f34@github.com> Message-ID: On Fri, 5 Apr 2024 13:34:07 GMT, Patricio Chilano Mateo wrote: > That was the other option, and to remove the manual freeing in ~InterpreterOopMap(). But it would add more overhead to this call and I didn't think it was worth it. But if we want to go that route I could run some set of benchmarks to make sure it doesn't change anything. Yeah, it bothers me adding RMs for debug paths only: it would silently work in debug builds if there are new resource allocations in downstream code, and release builds would silently break. Placing debug-only RM at the beginning of large method only makes it worse. The good reason for debug-only-ing RMs is when we need the assert/logs, but even that RM scope is usually very clear, and ends after the relevant debugging/assert thing is done. Here, not so much. 
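The concern about debug-only marks can be made concrete with a small, self-contained sketch. Everything here is hypothetical: `ToyArena` and `ToyMark` are stand-ins invented for illustration and are not HotSpot's `ResourceArea` or `ResourceMark`. The sketch only assumes that a mark is an RAII object that rolls an arena back to a saved watermark, and shows how the same callee behaves with and without one in scope.

```c++
#include <cassert>
#include <cstddef>

// Hypothetical bump allocator standing in for a per-thread resource area.
struct ToyArena {
  std::size_t used = 0;
  // Hands out an offset and advances the watermark; nothing is freed individually.
  std::size_t allocate(std::size_t bytes) {
    std::size_t at = used;
    used += bytes;
    return at;
  }
};

// Hypothetical RAII mark: saves the watermark and restores it on scope exit.
class ToyMark {
  ToyArena& _arena;
  std::size_t _saved;
 public:
  explicit ToyMark(ToyArena& arena) : _arena(arena), _saved(arena.used) {}
  ~ToyMark() { _arena.used = _saved; }
  ToyMark(const ToyMark&) = delete;
  ToyMark& operator=(const ToyMark&) = delete;
};

// Simulates a frame walk whose callee makes one temporary allocation per frame.
// mark_in_scope == true models a build where the mark is compiled in;
// mark_in_scope == false models a build where it is compiled out.
// Returns how many bytes remain allocated after the walk.
std::size_t leaked_after_walk(ToyArena& arena, int frames, bool mark_in_scope) {
  const std::size_t before = arena.used;
  for (int i = 0; i < frames; i++) {
    if (mark_in_scope) {
      ToyMark mark(arena);
      arena.allocate(64);  // released when 'mark' goes out of scope
    } else {
      arena.allocate(64);  // stays allocated until some outer mark unwinds
    }
  }
  return arena.used - before;
}
```

With the mark present the walk leaves the arena where it started; without it the per-frame allocations accumulate, which is the kind of silent divergence between build flavors described above.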
------------- PR Comment: https://git.openjdk.org/jdk/pull/18632#issuecomment-2040080092
From eosterlund at openjdk.org Fri Apr 5 16:57:09 2024 From: eosterlund at openjdk.org (Erik Österlund) Date: Fri, 5 Apr 2024 16:57:09 GMT Subject: RFR: 8329088: Stack chunk thawing races with concurrent GC stack iteration [v2] In-Reply-To: References: Message-ID: <5vLKCX39nHHzlMtvuGAmGMhPLU4SLObNcfMdiEBy3sE=.fff1c12b-2af6-46d6-a078-135c64e6bb48@github.com> On Fri, 5 Apr 2024 11:40:30 GMT, Aleksey Shipilev wrote: > This looks reasonable, I have a question, though: Thanks for the review! ------------- PR Comment: https://git.openjdk.org/jdk/pull/18643#issuecomment-2040258939
From eosterlund at openjdk.org Fri Apr 5 16:57:10 2024 From: eosterlund at openjdk.org (Erik Österlund) Date: Fri, 5 Apr 2024 16:57:10 GMT Subject: RFR: 8329088: Stack chunk thawing races with concurrent GC stack iteration [v2] In-Reply-To: References: Message-ID: On Fri, 5 Apr 2024 12:15:01 GMT, Stefan Karlsson wrote: >> src/hotspot/share/oops/oop.inline.hpp line 242: >> >>> 240: inline void oopDesc::int_field_put(int offset, jint value) { *field_addr<jint>(offset) = value; } >>> 241: inline jint oopDesc::int_field_relaxed(int offset) const { return Atomic::load(field_addr<jint>(offset)); } >>> 242: inline void oopDesc::int_field_put_relaxed(int offset, jint value) { Atomic::store(field_addr<jint>(offset), value); } >> >> I have a stylistic question/suggestion. These are basically Java heap accessors, shouldn't they go through `HeapAccess::{load,store}`? This would also match the style already used in this file. > > We don't use HeapAccess to access primitive values in objects. HeapAccess is only used when we access oops. > > We do use RawAccess in some of these functions though, but we do that because there's no support for MO_SEQ_CST in the Atomic APIs. Ideally the SEQ_CST support would live in Atomic where it arguably belongs.
Then we could remove all primitive support from the Access API and have a clear distinction that it's an oop thing. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18643#discussion_r1554000301
From duke at openjdk.org Fri Apr 5 17:47:03 2024 From: duke at openjdk.org (Volodymyr Paprotski) Date: Fri, 5 Apr 2024 17:47:03 GMT Subject: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v2] In-Reply-To: References: Message-ID: On Tue, 2 Apr 2024 19:19:59 GMT, Volodymyr Paprotski wrote:
>> Performance. Before:
>>
>> Benchmark                    (algorithm)      (dataSize)  (keyLength)  (provider)  Mode   Cnt  Score       Error    Units
>> SignatureBench.ECDSA.sign    SHA256withECDSA  1024        256                      thrpt  3    6443.934 ±   6.491  ops/s
>> SignatureBench.ECDSA.sign    SHA256withECDSA  16384       256                      thrpt  3    6152.979 ±   4.954  ops/s
>> SignatureBench.ECDSA.verify  SHA256withECDSA  1024        256                      thrpt  3    1895.410 ±  36.979  ops/s
>> SignatureBench.ECDSA.verify  SHA256withECDSA  16384       256                      thrpt  3    1878.955 ±  45.487  ops/s
>> Benchmark                                            (algorithm)  (keyLength)  (kpgAlgorithm)  (provider)  Mode   Cnt  Score       Error    Units
>> o.o.b.j.c.full.KeyAgreementBench.EC.generateSecret   ECDH         256          EC                          thrpt  3    1357.810 ±  26.584  ops/s
>> o.o.b.j.c.small.KeyAgreementBench.EC.generateSecret  ECDH         256          EC                          thrpt  3    1352.119 ±  23.547  ops/s
>> Benchmark                          (isMontBench)  Mode   Cnt  Score       Error    Units
>> PolynomialP256Bench.benchMultiply  false          thrpt  3    1746.126 ±  10.970  ops/s
>>
>> Performance, no intrinsic:
>>
>> Benchmark                    (algorithm)      (dataSize)  (keyLength)  (provider)  Mode   Cnt  Score       Error    Units
>> SignatureBench.ECDSA.sign    SHA256withECDSA  1024        256                      thrpt  3    6529.839 ±  42.420  ops/s
>> SignatureBench.ECDSA.sign    SHA256withECDSA  16384       256                      thrpt  3    6199.747 ± 133.566  ops/s
>> SignatureBench.ECDSA.verify  SHA256withECDSA  1024        256                      thrpt  3    1973.676 ±  54.071  ops/s
>> SignatureBench.ECDSA.verify  SHA256withECDSA  16384       256                      thrpt  3    1932.127 ±  35.920  ops/s
>> Benchmark                                            (algorithm)  (keyLength)  (kpgAlgorithm)  (provider)  Mode   Cnt  Score       Error    Units
>> o.o.b.j.c.full.KeyAgreementBench.EC.generateSecret   ECDH         256          EC                          thrpt  3    1355.788 ±  29.858  ops/s
>> o.o.b.j.c.small.KeyAgreementBench.EC.generateSecret  ECDH         256          EC                          thrpt  3    1346.523 ±  28.722  ops/s
>> Benchmark                          (isMontBench)  Mode   Cnt  Score       Error    Units
>> PolynomialP256Bench.benchMultiply  true           thrpt  3    1919.57...
> > Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision: > > remove use of jdk.crypto.ec @ascarpino Hi Tony, this is the ECC P256 PR we talked about last year, would appreciate your feedback. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18583#issuecomment-2040325424
From coleenp at openjdk.org Fri Apr 5 20:38:59 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 5 Apr 2024 20:38:59 GMT Subject: RFR: 8329665: fatal error: memory leak: allocating without ResourceMark In-Reply-To: References: Message-ID: On Thu, 4 Apr 2024 16:23:50 GMT, Patricio Chilano Mateo wrote: > There are two places in Loom code that call f.oops_interpreted_do() to process oops in the stackChunk. Although not obvious this method seem to require to have a ResourceMark on scope and there are several contexts where these two are call where we don't have one. The reason why a ResourceMark is needed is because OopMapCache::compute_one_oop_map() might allocate from the resource area if _mask_size is > 4 * BitsPerWord, which depends on the amount of locals + expression stack of the corresponding method. But ~InterpreterOopMap already checks if the _bit_mask was allocated in the resource area and in that case it will free it. So the ResourceMark is not strictly needed except that in debug mode we will actually hit the assert if there is not one in scope when trying to allocate the _bit_mask.
> > Thanks, > Patricio src/hotspot/share/runtime/frame.cpp line 888: > 886: assert(is_interpreted_frame(), "Not an interpreted frame"); > 887: Thread *thread = Thread::current(); > 888: DEBUG_ONLY(ResourceMark rm(thread);) // ~InterpreterOopMap already handles possible deallocation of bitmask I don't like that this is in debug mode only either. We have the current thread so that's a part of the cost of ResourceMark (or historically been the cost). I wonder if this "optimization" can be observed. I'd rather the explicit resource allocation deletion be removed also, since this is surprising. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18632#discussion_r1554272912 From sgibbons at openjdk.org Fri Apr 5 21:53:50 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Fri, 5 Apr 2024 21:53:50 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v5] In-Reply-To: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: > This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. > > Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. > > Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). 
> > [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Fixed generate_fill when count > 0x80000000 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18555/files - new: https://git.openjdk.org/jdk/pull/18555/files/8bed1561..b025318f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=03-04 Stats: 13 lines in 2 files changed: 0 ins; 0 del; 13 mod Patch: https://git.openjdk.org/jdk/pull/18555.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18555/head:pull/18555 PR: https://git.openjdk.org/jdk/pull/18555 From dlong at openjdk.org Fri Apr 5 22:02:19 2024 From: dlong at openjdk.org (Dean Long) Date: Fri, 5 Apr 2024 22:02:19 GMT Subject: RFR: 8329665: fatal error: memory leak: allocating without ResourceMark In-Reply-To: References: Message-ID: On Fri, 5 Apr 2024 20:36:47 GMT, Coleen Phillimore wrote: >> There are two places in Loom code that call f.oops_interpreted_do() to process oops in the stackChunk. Although not obvious this method seem to require to have a ResourceMark on scope and there are several contexts where these two are call where we don't have one. The reason why a ResourceMark is needed is because OopMapCache::compute_one_oop_map() might allocate from the resource area if _mask_size is > 4 * BitsPerWord, which depends on the amount of locals + expression stack of the corresponding method. But ~InterpreterOopMap already checks if the _bit_mask was allocated in the resource area and in that case it will free it. So the ResourceMark is not strictly needed except that in debug mode we will actually hit the assert if there is not one in scope when trying to allocate the _bit_mask. 
>> >> Thanks, >> Patricio > > src/hotspot/share/runtime/frame.cpp line 888: > >> 886: assert(is_interpreted_frame(), "Not an interpreted frame"); >> 887: Thread *thread = Thread::current(); >> 888: DEBUG_ONLY(ResourceMark rm(thread);) // ~InterpreterOopMap already handles possible deallocation of bitmask > > I don't like that this is in debug mode only either. We have the current thread so that's a part of the cost of ResourceMark (or historically been the cost). I wonder if this "optimization" can be observed. I'd rather the explicit resource allocation deletion be removed also, since this is surprising. I'm guessing the explicit resource array deletion was an attempt to save memory if the ResourceMark was outside a loop iterating frames. But if the ResourceMark is inside the loop, it seems pointless. I think uses of other frame iterators like vframes have the ResourceMark on the outside, and that apparently hasn't caused memory footprint issues. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18632#discussion_r1554348880 From sgibbons at openjdk.org Fri Apr 5 22:07:23 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Fri, 5 Apr 2024 22:07:23 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v5] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: On Fri, 5 Apr 2024 21:53:50 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. >> >> Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). 
>> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Fixed generate_fill when count > 0x80000000 Thanks for all your thoughtful feedback. I would very much like to take "the right approach"(tm) but I don't have the skill to write IR, especially given that this is an Unsafe block, which is restricted by atomicity and alignment. I would not know how to prevent the C2 optimizer from vectorizing, or indeed replacing my code with a call to memset(). I'm not sure it would go this far, but in order to remain compliant with the spec I have to prevent it in the future. This was modeled after the existing implementation of copyMemory, gives good performance (3-5x), and can serve as a template for other platform developers to follow. They have the expertise for their specific platform(s) which I do not have. Again, thank you. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18555#issuecomment-2040688225 From dlong at openjdk.org Fri Apr 5 22:13:11 2024 From: dlong at openjdk.org (Dean Long) Date: Fri, 5 Apr 2024 22:13:11 GMT Subject: RFR: 8329750: Change Universe functions to return more specific Klass* types In-Reply-To: References: Message-ID: On Fri, 5 Apr 2024 11:56:11 GMT, Stefan Karlsson wrote: > We have various functions in Universe that returns Klass* where they could be returning TypeArrayKlass* and ObjArrayKlass* instead. If we change these functions we could get rid of some casts in the code. Does this seem like a reasonable change? 
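As a rough illustration of the proposal quoted above, the sketch below contrasts an accessor that returns a base-class pointer (forcing a cast at the call site) with one that returns the concrete type. All names ending in `Sketch`, and the `dimension()` method, are hypothetical stand-ins for illustration and are not the HotSpot classes or the code in the PR.

```cpp
// Illustration-only types; these are not the HotSpot classes.
struct KlassSketch {
  virtual ~KlassSketch() = default;
  virtual int dimension() const { return 0; }
};

struct TypeArrayKlassSketch final : KlassSketch {
  int dimension() const override { return 1; }
};

static TypeArrayKlassSketch int_array_klass_instance;

// Before: the accessor returns the base type, so a caller that needs the
// concrete type has to cast.
KlassSketch* int_array_klass_generic() { return &int_array_klass_instance; }

// After: the accessor returns the concrete type and no cast is needed.
TypeArrayKlassSketch* int_array_klass_specific() { return &int_array_klass_instance; }

int dimension_with_cast() {
  auto* tak = static_cast<TypeArrayKlassSketch*>(int_array_klass_generic());
  return tak->dimension();
}

int dimension_without_cast() {
  return int_array_klass_specific()->dimension();
}
```

Because the sketch class is `final`, a call through the concrete pointer type is also one a compiler may be able to resolve statically; whether that matters in practice depends on the real call sites.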
src/hotspot/share/classfile/systemDictionary.cpp line 370: > 368: } > 369: } else { > 370: k = Universe::typeArrayKlass(t); Suggestion: TypeArrayKlass* tak = Universe::typeArrayKlass(t); k = tak->array_klass(ndims, CHECK_NULL); src/hotspot/share/classfile/systemDictionary.cpp line 371: > 369: } else { > 370: k = Universe::typeArrayKlass(t); > 371: k = k->array_klass(ndims, CHECK_NULL); I assume the cast was an attempt to de-virtualize the array_klass() call, so it is better not to use Klass* here. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18652#discussion_r1554362911 PR Review Comment: https://git.openjdk.org/jdk/pull/18652#discussion_r1554363063 From dlong at openjdk.org Fri Apr 5 23:45:59 2024 From: dlong at openjdk.org (Dean Long) Date: Fri, 5 Apr 2024 23:45:59 GMT Subject: RFR: 8325469: Freeze/Thaw code can crash in the presence of OSR frames In-Reply-To: References: Message-ID: On Thu, 4 Apr 2024 19:52:18 GMT, Patricio Chilano Mateo wrote: > Freeze/thaw code assumes that a compiled frame for a method where num_stack_arg_slots() > 0 will always have the arguments set up above the metadata at the bottom of the frame. But when converting an interpreter frame to a compiled frame during OSR we don't explicitly leave room for the stack arguments after popping the interpreter frame. All parameters needed will be read from the "buf" array and stored inside the frame before calling OSR_migration_end(). > > This mismatch in how the stack looks and what we assume can lead to different crashes. In particular the issue happens when the OSR conversion happens for the bottom-most frame in the stack. If the OSR frame has a caller in the stack then there is no issue on freezing/thawing. I added more details about this in the bug comments. > > When the OSR conversion happens for the bottom-most frame then a future freeze/thaw can lead to crashes for all cases: freeze_fast/thaw_fast, freeze_fast/thaw_slow, freeze_slow/thaw_slow.
When freezing fast, either thawing fast or slow can lead to trying to read past the bottom of the stackChunk or writing below the allocated space in the stack. The freeze slow case is almost okay, except that it uncovered an invalid assert that is triggered if the size of the OSR frame plus all the other frames we freeze takes less space than the size of locals minus parameters of the interpreter frame that was OSR. I also added more details about these in the bug comments. > > I tested different fixes, but I think the most straightforward one is to add _num_stack_arg_slots in the nmethod class and initialize it accordingly depending on whether the nmethod is an OSR one or not. > > The patch includes a new test that exercises all these possible combinations of OSR frame at bottom of stack or not, and then freezing fast/slow and thawing fast/slow. The bottom case where we freeze fast and thaw slow reproduces the originally reported crash. There are actually two different failure modes depending on whether this is a thaw top or return barrier case. The other bottom cases lead to the other crashes described in the bug comments. > The new test uncovers another bug besides the OSR issues, but since it's a different one I filed a separate JBS issue (JDK-8329665) and I made this a dependent PR. > > I tested the current patch with the new test and also ran it through mach5 tiers1-6. > > Thanks, > Patricio This looks good, but have you considered computing the value every time instead of caching it in _num_stack_arg_slots and increasing the size of every nmethod?
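To make the trade-off in that question concrete, here is a minimal sketch of the "compute it every time" alternative. `MethodSketch` and `NMethodSketch` are hypothetical stand-ins, not HotSpot classes, and the rule that an OSR nmethod reports zero stack-argument slots is an assumption drawn from the description above (the OSR frame does not leave room for stack arguments), not something confirmed by the patch.

```cpp
// Hypothetical sketch; MethodSketch and NMethodSketch are illustration-only
// stand-ins, not HotSpot classes.
struct MethodSketch {
  int num_stack_arg_slots;  // slots the calling convention passes on the stack
};

struct NMethodSketch {
  const MethodSketch* method;
  bool is_osr;

  // Computed on each call, so no extra field is stored per nmethod.
  // Assumption: an OSR frame reserves no stack-argument area, so it reports 0.
  int num_stack_arg_slots() const {
    return is_osr ? 0 : method->num_stack_arg_slots;
  }
};
```

The cached-field variant would trade this small per-call branch and pointer dereference for a few extra bytes in every nmethod header.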
------------- PR Comment: https://git.openjdk.org/jdk/pull/18637#issuecomment-2040770449 From kvn at openjdk.org Fri Apr 5 23:53:28 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 5 Apr 2024 23:53:28 GMT Subject: RFR: 8329628: Additional changes after JDK-8329332 Message-ID: Additional clean up based on comments (mostly Stefan's) during reviews for [JDK-8329332: Remove CompiledMethod and CodeBlobLayout classes](https://bugs.openjdk.org/browse/JDK-8329332). - Renamed `CompiledMethod_lock` to `NMethod_lock`. (I decided to not change JVMTI's `CompiledMethod[Load|Unload]` names). - Renamed `NMethodIterator::all_blobs` to `NMethodIterator::all`. - Moved `get_deopt_original_pc()` method from `nmethod` to `frame` class. - Reverted `CodeCache::find_nmethod()` to previous functionality to allow return `nullptr` and be consistent with `find_blob()`. - Cleanup some `(nmethod*)` casts. - Use `for (CodeHeap* heap : *_nmethod_heaps) ` in `CodeCache::nmethod_count()` (it was @stefank suggestion, I don't know how this C++ magic works). I verified it running with `-XX:+PrintNMethodStatistics`. Testing tier1-3,xcomp,stress ------------- Commit messages: - Removed commented code. Fix indent. 
- 8329628: Additional changes after JDK-8329332 Changes: https://git.openjdk.org/jdk/pull/18665/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18665&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8329628 Stats: 103 lines in 34 files changed: 15 ins; 21 del; 67 mod Patch: https://git.openjdk.org/jdk/pull/18665.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18665/head:pull/18665 PR: https://git.openjdk.org/jdk/pull/18665 From sgibbons at openjdk.org Sat Apr 6 00:13:26 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Sat, 6 Apr 2024 00:13:26 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v6] In-Reply-To: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: <9BH6kkaQU5kSjlJUnNenUeWBK2EdahCuks8qEUjDlv0=.b8979589-32df-4fa3-b5a6-f56dad76c58d@github.com> > This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. > > Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. > > Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). 
> > [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Oops ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18555/files - new: https://git.openjdk.org/jdk/pull/18555/files/b025318f..fd6f04f7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=04-05 Stats: 18 lines in 2 files changed: 2 ins; 0 del; 16 mod Patch: https://git.openjdk.org/jdk/pull/18555.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18555/head:pull/18555 PR: https://git.openjdk.org/jdk/pull/18555 From fyang at openjdk.org Sat Apr 6 03:15:16 2024 From: fyang at openjdk.org (Fei Yang) Date: Sat, 6 Apr 2024 03:15:16 GMT Subject: RFR: 8329083: RISC-V: Update profiles supported on riscv [v2] In-Reply-To: <1zLJ4ekqbB9t_8o4SvCuEsHqpeF2oa0I9v1PCEs1bow=.33fe1276-eb19-4a41-a555-eef6369d4144@github.com> References: <1zLJ4ekqbB9t_8o4SvCuEsHqpeF2oa0I9v1PCEs1bow=.33fe1276-eb19-4a41-a555-eef6369d4144@github.com> Message-ID: On Thu, 4 Apr 2024 13:53:22 GMT, Hamlin Li wrote: >> Hi, >> Can you help to review this patch to update vm flags related to riscv profile? >> Thanks >> >> Currently there are vm options like -XX:+UseRVA20U64 and -XX:+UseRVA22U64 on riscv to indicate the supported riscv extension via profiles. >> These profiles should be updated to reflect the full supported extensions and new profile like UseRVA23U64 should be added too. > > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > remove optional extensions Updated change LGTM. Thanks. ------------- Marked as reviewed by fyang (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18599#pullrequestreview-1984431152 From mli at openjdk.org Sat Apr 6 06:26:11 2024 From: mli at openjdk.org (Hamlin Li) Date: Sat, 6 Apr 2024 06:26:11 GMT Subject: RFR: 8329083: RISC-V: Update profiles supported on riscv [v2] In-Reply-To: References: <1zLJ4ekqbB9t_8o4SvCuEsHqpeF2oa0I9v1PCEs1bow=.33fe1276-eb19-4a41-a555-eef6369d4144@github.com> Message-ID: On Sat, 6 Apr 2024 03:12:35 GMT, Fei Yang wrote: >> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: >> >> remove optional extensions > > Updated change LGTM. Thanks. Thanks @RealFYang for your quick response and review! ------------- PR Comment: https://git.openjdk.org/jdk/pull/18599#issuecomment-2040988763 From mli at openjdk.org Sat Apr 6 06:26:11 2024 From: mli at openjdk.org (Hamlin Li) Date: Sat, 6 Apr 2024 06:26:11 GMT Subject: Integrated: 8329083: RISC-V: Update profiles supported on riscv In-Reply-To: References: Message-ID: On Wed, 3 Apr 2024 10:25:31 GMT, Hamlin Li wrote: > Hi, > Can you help to review this patch to update vm flags related to riscv profile? > Thanks > > Currently there are vm options like -XX:+UseRVA20U64 and -XX:+UseRVA22U64 on riscv to indicate the supported riscv extension via profiles. > These profiles should be updated to reflect the full supported extensions and new profile like UseRVA23U64 should be added too. This pull request has now been integrated. 
Changeset: 49d8e638 Author: Hamlin Li URL: https://git.openjdk.org/jdk/commit/49d8e6383321dcf152f70998be60695cea7527eb Stats: 96 lines in 3 files changed: 60 ins; 31 del; 5 mod 8329083: RISC-V: Update profiles supported on riscv Reviewed-by: fyang ------------- PR: https://git.openjdk.org/jdk/pull/18599 From kbarrett at openjdk.org Sat Apr 6 13:32:09 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Sat, 6 Apr 2024 13:32:09 GMT Subject: RFR: 8329488: Move OopStorage code from safepoint cleanup and remove safepoint cleanup code In-Reply-To: References: Message-ID: On Tue, 19 Mar 2024 12:19:44 GMT, Coleen Phillimore wrote: > This patch gives the ServiceThread a periodic wakeup (same as GuaranteedSafepointInterval) to check if it needs to clean out OopStorage blocks, and move the triggering of this cleaning out of the safepoint cleanup tasks. Since ICBuffer, StringTable and SymbolTable rehashing have moved, there's nothing that actually triggers the nop safepoint to do cleaning (except SafepointALot), so the OopStorage cleanup won't be triggered. > > With moving all of these out of the safepoint cleanup tasks, we can remove the code that sets up multiple threads to do safepoint cleanup. We can also remove the JFR events and logging that times safepoint cleanup, and a logging test. > > Tested with tier1-4. Mostly looks good, with some remaining tidying up to do. src/hotspot/share/gc/shared/oopStorage.cpp line 895: > 893: > 894: // Time after which a notification can be made. > 895: static jlong cleanup_permit_time = 0; This mechanism no longer involves notification, so comment needs to be updated. Maybe "Time when ServiceThread is next permitted to do cleanup." src/hotspot/share/gc/shared/oopStorage.cpp line 897: > 895: static jlong cleanup_permit_time = 0; > 896: > 897: // Minimum time since last ServiceThread check before cleanup is permitted. Maybe "Minimum time between ServiceThread cleanups." 
src/hotspot/share/gc/shared/oopStorage.cpp line 904: > 902: assert_lock_strong(Service_lock); > 903: > 904: if (Atomic::load(&needs_cleanup_requested) && os::javaTimeNanos() > cleanup_permit_time) { Should be Atomic::load_acquire, matching release_store in record_needs_cleanup. src/hotspot/share/gc/shared/oopStorage.cpp line 920: > 918: void OopStorage::record_needs_cleanup() { > 919: // Set local flag first, else ServiceThread could wake up and miss > 920: // the request. This order may instead (rarely) unnecessarily notify. There's no longer any notification involved. However, there is still the (rare) possibility that the ServiceThread will uselessly run. It might have already been doing cleanup and processed the block just added. If no new cleanup work gets added before the next ServiceThread cleanup time, it will attempt cleanup (because of the flag(s) being set), and find nothing to do. That's okay. Or just delete the sentence about unnecessary notify. src/hotspot/share/gc/shared/oopStorage.cpp line 928: > 926: // Service thread might have oopstorage work, but not for this object. > 927: // Check for deferred updates even though that's not a ServiceThread > 928: // cleanup; since we're here, we might as well process them. That's not what's really going on here. Replace the comment with "But check for deferred updates, which might provide cleanup work." Also, in previous unchanged line, s/Service thread/ServiceThread/ src/hotspot/share/gc/shared/oopStorage.cpp line 988: > 986: // Exceeded work limit or can't delete last block. This will > 987: // cause the ServiceThread to loop, giving other subtasks an > 988: // opportunity to run too. There's no need for a notification, With the changes to `has_cleanup_work_and_reset` this no longer causes the ServiceThread to loop. Instead it requests cleanup at the next scheduled time for the ServiceThread to do so. And there's no longer ever any notification, so the final sentence needs some adjustment. 
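The pairing requested for line 904 above (an acquire load matching the release store in `record_needs_cleanup`), together with the permit-time gate, can be sketched in standard C++ roughly as follows. This is not the HotSpot code: it uses `std::atomic` instead of HotSpot's `Atomic` class, takes the current time as a parameter instead of calling `os::javaTimeNanos()`, omits the `Service_lock` assertion, and uses an arbitrary `cleanup_defer_period`; all of these are assumptions made to keep the example self-contained.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-ins for the state discussed in the review; the names
// mirror the quoted snippets but this is not the HotSpot code.
static std::atomic<bool> needs_cleanup_requested{false};
static int64_t cleanup_permit_time = 0;           // earliest time the next cleanup may run
static const int64_t cleanup_defer_period = 500;  // assumed value, arbitrary time units

// Producer side: publish the request with release semantics.
void record_needs_cleanup() {
  needs_cleanup_requested.store(true, std::memory_order_release);
}

// Consumer side: the acquire load pairs with the release store above, so
// writes made before the request was recorded are visible once the flag
// is observed as set.
bool has_cleanup_work_and_reset(int64_t now) {
  if (needs_cleanup_requested.load(std::memory_order_acquire) &&
      now > cleanup_permit_time) {
    cleanup_permit_time = now + cleanup_defer_period;
    needs_cleanup_requested.store(false, std::memory_order_relaxed);
    return true;
  }
  return false;
}
```

With this shape, a request recorded before the permit time has passed is simply picked up at a later periodic check, which is the behavior the review comments describe.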
src/hotspot/share/runtime/serviceThread.cpp line 130: > 128: ) == 0) { > 129: // Wait until notified that there is some work to do or timer expires. > 130: // OopStorage work needs to be done at periodic intervals. Rather than calling out OopStorage here, maybe just say some cleanup requests don't notify the ServiceThread, instead relying on it to run periodically. After this change we might want to audit other cleanup requests and decide if they actually need to notify the ServiceThread in order to get a more prompt response, or could just wait for the next periodic wakeup. ------------- Changes requested by kbarrett (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18375#pullrequestreview-1984494587 PR Review Comment: https://git.openjdk.org/jdk/pull/18375#discussion_r1554582541 PR Review Comment: https://git.openjdk.org/jdk/pull/18375#discussion_r1554582637 PR Review Comment: https://git.openjdk.org/jdk/pull/18375#discussion_r1554583069 PR Review Comment: https://git.openjdk.org/jdk/pull/18375#discussion_r1554584073 PR Review Comment: https://git.openjdk.org/jdk/pull/18375#discussion_r1554584508 PR Review Comment: https://git.openjdk.org/jdk/pull/18375#discussion_r1554585100 PR Review Comment: https://git.openjdk.org/jdk/pull/18375#discussion_r1554585933 From kbarrett at openjdk.org Sat Apr 6 13:36:08 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Sat, 6 Apr 2024 13:36:08 GMT Subject: RFR: 8329488: Move OopStorage code from safepoint cleanup and remove safepoint cleanup code In-Reply-To: References: Message-ID: <6d2gjVM61eYbJYoLRsNskCaN87IMXXLi1v6RPEUlGJs=.7ca1d5d0-edc9-4273-872f-f3a37d465541@github.com> On Sat, 6 Apr 2024 13:14:36 GMT, Kim Barrett wrote: >> This patch gives the ServiceThread a periodic wakeup (same as GuaranteedSafepointInterval) to check if it needs to clean out OopStorage blocks, and move the triggering of this cleaning out of the safepoint cleanup tasks. 
Since ICBuffer, StringTable and SymbolTable rehashing have moved, there's nothing that actually triggers the nop safepoint to do cleaning (except SafepointALot), so the OopStorage cleanup won't be triggered. >> >> With moving all of these out of the safepoint cleanup tasks, we can remove the code that sets up multiple threads to do safepoint cleanup. We can also remove the JFR events and logging that times safepoint cleanup, and a logging test. >> >> Tested with tier1-4. > > src/hotspot/share/gc/shared/oopStorage.cpp line 988: > >> 986: // Exceeded work limit or can't delete last block. This will >> 987: // cause the ServiceThread to loop, giving other subtasks an >> 988: // opportunity to run too. There's no need for a notification, > > With the changes to `has_cleanup_work_and_reset` this no longer causes the ServiceThread to loop. > Instead it requests cleanup at the next scheduled time for the ServiceThread to do so. And there's no > longer ever any notification, so the final sentence needs some adjustment. Hm, with the change to `has_cleanup_work_and_reset` this will result in the service thread deleting no more than "work limit" blocks per "defer period". Maybe this should reset the "permit time" too, so that it _does_ cause the ServiceThread to loop. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18375#discussion_r1554587381 From jsjolen at openjdk.org Sat Apr 6 14:48:28 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Sat, 6 Apr 2024 14:48:28 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v24] In-Reply-To: References: Message-ID: > Hi, > > This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. 
This is the situation that ZGC is in, and that is what this patch attempts to fix. > > ## `MemoryFileTracker` > > The `MemoryFileTracker` adds the ability to add new virtual memory address spaces to NMT and to commit memory to them. The basic API is: > > ```c++ > static MemoryFile* make_device(const char* descriptive_name); > static void free_device(MemoryFile* device); > > static void allocate_memory(MemoryFile* device, size_t offset, size_t size, > MEMFLAGS flag, const NativeCallStack& stack); > static void free_memory(MemoryFile* device, size_t offset, size_t size); > ``` > > It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: > > ```c++ > void ZNMT::reserve(zaddress_unsafe start, size_t size) { > MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); > } > void ZNMT::commit(zoffset offset, size_t size) { > MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC); > } > void ZNMT::uncommit(zoffset offset, size_t size) { > MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); > } > > void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { > // NMT doesn't track mappings at the moment. > } > void ZNMT::unmap(zaddress_unsafe addr, size_t size) { > // NMT doesn't track mappings at the moment. > } > ``` > > As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory; the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. > > This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: > > 1. Implement VMA tracking using a balanced binary tree approach.
Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this... Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: Reformat ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18289/files - new: https://git.openjdk.org/jdk/pull/18289/files/1a3b8a22..262b4ca4 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=23 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=22-23 Stats: 24 lines in 1 file changed: 9 ins; 7 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/18289.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289 PR: https://git.openjdk.org/jdk/pull/18289 From jsjolen at openjdk.org Sat Apr 6 22:10:09 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Sat, 6 Apr 2024 22:10:09 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v5] In-Reply-To: <3al4DjsRcIX_qJZNbTGqBDIAOj4bU5l8xpYPHQE8cNM=.7cc0bdfe-c9c8-46ce-ad42-397c61b5a603@github.com> References: <-XAziSwGMo20pUAnbdRW1JUk_0ZB-80RVfAHr0iuewE=.bff8f2f7-01e2-46eb-bd4b-1b16fccc6aa1@github.com> <3al4DjsRcIX_qJZNbTGqBDIAOj4bU5l8xpYPHQE8cNM=.7cc0bdfe-c9c8-46ce-ad42-397c61b5a603@github.com> Message-ID: On Fri, 22 Mar 2024 13:19:45 GMT, Thomas Stuefe wrote: >> Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: >> >> Include os.inline.hpp > > src/hotspot/share/nmt/nmtTreap.hpp line 54: > >> 52: uint64_t _priority; >> 53: K _key; >> 54: V _value; > > Should both key and value be const? After all, you don't want to modify either after the node was added to the tree. Value should not be const, but key should be.
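As a rough illustration of that answer, a node with an immutable key and a mutable value could look like the sketch below. `TreapNodeSketch` and its constructor are hypothetical names used for illustration; only the three field names come from the quoted `nmtTreap.hpp` excerpt.

```cpp
#include <cstdint>

// Hypothetical sketch, not the code from the PR: the key is const because
// the node's position in the tree depends on it, while the value may be
// updated in place after insertion.
template <typename K, typename V>
struct TreapNodeSketch {
  uint64_t _priority;
  const K  _key;
  V        _value;

  TreapNodeSketch(uint64_t priority, const K& key, const V& value)
    : _priority(priority), _key(key), _value(value) {}
};
```

One consequence of a const member is that the node type is no longer copy-assignable, so code that moves nodes around would have to relink pointers instead of overwriting node contents.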
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1554720882 From jsjolen at openjdk.org Sat Apr 6 22:47:44 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Sat, 6 Apr 2024 22:47:44 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v25] In-Reply-To: References: Message-ID: <9sVvyAUhOzl9DAH97oC59CuecmKSC5HXaagAsx64vC0=.d9c5c275-be40-464d-8f73-f4f8a64dcd2a@github.com> > Hi, > > This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix. > > ## `MemoryFileTracker` > > The `MemoryFileTracker` adds the ability to add new virtual memory address spaces to NMT and to commit memory to them. The basic API is: > > ```c++ > static MemoryFile* make_device(const char* descriptive_name); > static void free_device(MemoryFile* device); > > static void allocate_memory(MemoryFile* device, size_t offset, size_t size, > MEMFLAGS flag, const NativeCallStack& stack); > static void free_memory(MemoryFile* device, size_t offset, size_t size); > ``` > > It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: > > ```c++ > void ZNMT::reserve(zaddress_unsafe start, size_t size) { > MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); > } > void ZNMT::commit(zoffset offset, size_t size) { > MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC); > } > void ZNMT::uncommit(zoffset offset, size_t size) { > MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); > } > > void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { > // NMT doesn't track
mappings at the moment. > } > void ZNMT::unmap(zaddress_unsafe addr, size_t size) { > // NMT doesn't track mappings at the moment. > } > ``` > > As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory; the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. > > This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: > > 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this... Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: Trust me, I'm an expert.
------------- Changes: - all: https://git.openjdk.org/jdk/pull/18289/files - new: https://git.openjdk.org/jdk/pull/18289/files/262b4ca4..a4a8828c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=24 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=23-24 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18289.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289 PR: https://git.openjdk.org/jdk/pull/18289 From dlong at openjdk.org Sun Apr 7 01:52:14 2024 From: dlong at openjdk.org (Dean Long) Date: Sun, 7 Apr 2024 01:52:14 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v6] In-Reply-To: <9BH6kkaQU5kSjlJUnNenUeWBK2EdahCuks8qEUjDlv0=.b8979589-32df-4fa3-b5a6-f56dad76c58d@github.com> References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> <9BH6kkaQU5kSjlJUnNenUeWBK2EdahCuks8qEUjDlv0=.b8979589-32df-4fa3-b5a6-f56dad76c58d@github.com> Message-ID: On Sat, 6 Apr 2024 00:13:26 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. >> >> Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). >> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Oops I went ahead and tried a pure-Java implementation, and it is faster for small sizes (up to 8) and only about 1.5x slower for larger sizes, so that might make for an interesting fallback if there is no customized assembler implementation available or if the size is known to be small.
Ideally, I think we would want C2 to be more aware of setMemory stores, so that it can remove redundant stores, like it does with InitializeNode. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18555#issuecomment-2041271472 From duke at openjdk.org Sun Apr 7 05:18:08 2024 From: duke at openjdk.org (Francesco Nigro) Date: Sun, 7 Apr 2024 05:18:08 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v6] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> <9BH6kkaQU5kSjlJUnNenUeWBK2EdahCuks8qEUjDlv0=.b8979589-32df-4fa3-b5a6-f56dad76c58d@github.com> Message-ID: <2X2qG_TCmbIfhM4CCepi7PHttQGFuMXlLgea1Yq15uc=.3d4bdee1-2eed-4df9-bcb4-f08bf8060119@github.com> On Sun, 7 Apr 2024 01:49:01 GMT, Dean Long wrote: >> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: >> >> Oops > > I went ahead and tried a pure-Java implementation, and it is faster for small sizes (up to 8) and only about 1.5x slower for larger sizes, so that might make for an interesting fallback if there is no customized assembler implementation available or if the size is known to be small. > > Ideally, I think we would want C2 to be more aware of setMemory stores, so that it can remove redundant stores, like it does with InitializeNode. @dean-long in my old PR I did the same, choosing a (not yet) configurable cutoff value.
See https://github.com/openjdk/jdk/pull/16760 ------------- PR Comment: https://git.openjdk.org/jdk/pull/18555#issuecomment-2041314429 From gcao at openjdk.org Mon Apr 8 01:37:12 2024 From: gcao at openjdk.org (Gui Cao) Date: Mon, 8 Apr 2024 01:37:12 GMT Subject: RFR: 8329641: RISC-V: Enable some tests related to SHA-2 instrinsic In-Reply-To: References: <_ODsWQ8fmxP9pD_vamk9rxBBYxeMd7StBQwcOQ2DFts=.5d741439-e57b-4f3a-a4c6-d12d6265cc51@github.com> Message-ID: <_bQx00USuBVob5JRLOiC11xaJNa3CeHBktZqHTDeuXQ=.30ff21d4-7202-4364-92cd-4dbce91d7f55@github.com> On Thu, 4 Apr 2024 08:15:10 GMT, Fei Yang wrote: >> Hi, I witnessed that some SHA-2 tests are skipped on RISC-V. The supportedCPUFeatures in IntrinsicPredicates.java is not correct for RISC-V, because it should depend on Zvkn extension instead of sha256/sha512. I tested this with QEMU system running linux-6.8 kernel. I used NR_riscv_hwprobe syscall to detect if the system supports the Zvkn extension. Because support for Zvkn extension is not fully tested on real hardwares, the code for detecting and enabling Zvkn extension is not included in this PR. 
>> >> The code for detecting Zvkn extension >> ``` diff >> diff --git a/src/hotspot/os_cpu/linux_riscv/riscv_hwprobe.cpp b/src/hotspot/os_cpu/linux_riscv/riscv_hwprobe.cpp >> index df4a2e347cc..ef99acbf7c5 100644 >> --- a/src/hotspot/os_cpu/linux_riscv/riscv_hwprobe.cpp >> +++ b/src/hotspot/os_cpu/linux_riscv/riscv_hwprobe.cpp >> @@ -178,6 +178,13 @@ void RiscvHwprobe::add_features_from_query_result() { >> if (is_set(RISCV_HWPROBE_KEY_IMA_EXT_0, RISCV_HWPROBE_EXT_ZFH)) { >> VM_Version::ext_Zfh.enable_feature(); >> } >> + if (is_set(RISCV_HWPROBE_KEY_IMA_EXT_0, RISCV_HWPROBE_EXT_ZVKNED) >> + && is_set(RISCV_HWPROBE_KEY_IMA_EXT_0, RISCV_HWPROBE_EXT_ZVKNHB) >> + && is_set(RISCV_HWPROBE_KEY_IMA_EXT_0, RISCV_HWPROBE_EXT_ZVKB) >> + && is_set(RISCV_HWPROBE_KEY_IMA_EXT_0, RISCV_HWPROBE_EXT_ZVKT)) { >> + VM_Version::ext_Zvkn.enable_feature(); >> + } >> if (is_valid(RISCV_HWPROBE_KEY_CPUPERF_0)) { >> VM_Version::unaligned_access.enable_feature( >> query[RISCV_HWPROBE_KEY_CPUPERF_0].value & RISCV_HWPROBE_MISALIGNED_MASK); >> ``` >> >> This IntrinsicPredicates.java CPU matching change should only affect the test/hotspot/jtreg/compiler/intrinsics/sha/cli/TestUseSHA256IntrinsicsOptionOnSupportedCPU.java and test/hotspot/jtreg/compiler/intrinsics/sha/cli/TestUseSHA512IntrinsicsOptionOnSupportedCPU.java test cases. Before this patch they are skipped; after this patch they can be selected and pass normally. >> >> We can test test/lib-test/jdk/test/whitebox/CPUInfoTest.java to see the actual CPU Features. >> ``` >> ----------configuration:(0/0)---------- >> ----------System.out:(4/178)---------- >> WB.getCPUFeatures(): "rv64 i m a f d c v zba zbb zbs zvkn" >> CPUInfo.getAdditionalCPUInfo(): "" >> CPUInfo.getFeatures(): [rv64, i, m, a, f, d, c, v, zba, zbb, zbs, zvkn] >> TEST PASSED >> ----------System.err:(2/88)---------- >> ``` >> >> ### Testing >> - [x] Run tier1-3, hotspot:tier4 tests on SOPHON SG2042 (release) >> - [ ] Run tier1-3 tests on ubuntu24(kernel version 6.8 and us... > > LGTM.
@robehn who worked on SHA-2 intrinsic on RISC-V might want to take a look. @RealFYang @robehn : Thanks all for the review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18611#issuecomment-2041710489 From gcao at openjdk.org Mon Apr 8 01:37:12 2024 From: gcao at openjdk.org (Gui Cao) Date: Mon, 8 Apr 2024 01:37:12 GMT Subject: Integrated: 8329641: RISC-V: Enable some tests related to SHA-2 instrinsic In-Reply-To: <_ODsWQ8fmxP9pD_vamk9rxBBYxeMd7StBQwcOQ2DFts=.5d741439-e57b-4f3a-a4c6-d12d6265cc51@github.com> References: <_ODsWQ8fmxP9pD_vamk9rxBBYxeMd7StBQwcOQ2DFts=.5d741439-e57b-4f3a-a4c6-d12d6265cc51@github.com> Message-ID: On Thu, 4 Apr 2024 04:16:27 GMT, Gui Cao wrote: > Hi, I witnessed that some SHA-2 tests are skipped on RISC-V. The supportedCPUFeatures in IntrinsicPredicates.java is not correct for RISC-V, because it should depend on Zvkn extension instead of sha256/sha512. I tested this with QEMU system running linux-6.8 kernel. I used NR_riscv_hwprobe syscall to detect if the system supports the Zvkn extension. Because support for Zvkn extension is not fully tested on real hardwares, the code for detecting and enabling Zvkn extension is not included in this PR. 
> > The code for detecting Zvkn extension > ``` diff > diff --git a/src/hotspot/os_cpu/linux_riscv/riscv_hwprobe.cpp b/src/hotspot/os_cpu/linux_riscv/riscv_hwprobe.cpp > index df4a2e347cc..ef99acbf7c5 100644 > --- a/src/hotspot/os_cpu/linux_riscv/riscv_hwprobe.cpp > +++ b/src/hotspot/os_cpu/linux_riscv/riscv_hwprobe.cpp > @@ -178,6 +178,13 @@ void RiscvHwprobe::add_features_from_query_result() { > if (is_set(RISCV_HWPROBE_KEY_IMA_EXT_0, RISCV_HWPROBE_EXT_ZFH)) { > VM_Version::ext_Zfh.enable_feature(); > } > + if (is_set(RISCV_HWPROBE_KEY_IMA_EXT_0, RISCV_HWPROBE_EXT_ZVKNED) > + && is_set(RISCV_HWPROBE_KEY_IMA_EXT_0, RISCV_HWPROBE_EXT_ZVKNHB) > + && is_set(RISCV_HWPROBE_KEY_IMA_EXT_0, RISCV_HWPROBE_EXT_ZVKB) > + && is_set(RISCV_HWPROBE_KEY_IMA_EXT_0, RISCV_HWPROBE_EXT_ZVKT)) { > + VM_Version::ext_Zvkn.enable_feature(); > + } > if (is_valid(RISCV_HWPROBE_KEY_CPUPERF_0)) { > VM_Version::unaligned_access.enable_feature( > query[RISCV_HWPROBE_KEY_CPUPERF_0].value & RISCV_HWPROBE_MISALIGNED_MASK); > > > This IntrinsicPredicates.java CPU matching change should only affect the test/hotspot/jtreg/compiler/intrinsics/sha/cli/TestUseSHA256IntrinsicsOptionOnSupportedCPU.java and test/hotspot/jtreg/compiler/intrinsics/sha/cli/TestUseSHA512IntrinsicsOptionOnSupportedCPU.java test cases. Before this patch they are skipped; after this patch they can be selected and pass normally. > > We can run test/lib-test/jdk/test/whitebox/CPUInfoTest.java to see the actual CPU features.
> > ----------configuration:(0/0)---------- > ----------System.out:(4/178)---------- > WB.getCPUFeatures(): "rv64 i m a f d c v zba zbb zbs zvkn" > CPUInfo.getAdditionalCPUInfo(): "" > CPUInfo.getFeatures(): [rv64, i, m, a, f, d, c, v, zba, zbb, zbs, zvkn] > TEST PASSED > ----------System.err:(2/88)---------- > > > ### Testing > - [x] Run tier1-3, hotspot:tier4 tests on SOPHON SG2042 (release) > - [ ] Run tier1-3 tests on ubuntu24(kernel version 6.8 and use qemu-system to boot ubuntu) (release) This pull request has now been integrated. Changeset: 3a3b77dd Author: Gui Cao Committer: Fei Yang URL: https://git.openjdk.org/jdk/commit/3a3b77dd4f522e2ca855acca8516e5901c3f2b5a Stats: 3 lines in 2 files changed: 1 ins; 0 del; 2 mod 8329641: RISC-V: Enable some tests related to SHA-2 instrinsic Reviewed-by: fyang, rehn ------------- PR: https://git.openjdk.org/jdk/pull/18611 From duke at openjdk.org Mon Apr 8 02:17:35 2024 From: duke at openjdk.org (kuaiwei) Date: Mon, 8 Apr 2024 02:17:35 GMT Subject: RFR: 8325821: [REDO] use "dmb.ishst+dmb.ishld" for release barrier [v2] In-Reply-To: References: Message-ID: > The origin patch for https://bugs.openjdk.org/browse/JDK-8324186 has 2 issues: > 1 It show regression in some platform, like Apple silicon in mac os > 2 Can not handle instruction sequence like "dmb.ishld; dmb.ishst; dmb.ishld; dmb.ishld" > > It can be fixed by: > 1 Enable AlwaysMergeDMB by default, only disable it in architecture we can see performance improvement (N1 or N2) > 2 Check the special pattern and merge the subsequent dmb. > > It also fix a bug when code buffer is expanding, st/ld/dmb can not be merged. I added unit tests for these. > > This patch still has a unhandled case. Insts like "dmb.ishld; dmb.ishst; dmb.ish", it will merge the last 2 instructions and can not merge all three. Because when emitting dmb.ish, if merge all previous dmbs, the code buffer will shrink the size. I think it may break some resumption and think it's not a common pattern. 
kuaiwei has updated the pull request incrementally with two additional commits since the last revision: - Move fsm to CodeBuffer - Add fsm for merging ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18467/files - new: https://git.openjdk.org/jdk/pull/18467/files/29e39bf0..8ae4496e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18467&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18467&range=00-01 Stats: 355 lines in 17 files changed: 337 ins; 1 del; 17 mod Patch: https://git.openjdk.org/jdk/pull/18467.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18467/head:pull/18467 PR: https://git.openjdk.org/jdk/pull/18467 From jbhateja at openjdk.org Mon Apr 8 02:35:33 2024 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Mon, 8 Apr 2024 02:35:33 GMT Subject: RFR: 8328181: C2: assert(MaxVectorSize >= 32) failed: vector length should be >= 32 [v2] In-Reply-To: References: Message-ID: > This bug fix patch tightens the predication check for small constant length clear array pattern and relaxes associated feature checks. Modified few comments for clarity. > > Kindly review and approve. > > Best Regards, > Jatin Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: Cleanup predicates. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/18464/files - new: https://git.openjdk.org/jdk/pull/18464/files/05ccc786..9154491a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18464&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18464&range=00-01 Stats: 9 lines in 3 files changed: 0 ins; 7 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/18464.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18464/head:pull/18464 PR: https://git.openjdk.org/jdk/pull/18464 From jbhateja at openjdk.org Mon Apr 8 02:38:59 2024 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Mon, 8 Apr 2024 02:38:59 GMT Subject: RFR: 8328181: C2: assert(MaxVectorSize >= 32) failed: vector length should be >= 32 [v2] In-Reply-To: References: Message-ID: On Tue, 26 Mar 2024 16:40:31 GMT, Vladimir Kozlov wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: >> >> Cleanup predicates. > > src/hotspot/cpu/x86/x86.ad line 1755: > >> 1753: case Op_ClearArray: >> 1754: if ((size_in_bits != 512) && !VM_Version::supports_avx512vl()) { >> 1755: return false; > > Please add comment to clarify condition. I am reading it as ClearArray will not be supported for NOT avx512 because we can have vector length 512 bits for not avx512. This is only pertinent to known sized clear arrays which are optimized for AVX-512 targets, we already have such a check as part of matcher predicate, so removing it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18464#discussion_r1555163994 From iklam at openjdk.org Mon Apr 8 04:56:28 2024 From: iklam at openjdk.org (Ioi Lam) Date: Mon, 8 Apr 2024 04:56:28 GMT Subject: RFR: 8329728: Read arbitrarily long lines in ClassListParser Message-ID: Today the `ClassListParser` has a hard-coded limit of 4096 chars for each line in the CDS class list file. 
However, it's possible for a line to be much longer than that (64KB for the class name, plus extra information that can include path names, IDs, etc.). I wrote a utility class `LineReader` that automatically allocates a buffer before calling `fgets()`. Hopefully this can be useful for other cases where we call `fgets()` with a fixed buffer size. ------------- Commit messages: - 8329728: Read arbitrarily long lines in ClassListParser Changes: https://git.openjdk.org/jdk/pull/18669/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18669&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8329728 Stats: 235 lines in 5 files changed: 193 ins; 18 del; 24 mod Patch: https://git.openjdk.org/jdk/pull/18669.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18669/head:pull/18669 PR: https://git.openjdk.org/jdk/pull/18669 From gcao at openjdk.org Mon Apr 8 04:57:22 2024 From: gcao at openjdk.org (Gui Cao) Date: Mon, 8 Apr 2024 04:57:22 GMT Subject: RFR: 8329823: RISC-V: Need to sync CPU features with related JVM flags Message-ID: <7CndD_6EjSJlGUiazMobAPHj2ZOTnMZlQFUDOwv7pKw=.708443c8-9218-4ca1-ae66-e00eb6d8dc53@github.com> Hi, As described by [8329823](https://bugs.openjdk.org/browse/JDK-8329823), the "features" string is currently not accurate: RISC-V CPU features/extensions that the user has disabled on the command line are still added to it. We need to synchronize these features with the related JVM flags so that the "features" string reflects the CPU features that are actually usable.
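A minimal sketch of the idea, not the code from the patch: the `Feature` struct and `build_features_string` helper below are hypothetical names chosen for illustration, and the real HotSpot `VM_Version` data structures differ. It only shows that an extension should appear in the string when it was both detected and left enabled by its flag.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for one CPU feature entry (not the HotSpot type).
struct Feature {
  std::string name;   // e.g. "zba"
  bool detected;      // reported by hardware probing
  bool flag_enabled;  // value of the matching JVM flag after argument processing
};

// Build the space-separated "features" string from the entries that are
// both detected and still enabled by their flag.
std::string build_features_string(const std::vector<Feature>& features) {
  std::string out;
  for (const Feature& f : features) {
    if (!f.detected || !f.flag_enabled) {
      continue;  // skip features that are unavailable or disabled by the user
    }
    if (!out.empty()) {
      out += ' ';
    }
    out += f.name;
  }
  return out;
}
```

With an entry such as `{"zba", true, false}` (detected, but switched off in the style of `-XX:-UseZba`), the helper leaves `zba` out of the result, which is the behavior the test output in this message illustrates.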
### Testing - [x] Run tier1 tests on SOPHON SG2042 (release) Results without specifying any jvm flags(After applying this patch) $ /home/zifeihan/jtreg/bin/jtreg -jdk:/home/zifeihan/jre/jdk /home/zifeihan/jdk/test/lib-test/jdk/test/whitebox/CPUInfoTest.java ----------System.out:(4/178)---------- WB.getCPUFeatures(): "rv64 i m a f d c v zba zbb zbs zvkn" CPUInfo.getAdditionalCPUInfo(): "" CPUInfo.getFeatures(): [rv64, i, m, a, f, d, c, v, zba, zbb, zbs, zvkn] TEST PASSED Results with specifying `-XX:-UseZba`(After applying this patch) $ /home/zifeihan/jtreg/bin/jtreg -javaoption:-XX:-UseZba -jdk:/home/zifeihan/jre/jdk /home/zifeihan/jdk/test/lib-test/jdk/test/whitebox/CPUInfoTest.java ----------System.out:(4/158)---------- ----------System.out:(4/169)---------- WB.getCPUFeatures(): "rv64 i m a f d c v zbb zbs zvkn" CPUInfo.getAdditionalCPUInfo(): "" CPUInfo.getFeatures(): [rv64, i, m, a, f, d, c, v, zbb, zbs, zvkn] TEST PASSED Results with specifying `-XX:+UseZba`(After applying this patch) $ /home/zifeihan/jtreg/bin/jtreg -javaoption:-XX:+UseZba -jdk:/home/zifeihan/jre/jdk /home/zifeihan/jdk/test/lib-test/jdk/test/whitebox/CPUInfoTest.java ----------System.out:(4/178)---------- WB.getCPUFeatures(): "rv64 i m a f d c v zba zbb zbs zvkn" CPUInfo.getAdditionalCPUInfo(): "" CPUInfo.getFeatures(): [rv64, i, m, a, f, d, c, v, zba, zbb, zbs, zvkn] TEST PASSED ------------- Commit messages: - 8329823: RISC-V: Need to sync CPU features with related JVM flags Changes: https://git.openjdk.org/jdk/pull/18668/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18668&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8329823 Stats: 29 lines in 2 files changed: 18 ins; 2 del; 9 mod Patch: https://git.openjdk.org/jdk/pull/18668.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18668/head:pull/18668 PR: https://git.openjdk.org/jdk/pull/18668 From bulasevich at openjdk.org Mon Apr 8 06:12:07 2024 From: bulasevich at openjdk.org (Boris Ulasevich) Date: Mon, 8 Apr 2024 
06:12:07 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission [v11] In-Reply-To: References: Message-ID: On Tue, 26 Mar 2024 19:02:42 GMT, Cesar Soares Lucas wrote: >> # Description >> >> Please review this PR with a patch to re-use the same C2_MacroAssembler object to emit all instructions in the same compilation unit. >> >> Overall, the change is pretty simple. However, due to the renaming of the variable to access C2_MacroAssembler, from `_masm.` to `masm->`, and also some method prototype changes, the patch became quite large. >> >> # Help Needed for Testing >> >> I don't have access to all platforms necessary to test this. I hope some other folks can help with testing on `S390`, `RISC-V` and `PPC`. > > Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 12 commits: > > - Merge remote-tracking branch 'origin/master' into reuse-macroasm > - Fix AArch64 build & improve comment about InstructionMark > - Catching up with changes in master > - Catching up with origin/master > - Catch up with origin/master > - Merge with origin/master > - Fix build, copyright dates, m4 files. > - Fix merge > - Catch up with master branch. > > Merge remote-tracking branch 'origin/master' into reuse-macroasm > - Some inst_mark fixes; Catch up with master. > - ... and 2 more: https://git.openjdk.org/jdk/compare/89e0889a...b4d73c98 Do you need help understanding the problem? The crash occurred because you removed the line `fprintf(fp, " cbuf.set_insts_mark();\n");` from the generator of the AD nodes' ::emit() methods. That is why emit_call_reloc finds cbuf.insts->_mark uninitialized.
// Call Runtime Instruction instruct CallRuntimeDirect(method meth) %{ match(CallRuntime); effect(USE meth); ins_cost(CALL_COST); format %{ "CALL,runtime" %} ins_encode( Java_To_Runtime( meth ), call_epilog ); ins_pipe(simple_call); %} --> void CallRuntimeDirectNode::emit(CodeBuffer& cbuf, PhaseRegAlloc* ra_) const { cbuf.set_insts_mark(); // Start at oper_input_base() and count operands unsigned idx0 = 1; unsigned idx1 = 1; // { #line 1217 "/home/boris/jdk-bulasevich/src/hotspot/cpu/arm/arm.ad" // CALL directly to the runtime emit_call_reloc(cbuf, as_MachCall(), opnd_array(1), runtime_call_Relocation::spec()); #line 999999 } { #line 1213 "/home/boris/jdk-bulasevich/src/hotspot/cpu/arm/arm.ad" // nothing #line 999999 } } ------------- PR Comment: https://git.openjdk.org/jdk/pull/16484#issuecomment-2041935047 From thartmann at openjdk.org Mon Apr 8 06:49:10 2024 From: thartmann at openjdk.org (Tobias Hartmann) Date: Mon, 8 Apr 2024 06:49:10 GMT Subject: RFR: 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments [v12] In-Reply-To: References: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> Message-ID: On Wed, 3 Apr 2024 16:47:32 GMT, Boris Ulasevich wrote: >> These changes clean up the logic and the code of allocating codecache segments and add more testing of it, to open a door for further optimization of code cache segmentation. The goal was to keep the behavior as close to the existing behavior as possible, even if it's not quite logical. >> >> Also, these changes better account for alignment - PrintFlagsFinal shows the final aligned segment sizes, and the segments fill the ReservedCodeCacheSize without gaps caused by alignment. > > Boris Ulasevich has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains one commit: > > 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments Looks good to me. 
------------- Marked as reviewed by thartmann (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17244#pullrequestreview-1985564573 From dholmes at openjdk.org Mon Apr 8 06:55:30 2024 From: dholmes at openjdk.org (David Holmes) Date: Mon, 8 Apr 2024 06:55:30 GMT Subject: RFR: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 Message-ID: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> The crux of the problem here is that the virtual thread code was not keeping the held-monitor-count and jni-monitor-count in sync under all conditions. So if a vthread acquired a monitor via JNI but failed to unlock it before terminating, the underlying platform thread's counts were out of sync and if it terminated we would trigger the assertion that checks for such things. The actual fix is very simple: we zero the platform thread's jni-monitor-count in `continuation_enter_cleanup` the same way we zero the held-monitor-count. In addition we apply the same `CheckJNICalls` check for this unbalanced locking and issue a warning in the virtual thread case. The fact that this happens in asm code complicates matters. The existing `JNIMonitor.java` test is greatly expanded to test these scenarios and check the unified logging output. Other minor changes involve expanding some of the other assertions relating to the two counts so we can detect a mismatch earlier without a need for the thread to terminate. And the test that originally uncovered the problem (`GetOwnedMonitorInfoTest.java`) has some minor adjustments to enhance diagnostics. I've provided the fix for all architectures that support continuations: x64, aarch64, riscv and ppc. The latter two build okay in GHA but I can't actually test them with the updated test. So some assistance from RISCV folk (@robehn ?) and PPC folk (??)
would be appreciated (otherwise any issues will have to be handled as follow-up fixes). The changes are structured so that there is no extra code executed in product builds unless `CheckJNICalls` is set. This means that product builds will not keep the JNI count in sync with the held count, unless `CheckJNICalls` is set. This could trip up a future logging entry or explicit check of the JNI count, but it is expected that these counts will be removed once ObjectMonitor usage no longer forces virtual thread pinning. Testing: - regression test 10x on all x64 and aarch64 platforms - tiers 1-4 - GHA Thanks to @pchilano for help working out the best form of the fix and the initial asm for x64. Thanks to @fbredber for the Aarch64 and RISCV asm code. Thanks ------------- Commit messages: - Missed a fix for ppc - Merge branch 'master' into 8327743-jni-monitor-count - PPC fixes - Fix old comment. - Initial ppc version - riscv version - thanks Fredrik! - Restructure native library loading code to avoid problem with Driver mode - Aaarch64 version - thanks Fredrik! - Add explanatory comment - Expanded testcases that include leaving the monitor locked when a VT - ...
and 11 more: https://git.openjdk.org/jdk/compare/51b0abc8...32a62a27 Changes: https://git.openjdk.org/jdk/pull/18445/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18445&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8327743 Stats: 508 lines in 10 files changed: 467 ins; 13 del; 28 mod Patch: https://git.openjdk.org/jdk/pull/18445.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18445/head:pull/18445 PR: https://git.openjdk.org/jdk/pull/18445 From dholmes at openjdk.org Mon Apr 8 07:06:05 2024 From: dholmes at openjdk.org (David Holmes) Date: Mon, 8 Apr 2024 07:06:05 GMT Subject: RFR: 8325303: Replace markWord.is_neutral() with markWord.is_unlocked() In-Reply-To: References: Message-ID: On Fri, 5 Apr 2024 06:11:28 GMT, Stefan Karlsson wrote: > In `BasicLock::move_to` you renamed `is_neutral` to `is_locked` should that have stayed as `is_neutral`? Yep - I will fix. Thanks @stefank ------------- PR Comment: https://git.openjdk.org/jdk/pull/17741#issuecomment-2042003796 From stefank at openjdk.org Mon Apr 8 07:16:11 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 8 Apr 2024 07:16:11 GMT Subject: RFR: 8329750: Change Universe functions to return more specific Klass* types [v2] In-Reply-To: References: Message-ID: > We have various functions in Universe that returns Klass* where they could be returning TypeArrayKlass* and ObjArrayKlass* instead. If we change these functions we could get rid of some casts in the code. Does this seem like a reasonable change? 
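To illustrate the question above, here is a minimal sketch of why a more specific return type removes casts at call sites. The `Klass` and `TypeArrayKlass` structs below are heavily simplified, hypothetical stand-ins and not the HotSpot classes; the accessor names and `element_size()` are likewise assumptions made only for this example.

```cpp
#include <cassert>

// Simplified, hypothetical stand-ins for the real class hierarchy.
struct Klass {
  virtual ~Klass() = default;
};

struct TypeArrayKlass : Klass {
  int element_size() const { return 4; }  // only available on the subtype
};

static TypeArrayKlass g_int_array_klass;

// Accessor with the general return type: callers that need subtype
// functionality have to cast.
Klass* int_array_klass_general() { return &g_int_array_klass; }

// Accessor with the specific return type: no cast needed at the call site.
TypeArrayKlass* int_array_klass_specific() { return &g_int_array_klass; }

int element_size_with_cast() {
  return static_cast<TypeArrayKlass*>(int_array_klass_general())->element_size();
}

int element_size_without_cast() {
  return int_array_klass_specific()->element_size();
}
```

Both helpers return the same value; the second simply lets the type system carry the information that the first has to reassert with a cast.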
Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: Update src/hotspot/share/classfile/systemDictionary.cpp Dean's suggestion Co-authored-by: Dean Long <17332032+dean-long at users.noreply.github.com> ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18652/files - new: https://git.openjdk.org/jdk/pull/18652/files/fc2a4a9f..d36f650d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18652&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18652&range=00-01 Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18652.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18652/head:pull/18652 PR: https://git.openjdk.org/jdk/pull/18652 From stefank at openjdk.org Mon Apr 8 07:36:09 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 8 Apr 2024 07:36:09 GMT Subject: RFR: 8329750: Change Universe functions to return more specific Klass* types [v2] In-Reply-To: References: Message-ID: On Fri, 5 Apr 2024 22:10:46 GMT, Dean Long wrote: >> Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: >> >> Update src/hotspot/share/classfile/systemDictionary.cpp >> >> Dean's suggestion >> >> Co-authored-by: Dean Long <17332032+dean-long at users.noreply.github.com> > > src/hotspot/share/classfile/systemDictionary.cpp line 371: > >> 369: } else { >> 370: k = Universe::typeArrayKlass(t); >> 371: k = k->array_klass(ndims, CHECK_NULL); > > I assume the cast was an attempt to de-virtualize the array_klass() call, so it is better not to use Klass* here. My experience is that these types of casts don't make the compiler devirtualize the calls. I tried it now and verified that both with and without the cast we still get the virtual call. You typically need to tell the compiler which function it should be using.
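A minimal sketch of that point, using hypothetical classes (`Base`, `Derived`, `MoreDerived`) rather than HotSpot code: casting the pointer only changes its static type, so the call is still dispatched virtually, whereas a class-qualified call names the target function explicitly and bypasses virtual dispatch.

```cpp
#include <cassert>

struct Base {
  virtual ~Base() = default;
  virtual int dims() const { return 0; }
};

struct Derived : Base {
  int dims() const override { return 1; }
};

struct MoreDerived : Derived {
  int dims() const override { return 2; }
};

// The cast changes only the static type; the call is still virtual,
// so the most-derived override is the one that runs.
int call_through_cast(const Base* b) {
  return static_cast<const Derived*>(b)->dims();
}

// The qualified name selects Derived::dims directly, with no virtual dispatch.
int call_qualified(const Base* b) {
  return static_cast<const Derived*>(b)->Derived::dims();
}
```

For a `MoreDerived` object the first helper runs `MoreDerived::dims` while the second runs `Derived::dims`, which shows that only the qualified form fixes the call target at compile time. Whether the compiler then inlines the call is a separate question that depends on the definition being visible.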
(I played around a lot with this when writing the devirtualization layer for the oop_iterate/OopIterateClosure code.) I tested writing the code above as `TypeArrayKlass::cast(k)->TypeArrayKlass::array_klass(ndims, CHECK_NULL)` and that gets rid of the virtual call. However, the compiler still can't inline the ArrayKlass::array_klass code because it is inside a .cpp file and not an .inline.hpp, so this results in a direct call instead of inlined code. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18652#discussion_r1555350440 From rehn at openjdk.org Mon Apr 8 07:43:00 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Mon, 8 Apr 2024 07:43:00 GMT Subject: RFR: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 In-Reply-To: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> References: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> Message-ID: On Fri, 22 Mar 2024 06:26:03 GMT, David Holmes wrote: > The crux of the problem here is that the virtual thread code was not keeping the held-monitor-count and jni-monitor-count in sync under all conditions. So if a vthread acquired a monitor via JNI but failed to unlock it before terminating, the underlying platform thread's counts were out of sync and if it terminated we would trigger the assertion that checks for such things. > > The actual fix is very simple: we zero the platform thread's jni-monitor-count in `continuation_enter_cleanup` the same way we zero the held-monitor-count. In addition we apply the same `CheckJNICalls` check for this unbalanced locking and issue a warning in the virtual thread case. The fact that this happens in asm code complicates matters. > > The existing `JNIMonitor.java` test is greatly expanded to test these scenarios and check the unified logging output.
> > Other minor changes involve expanding some of the other assertions relating to the two counts so we can detect a mismatch earlier without a need for the thread to terminate. And the test that originally uncovered the problem (`GetOwnedMonitorInfoTest.java`) has some minor adjustments to enhance diagnostics. > > I've provided the fix for all architectures that support continuations: x64, aarch64, riscv and ppc. The latter two build okay in GHA but I can't actually test them with the updated test. So some assistance from RISCV folk (@robehn ?) and PPC folk (??) would be appreciated (otherwise any issues will have to be handled as follow-up fixes). > > The changes are structured so that there is no extra code executed in product builds unless `CheckJNICalls` is set. This means that product builds will not keep the JNI count in sync with the held count, unless `CheckJNICalls` is set. This could trip up a future logging entry or explicit check of the JNI count, but it is expected that these counts will be removed once ObjectMonitor usage no longer forces virtual thread pinning. > > Testing: > - regression test 10x on all x64 and aarch64 platforms > - tiers 1-4 > - GHA > > > Thanks to @pchilano for help working out the best form of the fix and the initial asm for x64. > > Thanks to @fbredber for the Aarch64 and RISCV asm code. > > Thanks @dholmes-ora, @fbredber thank you for providing fixes for risc-v! I'll go ahead and test!
------------- PR Comment: https://git.openjdk.org/jdk/pull/18445#issuecomment-2042067190 From stefank at openjdk.org Mon Apr 8 07:50:00 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 8 Apr 2024 07:50:00 GMT Subject: RFR: 8329628: Additional changes after JDK-8329332 In-Reply-To: References: Message-ID: On Fri, 5 Apr 2024 23:39:49 GMT, Vladimir Kozlov wrote: > Additional clean up based on comments (mostly Stefan's) during reviews for [JDK-8329332: Remove CompiledMethod and CodeBlobLayout classes](https://bugs.openjdk.org/browse/JDK-8329332). > - Renamed `CompiledMethod_lock` to `NMethod_lock`. (I decided to not change JVMTI's `CompiledMethod[Load|Unload]` names). > - Renamed `NMethodIterator::all_blobs` to `NMethodIterator::all`. > - Moved `get_deopt_original_pc()` method from `nmethod` to `frame` class. > - Reverted `CodeCache::find_nmethod()` to previous functionality to allow return `nullptr` and be consistent with `find_blob()`. > - Cleanup some `(nmethod*)` casts. > - Use `for (CodeHeap* heap : *_nmethod_heaps) ` in `CodeCache::nmethod_count()` (it was @stefank suggestion, I don't know how this C++ magic works). I verified it running with `-XX:+PrintNMethodStatistics`. > > Testing tier1-3,xcomp,stress Looks good. I've added two suggestions. You can choose to make them, handle them as separate issues, or just ignore them. :) src/hotspot/share/code/codeCache.cpp line 668: > 666: CodeBlob* cb = find_blob(start); > 667: assert(cb == nullptr || cb->is_nmethod(), "did not find an nmethod"); > 668: return (nmethod*)cb; There's a call to `find_nmethod` in `ZNMethod::load_oop` that now lacks a null-check. Would you mind adding one? src/hotspot/share/code/codeCache.hpp line 340: > 338: template class CodeBlobIterator : public StackObj { > 339: public: > 340: enum LivenessFilter { all, only_not_unloading }; Thanks, I like this. FWIW, the `only` in `only_not_unloading` seems redundant. 
The code reads well without it, IMHO: // All nmethods NMethodIterator iter(NMethodIterator::all); // Those that are not unloading NMethodIterator iter(NMethodIterator::not_unloading); ------------- Marked as reviewed by stefank (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18665#pullrequestreview-1985672842 PR Review Comment: https://git.openjdk.org/jdk/pull/18665#discussion_r1555370258 PR Review Comment: https://git.openjdk.org/jdk/pull/18665#discussion_r1555364744 From fyang at openjdk.org Mon Apr 8 07:58:09 2024 From: fyang at openjdk.org (Fei Yang) Date: Mon, 8 Apr 2024 07:58:09 GMT Subject: RFR: 8329823: RISC-V: Need to sync CPU features with related JVM flags In-Reply-To: <7CndD_6EjSJlGUiazMobAPHj2ZOTnMZlQFUDOwv7pKw=.708443c8-9218-4ca1-ae66-e00eb6d8dc53@github.com> References: <7CndD_6EjSJlGUiazMobAPHj2ZOTnMZlQFUDOwv7pKw=.708443c8-9218-4ca1-ae66-e00eb6d8dc53@github.com> Message-ID: On Sun, 7 Apr 2024 08:53:57 GMT, Gui Cao wrote: > Hi, As described by [8329823](https://bugs.openjdk.org/browse/JDK-8329823), currently, "features" string is not accurate in that the RISC-V CPU features/extensions which are disabled by user on the command are still added. We need to synchronize these features with related JVM flags so that "features" string can reflect actual usable CPU features. 
> > > ### Testing > - [x] Run tier1 tests on SOPHON SG2042 (release) > > Results without specifying any jvm flags(After applying this patch) > > > $ /home/zifeihan/jtreg/bin/jtreg -jdk:/home/zifeihan/jre/jdk /home/zifeihan/jdk/test/lib-test/jdk/test/whitebox/CPUInfoTest.java > > ----------System.out:(4/178)---------- > WB.getCPUFeatures(): "rv64 i m a f d c v zba zbb zbs zvkn" > CPUInfo.getAdditionalCPUInfo(): "" > CPUInfo.getFeatures(): [rv64, i, m, a, f, d, c, v, zba, zbb, zbs, zvkn] > TEST PASSED > > > Results with specifying `-XX:-UseZba`(After applying this patch) > > > $ /home/zifeihan/jtreg/bin/jtreg -javaoption:-XX:-UseZba -jdk:/home/zifeihan/jre/jdk /home/zifeihan/jdk/test/lib-test/jdk/test/whitebox/CPUInfoTest.java > > ----------System.out:(4/158)---------- > ----------System.out:(4/169)---------- > WB.getCPUFeatures(): "rv64 i m a f d c v zbb zbs zvkn" > CPUInfo.getAdditionalCPUInfo(): "" > CPUInfo.getFeatures(): [rv64, i, m, a, f, d, c, v, zbb, zbs, zvkn] > TEST PASSED > > > Results with specifying `-XX:+UseZba`(After applying this patch) > > > $ /home/zifeihan/jtreg/bin/jtreg -javaoption:-XX:+UseZba -jdk:/home/zifeihan/jre/jdk /home/zifeihan/jdk/test/lib-test/jdk/test/whitebox/CPUInfoTest.java > > ----------System.out:(4/178)---------- > WB.getCPUFeatures(): "rv64 i m a f d c v zba zbb zbs zvkn" > CPUInfo.getAdditionalCPUInfo(): "" > CPUInfo.getFeatures(): [rv64, i, m, a, f, d, c, v, zba, zbb, zbs, zvkn] > TEST PASSED Looks reasonable to me. ------------- Marked as reviewed by fyang (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18668#pullrequestreview-1985695250 From gcao at openjdk.org Mon Apr 8 08:16:01 2024 From: gcao at openjdk.org (Gui Cao) Date: Mon, 8 Apr 2024 08:16:01 GMT Subject: RFR: 8329823: RISC-V: Need to sync CPU features with related JVM flags In-Reply-To: <7CndD_6EjSJlGUiazMobAPHj2ZOTnMZlQFUDOwv7pKw=.708443c8-9218-4ca1-ae66-e00eb6d8dc53@github.com> References: <7CndD_6EjSJlGUiazMobAPHj2ZOTnMZlQFUDOwv7pKw=.708443c8-9218-4ca1-ae66-e00eb6d8dc53@github.com> Message-ID: <-qch51slhDBjCGxLuhg53wQ_77rIQZ6GwvAVvalmv7E=.b9a620c5-8f23-41b6-a9dd-ee9fbe550602@github.com> On Sun, 7 Apr 2024 08:53:57 GMT, Gui Cao wrote: > Hi, As described by [8329823](https://bugs.openjdk.org/browse/JDK-8329823), currently, "features" string is not accurate in that the RISC-V CPU features/extensions which are disabled by user on the command are still added. We need to synchronize these features with related JVM flags so that "features" string can reflect actual usable CPU features. > > > ### Testing > - [x] Run tier1 tests on SOPHON SG2042 (release) > > Results without specifying any jvm flags(After applying this patch) > > > $ /home/zifeihan/jtreg/bin/jtreg -jdk:/home/zifeihan/jre/jdk /home/zifeihan/jdk/test/lib-test/jdk/test/whitebox/CPUInfoTest.java > > ----------System.out:(4/178)---------- > WB.getCPUFeatures(): "rv64 i m a f d c v zba zbb zbs zvkn" > CPUInfo.getAdditionalCPUInfo(): "" > CPUInfo.getFeatures(): [rv64, i, m, a, f, d, c, v, zba, zbb, zbs, zvkn] > TEST PASSED > > > Results with specifying `-XX:-UseZba`(After applying this patch) > > > $ /home/zifeihan/jtreg/bin/jtreg -javaoption:-XX:-UseZba -jdk:/home/zifeihan/jre/jdk /home/zifeihan/jdk/test/lib-test/jdk/test/whitebox/CPUInfoTest.java > > ----------System.out:(4/158)---------- > ----------System.out:(4/169)---------- > WB.getCPUFeatures(): "rv64 i m a f d c v zbb zbs zvkn" > CPUInfo.getAdditionalCPUInfo(): "" > CPUInfo.getFeatures(): [rv64, i, m, a, f, d, c, v, zbb, zbs, zvkn] > TEST PASSED > 
> > Results with specifying `-XX:+UseZba`(After applying this patch) > > > $ /home/zifeihan/jtreg/bin/jtreg -javaoption:-XX:+UseZba -jdk:/home/zifeihan/jre/jdk /home/zifeihan/jdk/test/lib-test/jdk/test/whitebox/CPUInfoTest.java > > ----------System.out:(4/178)---------- > WB.getCPUFeatures(): "rv64 i m a f d c v zba zbb zbs zvkn" > CPUInfo.getAdditionalCPUInfo(): "" > CPUInfo.getFeatures(): [rv64, i, m, a, f, d, c, v, zba, zbb, zbs, zvkn] > TEST PASSED @robehn : May I ask if this makes sense to you? Thanks. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18668#issuecomment-2042128807 From rehn at openjdk.org Mon Apr 8 08:19:00 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Mon, 8 Apr 2024 08:19:00 GMT Subject: RFR: 8329823: RISC-V: Need to sync CPU features with related JVM flags In-Reply-To: <7CndD_6EjSJlGUiazMobAPHj2ZOTnMZlQFUDOwv7pKw=.708443c8-9218-4ca1-ae66-e00eb6d8dc53@github.com> References: <7CndD_6EjSJlGUiazMobAPHj2ZOTnMZlQFUDOwv7pKw=.708443c8-9218-4ca1-ae66-e00eb6d8dc53@github.com> Message-ID: <9_KZ9c62Qnl8dNAWaplSvaXKTmWNaHcntLwJUGEKOFE=.56e7efd7-4fe1-42fa-a1e2-189f6842bdad@github.com> On Sun, 7 Apr 2024 08:53:57 GMT, Gui Cao wrote: > Hi, As described by [8329823](https://bugs.openjdk.org/browse/JDK-8329823), currently, "features" string is not accurate in that the RISC-V CPU features/extensions which are disabled by user on the command are still added. We need to synchronize these features with related JVM flags so that "features" string can reflect actual usable CPU features. 
> > > ### Testing > - [x] Run tier1 tests on SOPHON SG2042 (release) > > Results without specifying any jvm flags(After applying this patch) > > > $ /home/zifeihan/jtreg/bin/jtreg -jdk:/home/zifeihan/jre/jdk /home/zifeihan/jdk/test/lib-test/jdk/test/whitebox/CPUInfoTest.java > > ----------System.out:(4/178)---------- > WB.getCPUFeatures(): "rv64 i m a f d c v zba zbb zbs zvkn" > CPUInfo.getAdditionalCPUInfo(): "" > CPUInfo.getFeatures(): [rv64, i, m, a, f, d, c, v, zba, zbb, zbs, zvkn] > TEST PASSED > > > Results with specifying `-XX:-UseZba`(After applying this patch) > > > $ /home/zifeihan/jtreg/bin/jtreg -javaoption:-XX:-UseZba -jdk:/home/zifeihan/jre/jdk /home/zifeihan/jdk/test/lib-test/jdk/test/whitebox/CPUInfoTest.java > > ----------System.out:(4/158)---------- > ----------System.out:(4/169)---------- > WB.getCPUFeatures(): "rv64 i m a f d c v zbb zbs zvkn" > CPUInfo.getAdditionalCPUInfo(): "" > CPUInfo.getFeatures(): [rv64, i, m, a, f, d, c, v, zbb, zbs, zvkn] > TEST PASSED > > > Results with specifying `-XX:+UseZba`(After applying this patch) > > > $ /home/zifeihan/jtreg/bin/jtreg -javaoption:-XX:+UseZba -jdk:/home/zifeihan/jre/jdk /home/zifeihan/jdk/test/lib-test/jdk/test/whitebox/CPUInfoTest.java > > ----------System.out:(4/178)---------- > WB.getCPUFeatures(): "rv64 i m a f d c v zba zbb zbs zvkn" > CPUInfo.getAdditionalCPUInfo(): "" > CPUInfo.getFeatures(): [rv64, i, m, a, f, d, c, v, zba, zbb, zbs, zvkn] > TEST PASSED Yes, I think so, thanks. ------------- Marked as reviewed by rehn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18668#pullrequestreview-1985739239 From stefank at openjdk.org Mon Apr 8 08:31:45 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 8 Apr 2024 08:31:45 GMT Subject: RFR: 8328698: oopDesc::klass_raw() decodes without a null check [v2] In-Reply-To: References: Message-ID: > The oopDesc::klass_raw() function is used when the caller wants to skip asserts. 
Unfortunately, it skips the check to see if the narrow klass is zero, which could lead to an incorrect Klass* being returned. This patch fixes this.
> 
> In addition to this, I'm trying to make the code a bit clearer, so the patch also contains changes for the following:
> 
> * The word raw has various different meanings in the context of oops and klasses. So, what does it mean in this context? Does it mean "read the klass pointer value without decoding it"? Or does it mean "decode the klass pointer value without any asserts"? I would like to propose that we use a name that describes that this function is used to skip performing various asserts.
> 
> * I replaced the one usage of load_klass_raw with a call to klass_raw() instead.
> 
> * I restructured the `is_oop_safe` so that we perform the null-check first. Note that `oopDesc::is_oop` performs its own verification of the klass pointer, so if we want extra klass verification in `is_oop_safe` we need to do it before calling the `is_oop` check.
> 
> * I also renamed the _raw functions inside the CompressedKlassPointers class and moved private functions.
> 
> Tell me if you think some of these should be split up into separate RFEs.
> 
> Tested with tier1-3.

Stefan Karlsson has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains two commits:

 - Merge remote-tracking branch 'upstream/master' into 8328698_klass_raw
 - 8328698: oopDesc::klass_raw() decodes without a null check

-------------

Changes: https://git.openjdk.org/jdk/pull/18597/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18597&range=01
  Stats: 93 lines in 10 files changed: 42 ins; 34 del; 17 mod
  Patch: https://git.openjdk.org/jdk/pull/18597.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18597/head:pull/18597

PR: https://git.openjdk.org/jdk/pull/18597

From eosterlund at openjdk.org Mon Apr 8 08:32:01 2024
From: eosterlund at openjdk.org (Erik Österlund)
Date: Mon, 8 Apr 2024 08:32:01 GMT
Subject: RFR: 8329628: Additional changes after JDK-8329332
In-Reply-To:
References:
Message-ID:

On Fri, 5 Apr 2024 23:39:49 GMT, Vladimir Kozlov wrote:

> Additional clean up based on comments (mostly Stefan's) during reviews for [JDK-8329332: Remove CompiledMethod and CodeBlobLayout classes](https://bugs.openjdk.org/browse/JDK-8329332).
> - Renamed `CompiledMethod_lock` to `NMethod_lock`. (I decided to not change JVMTI's `CompiledMethod[Load|Unload]` names).
> - Renamed `NMethodIterator::all_blobs` to `NMethodIterator::all`.
> - Moved `get_deopt_original_pc()` method from `nmethod` to `frame` class.
> - Reverted `CodeCache::find_nmethod()` to previous functionality to allow returning `nullptr` and be consistent with `find_blob()`.
> - Cleaned up some `(nmethod*)` casts.
> - Use `for (CodeHeap* heap : *_nmethod_heaps) ` in `CodeCache::nmethod_count()` (it was @stefank's suggestion, I don't know how this C++ magic works). I verified it running with `-XX:+PrintNMethodStatistics`.
> 
> Testing tier1-3,xcomp,stress

I was never a fan of the CompiledMethod_lock name, as it is quite general but only protects a very specific thing: the state. With the NMethod_lock it gets slightly more awkward since the concurrent GCs already have an "nmethod lock" in the GC data of nmethods.
Could this lock be called NMethodState_lock instead, to more clearly describe what exactly it is about nmethods that it guards? ------------- PR Review: https://git.openjdk.org/jdk/pull/18665#pullrequestreview-1985775840 From jsjolen at openjdk.org Mon Apr 8 09:09:44 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Mon, 8 Apr 2024 09:09:44 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v26] In-Reply-To: References: Message-ID: <2ooPswrPGCSinEGA71UTkeQlhFzcKD4VZROBneIYtvc=.54d0f016-6e3b-484a-b50c-4300117f67f7@github.com> > Hi, > > This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix. > > ## `MemoryFileTracker` > > The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: > > ```c++ > static MemoryFile* make_device(const char* descriptive_name); > static void free_device(MemoryFile* device); > > static void allocate_memory(MemoryFile* device, size_t offset, size_t size, > MEMFLAGS flag, const NativeCallStack& stack); > static void free_memory(MemoryFile* device, size_t offset, size_t size); > > > It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: > > ```c++ > void ZNMT::reserve(zaddress_unsafe start, size_t size) { > MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); > } > void ZNMT::commit(zoffset offset, size_t size) { > MemTracker::allocate_memory_in(ZNMT::_device, static_cast(offset), size, mtJavaHeap, CALLER_PC); > } > void ZNMT::uncommit(zoffset offset, size_t size) { > MemTracker::free_memory_in(ZNMT::_device, 
(size_t)offset, size); > } > > void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { > // NMT doesn't track mappings at the moment. > } > void ZNMT::unmap(zaddress_unsafe addr, size_t size) { > // NMT doesn't track mappings at the moment. > } > > > As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. > > This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: > > 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this... 
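[Editorial sketch, not part of the original message.] The summary-reporting behaviour described above (device-allocated memory is counted as committed under its flag, independent of any reserved region) can be modelled with a toy tracker. Everything here is an assumption made for illustration: `ToyMemoryFile`, `ToyFlag`, the page-granular bookkeeping and the `release` name are hypothetical and are not the API or the treap-based data structure from the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Hypothetical stand-in for HotSpot's MEMFLAGS; only two tags are modelled.
enum class ToyFlag { JavaHeap, Other };

// Toy model of one "device": an address space of its own whose committed
// pages are tagged with a flag. Page-granular tracking is an assumption.
class ToyMemoryFile {
 public:
  static constexpr std::size_t kPage = 4096;

  // Mark [offset, offset + size) as committed under `flag`.
  // Offsets and sizes must be page multiples in this sketch.
  void allocate(std::size_t offset, std::size_t size, ToyFlag flag) {
    assert(offset % kPage == 0 && size % kPage == 0);
    for (std::size_t p = offset / kPage; p < (offset + size) / kPage; ++p) {
      pages_[p] = flag;
    }
  }

  // Remove [offset, offset + size) from the committed set.
  void release(std::size_t offset, std::size_t size) {
    assert(offset % kPage == 0 && size % kPage == 0);
    for (std::size_t p = offset / kPage; p < (offset + size) / kPage; ++p) {
      pages_.erase(p);
    }
  }

  // Summary view: committed bytes attributed to `flag`.
  std::size_t committed(ToyFlag flag) const {
    std::size_t n = 0;
    for (const auto& entry : pages_) {
      if (entry.second == flag) ++n;
    }
    return n * kPage;
  }

 private:
  std::map<std::size_t, ToyFlag> pages_;  // page index -> flag
};
```

A per-page map keeps the sketch short; a real tracker would store ranges rather than individual pages.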
Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision:

  Brace initialize size_t

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/18289/files
  - new: https://git.openjdk.org/jdk/pull/18289/files/a4a8828c..b1c569c5

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=25
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=24-25

  Stats: 4 lines in 1 file changed: 0 ins; 0 del; 4 mod
  Patch: https://git.openjdk.org/jdk/pull/18289.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289

PR: https://git.openjdk.org/jdk/pull/18289

From aboldtch at openjdk.org Mon Apr 8 09:16:18 2024
From: aboldtch at openjdk.org (Axel Boldt-Christmas)
Date: Mon, 8 Apr 2024 09:16:18 GMT
Subject: RFR: 8329839: Cleanup ZPhysicalMemoryBacking trace logging
Message-ID:

On bsd the MB scaling is only performed on the length and not the base offset so the numbers printed are wrong.

On all other platforms the `zoffset` type is used incorrectly and should use `zoffset_end` when printing offsets that point to the end of a range.
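[Editorial sketch, not part of the original message.] The scaling problem described above can be shown in isolation: if only the length is divided by M before printing, the offset is still in bytes but carries an "M" suffix. The function names and the output format below are made up for illustration; the actual format string used by ZPhysicalMemoryBacking is not shown in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

constexpr std::size_t M = 1024 * 1024;

// Buggy shape: the length is scaled to MB, the base offset is left in bytes.
std::string describe_unscaled_offset(std::size_t offset, std::size_t length) {
  return "offset: " + std::to_string(offset) + "M, length: " +
         std::to_string(length / M) + "M";
}

// Consistent shape: both values are scaled before printing.
std::string describe_scaled(std::size_t offset, std::size_t length) {
  return "offset: " + std::to_string(offset / M) + "M, length: " +
         std::to_string(length / M) + "M";
}
```

For a 2M range at offset 4M, the first variant reports an offset of 4194304M while the second reports 4M.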
-------------

Commit messages:
 - 8329839: Cleanup ZPhysicalMemoryBacking trace logging

Changes: https://git.openjdk.org/jdk/pull/18671/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18671&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8329839
  Stats: 6 lines in 3 files changed: 0 ins; 0 del; 6 mod
  Patch: https://git.openjdk.org/jdk/pull/18671.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18671/head:pull/18671

PR: https://git.openjdk.org/jdk/pull/18671

From jsjolen at openjdk.org Mon Apr 8 09:17:12 2024
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Mon, 8 Apr 2024 09:17:12 GMT
Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v5]
In-Reply-To: <3al4DjsRcIX_qJZNbTGqBDIAOj4bU5l8xpYPHQE8cNM=.7cc0bdfe-c9c8-46ce-ad42-397c61b5a603@github.com>
References: <-XAziSwGMo20pUAnbdRW1JUk_0ZB-80RVfAHr0iuewE=.bff8f2f7-01e2-46eb-bd4b-1b16fccc6aa1@github.com> <3al4DjsRcIX_qJZNbTGqBDIAOj4bU5l8xpYPHQE8cNM=.7cc0bdfe-c9c8-46ce-ad42-397c61b5a603@github.com>
Message-ID:

On Fri, 22 Mar 2024 13:24:58 GMT, Thomas Stuefe wrote:

>> Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision:
>> 
>> Include os.inline.hpp
> 
> src/hotspot/share/nmt/vmatree.hpp line 46:
> 
>> 44: 
>> 45: // Each node has some stack and a flag associated with it.
>> 46: struct Metadata {
> 
> all members const?

Can't be const unless we want merge to be a function.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1555494950

From stefank at openjdk.org Mon Apr 8 09:24:09 2024
From: stefank at openjdk.org (Stefan Karlsson)
Date: Mon, 8 Apr 2024 09:24:09 GMT
Subject: RFR: 8329839: Cleanup ZPhysicalMemoryBacking trace logging
In-Reply-To:
References:
Message-ID:

On Mon, 8 Apr 2024 09:12:33 GMT, Axel Boldt-Christmas wrote:

> On bsd the MB scaling is only performed on the length and not the base offset so the numbers printed are wrong.
> > On all other platforms the `zoffset` type is used incorrectly and should use `zoffset_end` when printing offsets that point to the end of a range. Marked as reviewed by stefank (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18671#pullrequestreview-1985891086 From stefank at openjdk.org Mon Apr 8 09:32:14 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 8 Apr 2024 09:32:14 GMT Subject: RFR: 8328698: oopDesc::klass_raw() decodes without a null check [v2] In-Reply-To: References: Message-ID: On Mon, 8 Apr 2024 08:31:45 GMT, Stefan Karlsson wrote: >> The oopDesc::klass_raw() function is used when the caller wants to skip asserts. Unfortunately, it skips the the check to see if the narrow klass is zero, which could lead to an incorrect Klass* being returned. This patch fixes this. >> >> In addition to this, I'm trying to make the code a bit clearer, so the patch also contains changes for the following: >> >> * The word raw has various different meaning in the context of oops and klasses. So, what does it mean in this context? Does it mean "read the klass pointer value without decoding it"? Or does it mean "decode the klass pointer value without any asserts"? I would like to propose that we use a name that describes that this function is used to skip performing various asserts. >> >> * I replaced the one usage of load_klass_raw with a call to klass_raw() instead. >> >> * I restructured the `is_oop_safe` so that we perform the null-check first. Note that `oopDesc::is_oop` performs its own verification of the klass pointer, so if we want extra klass verification in `is_oop_safe` we need to do it before calling the `is_oop` check. >> >> * I also renamed the _raw functions inside the CompressedKlassPointers klass and moved private functions. >> >> Tell me if you think some of these should be split up into separate RFEs. >> >> Tested with tier1-3. 
> > Stefan Karlsson has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains two commits: > > - Merge remote-tracking branch 'upstream/master' into 8328698_klass_raw > - 8328698: oopDesc::klass_raw() decodes without a null check Thanks for the reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/18597#issuecomment-2042283915 From stefank at openjdk.org Mon Apr 8 09:32:14 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 8 Apr 2024 09:32:14 GMT Subject: Integrated: 8328698: oopDesc::klass_raw() decodes without a null check In-Reply-To: References: Message-ID: On Wed, 3 Apr 2024 09:27:16 GMT, Stefan Karlsson wrote: > The oopDesc::klass_raw() function is used when the caller wants to skip asserts. Unfortunately, it skips the the check to see if the narrow klass is zero, which could lead to an incorrect Klass* being returned. This patch fixes this. > > In addition to this, I'm trying to make the code a bit clearer, so the patch also contains changes for the following: > > * The word raw has various different meaning in the context of oops and klasses. So, what does it mean in this context? Does it mean "read the klass pointer value without decoding it"? Or does it mean "decode the klass pointer value without any asserts"? I would like to propose that we use a name that describes that this function is used to skip performing various asserts. > > * I replaced the one usage of load_klass_raw with a call to klass_raw() instead. > > * I restructured the `is_oop_safe` so that we perform the null-check first. Note that `oopDesc::is_oop` performs its own verification of the klass pointer, so if we want extra klass verification in `is_oop_safe` we need to do it before calling the `is_oop` check. > > * I also renamed the _raw functions inside the CompressedKlassPointers klass and moved private functions. > > Tell me if you think some of these should be split up into separate RFEs. 
> > Tested with tier1-3. This pull request has now been integrated. Changeset: 6f087cbc Author: Stefan Karlsson URL: https://git.openjdk.org/jdk/commit/6f087cbcd5c8c91eb104c6e4297f485dd1a82229 Stats: 93 lines in 10 files changed: 42 ins; 34 del; 17 mod 8328698: oopDesc::klass_raw() decodes without a null check Reviewed-by: ayang, tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/18597 From rpressler at openjdk.org Mon Apr 8 09:56:11 2024 From: rpressler at openjdk.org (Ron Pressler) Date: Mon, 8 Apr 2024 09:56:11 GMT Subject: RFR: 8325469: Freeze/Thaw code can crash in the presence of OSR frames In-Reply-To: References: Message-ID: <1m_bLGdIy6KV_wwHNjtgC_ERSAn_5XRCffBH8jLOIU0=.b18316c3-1a76-4909-b0f7-7bc7f137e253@github.com> On Thu, 4 Apr 2024 19:52:18 GMT, Patricio Chilano Mateo wrote: > Freeze/thaw code assumes that a compiled frame for a method where num_stack_arg_slots() > 0 will always have the arguments setup above the metadata at the bottom of the frame. But when converting an interpreter frame to a compiled frame during OSR we don't explicitly leave room for the stack arguments after popping the interpreter frame. All parameters needed will be read from the "buf" array and stored?inside the frame before calling OSR_migration_end(). > > This mismatch in how the stack looks and what we assume can lead to different crashes. In particular the issue happens when the OSR conversion happens for the bottom-most frame in the stack. If the OSR frame has a caller in the stack then there is no issue on freezing/thawing. I added more details about this in the bug comments. > > When the OSR conversion happens for the bottom-most frame then a future freeze/thaw can lead to crashes for all cases: freeze_fast/thaw_fast, freeze_fast/thaw_slow, freeze_slow/thaw_slow. When freezing fast, either thawing fast or slow can lead to trying to read past the bottom of the stackChunk or writing below the allocated space in the stack. 
The freeze slow case is almost okay, except that it uncovered an invalid assert that is triggered if the size of the OSR frame plus all the other frames we freeze takes less space than the size of locals minus parameters of the interpreter frame that was OSR. I also added more details about these in the bug comments. > > I tested different fixes, but I think the most straightforward one is to add _num_stack_arg_slots in the nmethod class and initialize it accordingly depending on whether the nmethod is an OSR one or not. > > The patch includes a new test that exercises all these possible combinations of OSR frame at bottom of stack or not, and then freezing fast/slow and thawing fast/slow. The bottom case where we freeze fast and thaw slow reproduces the originally reported crash. There are actually two different failure modes depending of whether this is a thaw top or return barrier case. The other bottom cases lead to the other crashes described in the bug comments. > The new test uncover another bug besides the OSR issues, but since it's a different one I filed a separate JBS issue (JDK-8329665) and I made this a dependent PR. > > I tested the current patch with the new test and also run it through mach5 tiers1-6. > > Thanks, > Patricio src/hotspot/share/code/nmethod.cpp line 805: > 803: init_defaults(); > 804: _entry_bci = entry_bci; > 805: _num_stack_arg_slots = entry_bci != InvocationEntryBci ? 0 : _method->constMethod()->num_stack_arg_slots(); If I understand correctly, is the condition on this line the actual fix? test/jdk/jdk/internal/vm/Continuation/OSRTest.java line 77: > 75: cont.run(); > 76: if (freezeFast && !thawFast && fooCallCount == 2) { > 77: // All frames freezed in last yield should be compiled freezed -> frozen test/jdk/jdk/internal/vm/Continuation/OSRTest.java line 131: > 129: for (int i = 0; i < 500_000 * fooCallCount; i++) { > 130: } > 131: fooCallCount++; Perhaps use WhiteBox to check if we're OSRed? 
test/jdk/jdk/internal/vm/Continuation/OSRTest.java line 166: > 164: for (int i = 0; i < 5_000_000 * fooCallCount; i++) { > 165: } > 166: fooCallCount++; Ditto. Perhaps use WhiteBox to check if we're OSRed? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18637#discussion_r1554275189 PR Review Comment: https://git.openjdk.org/jdk/pull/18637#discussion_r1554277855 PR Review Comment: https://git.openjdk.org/jdk/pull/18637#discussion_r1554281175 PR Review Comment: https://git.openjdk.org/jdk/pull/18637#discussion_r1554286381 From jsjolen at openjdk.org Mon Apr 8 10:11:26 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Mon, 8 Apr 2024 10:11:26 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v27] In-Reply-To: References: Message-ID: > Hi, > > This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix. 
> > ## `MemoryFileTracker` > > The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: > > ```c++ > static MemoryFile* make_device(const char* descriptive_name); > static void free_device(MemoryFile* device); > > static void allocate_memory(MemoryFile* device, size_t offset, size_t size, > MEMFLAGS flag, const NativeCallStack& stack); > static void free_memory(MemoryFile* device, size_t offset, size_t size); > > > It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: > > ```c++ > void ZNMT::reserve(zaddress_unsafe start, size_t size) { > MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); > } > void ZNMT::commit(zoffset offset, size_t size) { > MemTracker::allocate_memory_in(ZNMT::_device, static_cast(offset), size, mtJavaHeap, CALLER_PC); > } > void ZNMT::uncommit(zoffset offset, size_t size) { > MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); > } > > void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { > // NMT doesn't track mappings at the moment. > } > void ZNMT::unmap(zaddress_unsafe addr, size_t size) { > // NMT doesn't track mappings at the moment. > } > > > As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. > > This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: > > 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. 
Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this... Johan Sj?len has updated the pull request incrementally with two additional commits since the last revision: - Shorten addresses - Update names ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18289/files - new: https://git.openjdk.org/jdk/pull/18289/files/b1c569c5..ec6d2788 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=26 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=25-26 Stats: 38 lines in 3 files changed: 10 ins; 1 del; 27 mod Patch: https://git.openjdk.org/jdk/pull/18289.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289 PR: https://git.openjdk.org/jdk/pull/18289 From duke at openjdk.org Mon Apr 8 11:58:22 2024 From: duke at openjdk.org (kuaiwei) Date: Mon, 8 Apr 2024 11:58:22 GMT Subject: RFR: 8325821: [REDO] use "dmb.ishst+dmb.ishld" for release barrier [v3] In-Reply-To: References: Message-ID: > The origin patch for https://bugs.openjdk.org/browse/JDK-8324186 has 2 issues: > 1 It show regression in some platform, like Apple silicon in mac os > 2 Can not handle instruction sequence like "dmb.ishld; dmb.ishst; dmb.ishld; dmb.ishld" > > It can be fixed by: > 1 Enable AlwaysMergeDMB by default, only disable it in architecture we can see performance improvement (N1 or N2) > 2 Check the special pattern and merge the subsequent dmb. > > It also fix a bug when code buffer is expanding, st/ld/dmb can not be merged. I added unit tests for these. > > This patch still has a unhandled case. Insts like "dmb.ishld; dmb.ishst; dmb.ish", it will merge the last 2 instructions and can not merge all three. Because when emitting dmb.ish, if merge all previous dmbs, the code buffer will shrink the size. I think it may break some resumption and think it's not a common pattern. 
> > - Update: > After discussion, I made a new implementation based on finite state machine for merging instruction. The mergeable instruction will be pending in fsm until next unmergeable instruction. kuaiwei has updated the pull request incrementally with one additional commit since the last revision: Fix cross build error ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18467/files - new: https://git.openjdk.org/jdk/pull/18467/files/8ae4496e..fe4f4f20 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18467&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18467&range=01-02 Stats: 5 lines in 3 files changed: 1 ins; 3 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18467.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18467/head:pull/18467 PR: https://git.openjdk.org/jdk/pull/18467 From fbredberg at openjdk.org Mon Apr 8 11:59:11 2024 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Mon, 8 Apr 2024 11:59:11 GMT Subject: RFR: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 In-Reply-To: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> References: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> Message-ID: On Fri, 22 Mar 2024 06:26:03 GMT, David Holmes wrote: > The crux of the problem here is that the virtual thread code was not keeping the held-monitor-count and jni-monitor-count in sync under all conditions. So if a vthread acquired a monitor via JNI but failed to unlock it before terminating, the underlying platform thread's counts were out of sync and if it terminated we would trigger the assertion that checks for such things. > > The actual fix is very simple: we zero the platform thread's jni-monitor-count in `continuation_enter_cleanup` the same way we zero the held-monitor-count. 
In addition we apply the same `CheckJNICalls` check for this unbalanced locking and issue a warning in the virtual thread case. That fact this happens in asm code complicates matters. > > The existing `JNIMonitor.java` test is greatly expanded to test these scenarios and check the unified logging output. > > Other minor changes involve expanding some of the other assertions relating to the two counts so we can detected a mismatch earlier without a need for the thread to terminate. And the test that original uncovered the problem (`GetOwnedMonitorInfoTest.java`) has some minor adjustments to enhance diagnostics. > > I've provided the fix for all architectures that support continuations: x64, aarch64, riscv and ppc. The latter both build okay in GHA but I can't actually test them with the updated test. So some assistance from RISCV folk (@robehn ?) and PPC folk (??) would be appreciated (otherwise any issues will have to be handled as follow up fixes > > The changes are structured so that there is no extra code executed in product builds unless `CheckJNICalls` is set. This means that product builds will not keep the JNI count in sync with the held count, unless `CheckJNICalls` is set. This could trip up a future logging entry or explicit check of the JNI count, but it is expected that these counts will be removed once ObjectMonitor usage will not force virtual thread pinning. > > Testing: > - regression test 10x on all x64 and aarch64 platforms > - tiers 1-4 > - GHA > > > Thanks to @pchilano for help working out the best form of the fix and the initial asm for x64. > > Thanks to @fbredber for the Aarch64 and RISCV asm code. 
> > Thanks src/hotspot/cpu/ppc/sharedRuntime_ppc.cpp line 1666: > 1664: Label L_no_warn; > 1665: __ lwz(R0, in_bytes(JavaThread::jni_monitor_count_offset()), R16_thread); > 1666: __ cmpwi(CCR0, R0, 0); Change to: __ ld(R0, in_bytes(JavaThread::jni_monitor_count_offset()), R16_thread); __ cmpdi(CCR0, R0, 0); Since `_jni_monitor_count` is a double word on PPC64. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18445#discussion_r1555696893 From dholmes at openjdk.org Mon Apr 8 12:35:44 2024 From: dholmes at openjdk.org (David Holmes) Date: Mon, 8 Apr 2024 12:35:44 GMT Subject: RFR: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 [v2] In-Reply-To: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> References: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> Message-ID: > The crux of the problem here is that the virtual thread code was not keeping the held-monitor-count and jni-monitor-count in sync under all conditions. So if a vthread acquired a monitor via JNI but failed to unlock it before terminating, the underlying platform thread's counts were out of sync and if it terminated we would trigger the assertion that checks for such things. > > The actual fix is very simple: we zero the platform thread's jni-monitor-count in `continuation_enter_cleanup` the same way we zero the held-monitor-count. In addition we apply the same `CheckJNICalls` check for this unbalanced locking and issue a warning in the virtual thread case. That fact this happens in asm code complicates matters. > > The existing `JNIMonitor.java` test is greatly expanded to test these scenarios and check the unified logging output. > > Other minor changes involve expanding some of the other assertions relating to the two counts so we can detected a mismatch earlier without a need for the thread to terminate. 
And the test that original uncovered the problem (`GetOwnedMonitorInfoTest.java`) has some minor adjustments to enhance diagnostics. > > I've provided the fix for all architectures that support continuations: x64, aarch64, riscv and ppc. The latter both build okay in GHA but I can't actually test them with the updated test. So some assistance from RISCV folk (@robehn ?) and PPC folk (??) would be appreciated (otherwise any issues will have to be handled as follow up fixes > > The changes are structured so that there is no extra code executed in product builds unless `CheckJNICalls` is set. This means that product builds will not keep the JNI count in sync with the held count, unless `CheckJNICalls` is set. This could trip up a future logging entry or explicit check of the JNI count, but it is expected that these counts will be removed once ObjectMonitor usage will not force virtual thread pinning. > > Testing: > - regression test 10x on all x64 and aarch64 platforms > - tiers 1-4 > - GHA > > > Thanks to @pchilano for help working out the best form of the fix and the initial asm for x64. > > Thanks to @fbredber for the Aarch64 and RISCV asm code. 
> > Thanks David Holmes has updated the pull request incrementally with two additional commits since the last revision: - s/lw/lwu to zero extend flags value - s/lwz/ld/ ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18445/files - new: https://git.openjdk.org/jdk/pull/18445/files/32a62a27..bfe62751 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18445&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18445&range=00-01 Stats: 5 lines in 2 files changed: 0 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/18445.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18445/head:pull/18445 PR: https://git.openjdk.org/jdk/pull/18445 From rehn at openjdk.org Mon Apr 8 12:35:44 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Mon, 8 Apr 2024 12:35:44 GMT Subject: RFR: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 [v2] In-Reply-To: References: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> Message-ID: On Mon, 8 Apr 2024 12:32:23 GMT, David Holmes wrote: >> The crux of the problem here is that the virtual thread code was not keeping the held-monitor-count and jni-monitor-count in sync under all conditions. So if a vthread acquired a monitor via JNI but failed to unlock it before terminating, the underlying platform thread's counts were out of sync and if it terminated we would trigger the assertion that checks for such things. >> >> The actual fix is very simple: we zero the platform thread's jni-monitor-count in `continuation_enter_cleanup` the same way we zero the held-monitor-count. In addition we apply the same `CheckJNICalls` check for this unbalanced locking and issue a warning in the virtual thread case. That fact this happens in asm code complicates matters. >> >> The existing `JNIMonitor.java` test is greatly expanded to test these scenarios and check the unified logging output. 
>> >> Other minor changes involve expanding some of the other assertions relating to the two counts so we can detected a mismatch earlier without a need for the thread to terminate. And the test that original uncovered the problem (`GetOwnedMonitorInfoTest.java`) has some minor adjustments to enhance diagnostics. >> >> I've provided the fix for all architectures that support continuations: x64, aarch64, riscv and ppc. The latter both build okay in GHA but I can't actually test them with the updated test. So some assistance from RISCV folk (@robehn ?) and PPC folk (??) would be appreciated (otherwise any issues will have to be handled as follow up fixes >> >> The changes are structured so that there is no extra code executed in product builds unless `CheckJNICalls` is set. This means that product builds will not keep the JNI count in sync with the held count, unless `CheckJNICalls` is set. This could trip up a future logging entry or explicit check of the JNI count, but it is expected that these counts will be removed once ObjectMonitor usage will not force virtual thread pinning. >> >> Testing: >> - regression test 10x on all x64 and aarch64 platforms >> - tiers 1-4 >> - GHA >> >> >> Thanks to @pchilano for help working out the best form of the fix and the initial asm for x64. >> >> Thanks to @fbredber for the Aarch64 and RISCV asm code. >> >> Thanks > > David Holmes has updated the pull request incrementally with two additional commits since the last revision: > > - s/lw/lwu to zero extend flags value > - s/lwz/ld/ Marked as reviewed by rehn (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/18445#pullrequestreview-1986268678 From rehn at openjdk.org Mon Apr 8 12:35:44 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Mon, 8 Apr 2024 12:35:44 GMT Subject: RFR: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 In-Reply-To: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> References: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> Message-ID: On Fri, 22 Mar 2024 06:26:03 GMT, David Holmes wrote: > The crux of the problem here is that the virtual thread code was not keeping the held-monitor-count and jni-monitor-count in sync under all conditions. So if a vthread acquired a monitor via JNI but failed to unlock it before terminating, the underlying platform thread's counts were out of sync and if it terminated we would trigger the assertion that checks for such things. > > The actual fix is very simple: we zero the platform thread's jni-monitor-count in `continuation_enter_cleanup` the same way we zero the held-monitor-count. In addition we apply the same `CheckJNICalls` check for this unbalanced locking and issue a warning in the virtual thread case. That fact this happens in asm code complicates matters. > > The existing `JNIMonitor.java` test is greatly expanded to test these scenarios and check the unified logging output. > > Other minor changes involve expanding some of the other assertions relating to the two counts so we can detected a mismatch earlier without a need for the thread to terminate. And the test that original uncovered the problem (`GetOwnedMonitorInfoTest.java`) has some minor adjustments to enhance diagnostics. > > I've provided the fix for all architectures that support continuations: x64, aarch64, riscv and ppc. The latter both build okay in GHA but I can't actually test them with the updated test. 
So some assistance from RISCV folk (@robehn ?) and PPC folk (??) would be appreciated (otherwise any issues will have to be handled as follow up fixes > > The changes are structured so that there is no extra code executed in product builds unless `CheckJNICalls` is set. This means that product builds will not keep the JNI count in sync with the held count, unless `CheckJNICalls` is set. This could trip up a future logging entry or explicit check of the JNI count, but it is expected that these counts will be removed once ObjectMonitor usage will not force virtual thread pinning. > > Testing: > - regression test 10x on all x64 and aarch64 platforms > - tiers 1-4 > - GHA > > > Thanks to @pchilano for help working out the best form of the fix and the initial asm for x64. > > Thanks to @fbredber for the Aarch64 and RISCV asm code. > > Thanks A question, as I have interpreted this comment: // 6282335 JNI DetachCurrentThread spec states that all Java monitors // held by this thread must be released. The spec does not distinguish // between JNI-acquired and regular Java monitors. We can only see // regular Java monitors here if monitor enter-exit matching is broken. Before the monitor _count_ we did not know if we held any JNI locks. With the count it is possible to iterate the JNI locks if the count is non-zero and unlock them. Thus implement what this comment says we are missing, and we always exit with 0 count. A: Should we add that? (normal thread exit/detach) B: Should virtual threads also do that? We could always call a "SharedRuntime::log_AND_UNLOCK_jni_monitor_still_held" here. Ignoring the above, looks good, and passed initial testing.
(I'll do some more, no need to wait for that) ------------- PR Comment: https://git.openjdk.org/jdk/pull/18445#issuecomment-2042611144 From dholmes at openjdk.org Mon Apr 8 12:35:44 2024 From: dholmes at openjdk.org (David Holmes) Date: Mon, 8 Apr 2024 12:35:44 GMT Subject: RFR: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 In-Reply-To: References: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> Message-ID: <4iRyNpHaGi7R_hbdG3ByuOryQr6M1qXeFjNsxI4__Y0=.6a96250a-43e5-4da1-b145-b4c1b9706758@github.com> On Mon, 8 Apr 2024 12:23:48 GMT, Robbin Ehn wrote: > Thus implement what this comment says we are missing, @robehn the comment isn't saying that we are missing anything. Only JNI DetachThread specifies that it will release all held monitors. When a thread terminates (platform or virtual) there is no specification to say it should also release any still held monitors - so we don't. We find all locked monitors from the in-use monitor list, we don't rely on the counters for that. Thanks for the testing and review. 
I just pushed a small update for RISC ------------- PR Comment: https://git.openjdk.org/jdk/pull/18445#issuecomment-2042627732 From dholmes at openjdk.org Mon Apr 8 12:35:44 2024 From: dholmes at openjdk.org (David Holmes) Date: Mon, 8 Apr 2024 12:35:44 GMT Subject: RFR: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 [v2] In-Reply-To: References: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> Message-ID: On Mon, 8 Apr 2024 11:56:33 GMT, Fredrik Bredberg wrote: >> David Holmes has updated the pull request incrementally with two additional commits since the last revision: >> >> - s/lw/lwu to zero extend flags value >> - s/lwz/ld/ > > src/hotspot/cpu/ppc/sharedRuntime_ppc.cpp line 1666: > >> 1664: Label L_no_warn; >> 1665: __ lwz(R0, in_bytes(JavaThread::jni_monitor_count_offset()), R16_thread); >> 1666: __ cmpwi(CCR0, R0, 0); > > Change to: > > __ ld(R0, in_bytes(JavaThread::jni_monitor_count_offset()), R16_thread); > __ cmpdi(CCR0, R0, 0); > > Since `_jni_monitor_count` is a double word on PPC64. Fixed - there and elsewhere. Thanks ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18445#discussion_r1555728171 From fyang at openjdk.org Mon Apr 8 12:35:44 2024 From: fyang at openjdk.org (Fei Yang) Date: Mon, 8 Apr 2024 12:35:44 GMT Subject: RFR: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 [v2] In-Reply-To: References: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> Message-ID: On Mon, 8 Apr 2024 12:32:23 GMT, David Holmes wrote: >> The crux of the problem here is that the virtual thread code was not keeping the held-monitor-count and jni-monitor-count in sync under all conditions. 
So if a vthread acquired a monitor via JNI but failed to unlock it before terminating, the underlying platform thread's counts were out of sync and if it terminated we would trigger the assertion that checks for such things. >> >> The actual fix is very simple: we zero the platform thread's jni-monitor-count in `continuation_enter_cleanup` the same way we zero the held-monitor-count. In addition we apply the same `CheckJNICalls` check for this unbalanced locking and issue a warning in the virtual thread case. That fact this happens in asm code complicates matters. >> >> The existing `JNIMonitor.java` test is greatly expanded to test these scenarios and check the unified logging output. >> >> Other minor changes involve expanding some of the other assertions relating to the two counts so we can detected a mismatch earlier without a need for the thread to terminate. And the test that original uncovered the problem (`GetOwnedMonitorInfoTest.java`) has some minor adjustments to enhance diagnostics. >> >> I've provided the fix for all architectures that support continuations: x64, aarch64, riscv and ppc. The latter both build okay in GHA but I can't actually test them with the updated test. So some assistance from RISCV folk (@robehn ?) and PPC folk (??) would be appreciated (otherwise any issues will have to be handled as follow up fixes >> >> The changes are structured so that there is no extra code executed in product builds unless `CheckJNICalls` is set. This means that product builds will not keep the JNI count in sync with the held count, unless `CheckJNICalls` is set. This could trip up a future logging entry or explicit check of the JNI count, but it is expected that these counts will be removed once ObjectMonitor usage will not force virtual thread pinning. >> >> Testing: >> - regression test 10x on all x64 and aarch64 platforms >> - tiers 1-4 >> - GHA >> >> >> Thanks to @pchilano for help working out the best form of the fix and the initial asm for x64. 
>> >> Thanks to @fbredber for the Aarch64 and RISCV asm code. >> >> Thanks > > David Holmes has updated the pull request incrementally with two additional commits since the last revision: > > - s/lw/lwu to zero extend flags value > - s/lwz/ld/ src/hotspot/cpu/riscv/sharedRuntime_riscv.cpp line 912: > 910: // Check if this is a virtual thread continuation > 911: Label L_skip_vthread_code; > 912: __ lw(t0, Address(sp, ContinuationEntry::flags_offset())); @fbredber : Nit: maybe it's better to use `lwu` here instead of `lw` to load the 32-bit flags? `lw` would do sign-extension for the upper 32 bits which I don't think is wanted in case when we have more meaningful bits in flags. While `lwu` simply zeros the upper 32 bits. src/hotspot/cpu/riscv/sharedRuntime_riscv.cpp line 939: > 937: // Check if this is a virtual thread continuation > 938: Label L_skip_vthread_code; > 939: __ lw(t0, Address(sp, ContinuationEntry::flags_offset())); Similar here. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18445#discussion_r1555721741 PR Review Comment: https://git.openjdk.org/jdk/pull/18445#discussion_r1555722171 From rehn at openjdk.org Mon Apr 8 12:35:44 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Mon, 8 Apr 2024 12:35:44 GMT Subject: RFR: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 [v2] In-Reply-To: References: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> Message-ID: On Mon, 8 Apr 2024 12:17:10 GMT, Fei Yang wrote: >> David Holmes has updated the pull request incrementally with two additional commits since the last revision: >> >> - s/lw/lwu to zero extend flags value >> - s/lwz/ld/ > > src/hotspot/cpu/riscv/sharedRuntime_riscv.cpp line 912: > >> 910: // Check if this is a virtual thread continuation >> 911: Label L_skip_vthread_code; >> 912: __ lw(t0, Address(sp, ContinuationEntry::flags_offset())); > > @fbredber : Nit: maybe it's 
better to use `lwu` here instead of `lw` to load the 32-bit flags? `lw` would do sign-extension for the upper 32 bits which I don't think is wanted in case when we have more meaningful bits in flags. While `lwu` simply zeros the upper 32 bits. Good catch, thanks! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18445#discussion_r1555732702 From fbredberg at openjdk.org Mon Apr 8 12:35:44 2024 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Mon, 8 Apr 2024 12:35:44 GMT Subject: RFR: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 [v2] In-Reply-To: References: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> Message-ID: <4aXOUEU0wOG3lZ5tNV4hUDIubLw3ivTkzi7H-dhdXlQ=.dac728ae-cd56-4bd0-8f1d-303f5b68f4f6@github.com> On Mon, 8 Apr 2024 12:24:30 GMT, Robbin Ehn wrote: >> src/hotspot/cpu/riscv/sharedRuntime_riscv.cpp line 912: >> >>> 910: // Check if this is a virtual thread continuation >>> 911: Label L_skip_vthread_code; >>> 912: __ lw(t0, Address(sp, ContinuationEntry::flags_offset())); >> >> @fbredber : Nit: maybe it's better to use `lwu` here instead of `lw` to load the 32-bit flags? `lw` would do sign-extension for the upper 32 bits which I don't think is wanted in case when we have more meaningful bits in flags. While `lwu` simply zeros the upper 32 bits. > > Good catch, thanks! I agree. Sorry, my bad. 
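
For readers following the `lw` versus `lwu` point, here is a minimal C++ sketch of the difference. It only models the two RISC-V load semantics on a 64-bit register; the helper names `load_word_signed` and `load_word_zero_extended` are made up for illustration and are not HotSpot code.

```cpp
#include <cstdint>
#include <cstring>

// Models RISC-V `lw` on RV64: load 32 bits, then sign-extend to 64 bits.
// (Hypothetical helper, for illustration only.)
inline uint64_t load_word_signed(const void* p) {
  int32_t v;
  std::memcpy(&v, p, sizeof v);
  return static_cast<uint64_t>(static_cast<int64_t>(v));
}

// Models RISC-V `lwu` on RV64: load 32 bits, then zero the upper 32 bits.
// (Hypothetical helper, for illustration only.)
inline uint64_t load_word_zero_extended(const void* p) {
  uint32_t v;
  std::memcpy(&v, p, sizeof v);
  return static_cast<uint64_t>(v);
}
```

With a flags value whose bit 31 is set, the sign-extending load fills the upper half of the register with ones, while the zero-extending load leaves it clear. For small flag values the two agree, which is presumably why the original `lw` did not misbehave in testing.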
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18445#discussion_r1555735361 From dholmes at openjdk.org Mon Apr 8 12:35:45 2024 From: dholmes at openjdk.org (David Holmes) Date: Mon, 8 Apr 2024 12:35:45 GMT Subject: RFR: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 [v2] In-Reply-To: References: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> Message-ID: <3B289Vs124poh7kI2n_GSf8NL4SKbEkptEMb2e6tAME=.00a57598-a9ad-4a02-8517-62fb935956b2@github.com> On Mon, 8 Apr 2024 12:17:10 GMT, Fei Yang wrote: >> David Holmes has updated the pull request incrementally with two additional commits since the last revision: >> >> - s/lw/lwu to zero extend flags value >> - s/lwz/ld/ > > src/hotspot/cpu/riscv/sharedRuntime_riscv.cpp line 912: > >> 910: // Check if this is a virtual thread continuation >> 911: Label L_skip_vthread_code; >> 912: __ lw(t0, Address(sp, ContinuationEntry::flags_offset())); > > @fbredber : Nit: maybe it's better to use `lwu` here instead of `lw` to load the 32-bit flags? `lw` would do sign-extension for the upper 32 bits which I don't think is wanted in case when we have more meaningful bits in flags. While `lwu` simply zeros the upper 32 bits. Thanks for the suggestion @RealFYang I have changed it. > src/hotspot/cpu/riscv/sharedRuntime_riscv.cpp line 939: > >> 937: // Check if this is a virtual thread continuation >> 938: Label L_skip_vthread_code; >> 939: __ lw(t0, Address(sp, ContinuationEntry::flags_offset())); > > Similar here. 
Fixed ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18445#discussion_r1555735546 PR Review Comment: https://git.openjdk.org/jdk/pull/18445#discussion_r1555736056 From ayang at openjdk.org Mon Apr 8 13:21:10 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 8 Apr 2024 13:21:10 GMT Subject: RFR: 8329629: GC interfaces should work directly against nmethod instead of CodeBlob In-Reply-To: References: Message-ID: On Fri, 5 Apr 2024 12:32:30 GMT, Stefan Karlsson wrote: > The GCs scan and handle nmethods and ignore CodeBlobs of other kinds. I propose that we stop sending in CodeBlobs to the GCs and make sure to only give them nmethods. > > I removed `void CodeCache::blobs_do(CodeBlobClosure* f)` since there's no more usage of that function. Is this OK? > > I also opted to skip calling the GC verification code from the iterator code: > > Universe::heap()->verify_nmethod((nmethod*)cb); > > IMHO, I think it is up to the GCs to decide if they want to perform extra nmethod verification. If someone wants to keep this verification in their favorite GC I can add calls to this function where we used to call CodeCache::blobs_do. > > I've only done limited testing and will run extensive testing concurrently with the review. Marked as reviewed by ayang (Reviewer).
------------- PR Review: https://git.openjdk.org/jdk/pull/18653#pullrequestreview-1986431388 From fbredberg at openjdk.org Mon Apr 8 13:24:13 2024 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Mon, 8 Apr 2024 13:24:13 GMT Subject: RFR: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 [v2] In-Reply-To: References: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> Message-ID: On Mon, 8 Apr 2024 12:35:44 GMT, David Holmes wrote: >> The crux of the problem here is that the virtual thread code was not keeping the held-monitor-count and jni-monitor-count in sync under all conditions. So if a vthread acquired a monitor via JNI but failed to unlock it before terminating, the underlying platform thread's counts were out of sync and if it terminated we would trigger the assertion that checks for such things. >> >> The actual fix is very simple: we zero the platform thread's jni-monitor-count in `continuation_enter_cleanup` the same way we zero the held-monitor-count. In addition we apply the same `CheckJNICalls` check for this unbalanced locking and issue a warning in the virtual thread case. That fact this happens in asm code complicates matters. >> >> The existing `JNIMonitor.java` test is greatly expanded to test these scenarios and check the unified logging output. >> >> Other minor changes involve expanding some of the other assertions relating to the two counts so we can detected a mismatch earlier without a need for the thread to terminate. And the test that original uncovered the problem (`GetOwnedMonitorInfoTest.java`) has some minor adjustments to enhance diagnostics. >> >> I've provided the fix for all architectures that support continuations: x64, aarch64, riscv and ppc. The latter both build okay in GHA but I can't actually test them with the updated test. So some assistance from RISCV folk (@robehn ?) and PPC folk (??) 
would be appreciated (otherwise any issues will have to be handled as follow up fixes >> >> The changes are structured so that there is no extra code executed in product builds unless `CheckJNICalls` is set. This means that product builds will not keep the JNI count in sync with the held count, unless `CheckJNICalls` is set. This could trip up a future logging entry or explicit check of the JNI count, but it is expected that these counts will be removed once ObjectMonitor usage will not force virtual thread pinning. >> >> Testing: >> - regression test 10x on all x64 and aarch64 platforms >> - tiers 1-4 >> - GHA >> >> >> Thanks to @pchilano for help working out the best form of the fix and the initial asm for x64. >> >> Thanks to @fbredber for the Aarch64 and RISCV asm code. >> >> Thanks > > David Holmes has updated the pull request incrementally with two additional commits since the last revision: > > - s/lw/lwu to zero extend flags value > - s/lwz/ld/ Changes requested by fbredberg (Committer). src/hotspot/cpu/ppc/sharedRuntime_ppc.cpp line 1660: > 1658: // Check if this is a virtual thread continuation > 1659: Label L_skip_vthread_code; > 1660: __ ld(R0, in_bytes(ContinuationEntry::flags_offset()), R1_SP); Change this back to `lwz` since `_flags` is of type `int`. src/hotspot/cpu/ppc/sharedRuntime_ppc.cpp line 1666: > 1664: Label L_no_warn; > 1665: __ ld(R0, in_bytes(JavaThread::jni_monitor_count_offset()), R16_thread); > 1666: __ cmpwi(CCR0, R0, 0); This should be a `cmpdi` since `_jni_monitor_count` is an `intx`. src/hotspot/cpu/ppc/sharedRuntime_ppc.cpp line 1691: > 1689: // Check if this is a virtual thread continuation > 1690: Label L_skip_vthread_code; > 1691: __ ld(R0, in_bytes(ContinuationEntry::flags_offset()), R1_SP); Change this back to `lwz` since `_flags` is of type `int`. 
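
The width comments above can be illustrated with a small C++ sketch. This is not the PPC code itself: the struct `EntrySketch` and the helpers are hypothetical, and the layout is assumed only to show why a 64-bit load of a 32-bit field, or a 32-bit compare of a 64-bit counter, can give the wrong answer.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical layout, for illustration only: a 32-bit flags field
// followed directly by another 32-bit field.
struct EntrySketch {
  int32_t flags;
  int32_t neighbour;
};

// 32-bit load (in the spirit of `lwz`): reads exactly the flags field.
inline uint32_t load32(const EntrySketch* e) {
  uint32_t v;
  std::memcpy(&v, &e->flags, sizeof v);
  return v;
}

// 64-bit load (in the spirit of `ld`) from the same address: also picks up
// the bytes of the neighbouring field.
inline uint64_t load64(const EntrySketch* e) {
  uint64_t v;
  std::memcpy(&v, e, sizeof v);
  return v;
}

// Word compare against zero (in the spirit of `cmpwi`): low 32 bits only.
inline bool is_zero_word(uint64_t reg) {
  return static_cast<uint32_t>(reg) == 0;
}

// Doubleword compare against zero (in the spirit of `cmpdi`): all 64 bits.
inline bool is_zero_doubleword(uint64_t reg) {
  return reg == 0;
}
```

In this sketch a zero `flags` next to a non-zero neighbour still yields a non-zero 64-bit load, and a 64-bit value of `1 << 32` passes the word-sized zero check while failing the doubleword one. A monitor count is unlikely to get that large in practice, so the second case is mostly about keeping the load and the compare consistent with the field's declared type.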
------------- PR Review: https://git.openjdk.org/jdk/pull/18445#pullrequestreview-1986418873 PR Review Comment: https://git.openjdk.org/jdk/pull/18445#discussion_r1555825855 PR Review Comment: https://git.openjdk.org/jdk/pull/18445#discussion_r1555835564 PR Review Comment: https://git.openjdk.org/jdk/pull/18445#discussion_r1555834504 From coleenp at openjdk.org Mon Apr 8 13:33:11 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 8 Apr 2024 13:33:11 GMT Subject: RFR: 8329750: Change Universe functions to return more specific Klass* types [v2] In-Reply-To: <1AATJlet-Nur9h4V9L8OPk7PDQkGSvCE8P2UAZjFgl8=.48823213-7b50-4b12-a900-ab09236b68ee@github.com> References: <1AATJlet-Nur9h4V9L8OPk7PDQkGSvCE8P2UAZjFgl8=.48823213-7b50-4b12-a900-ab09236b68ee@github.com> Message-ID: On Mon, 8 Apr 2024 13:23:29 GMT, Coleen Phillimore wrote: >> Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: >> >> Update src/hotspot/share/classfile/systemDictionary.cpp >> >> Dean's suggestion >> >> Co-authored-by: Dean Long <17332032+dean-long at users.noreply.github.com> > > src/hotspot/share/classfile/systemDictionary.cpp line 372: > >> 370: TypeArrayKlass* tak = Universe::typeArrayKlass(t); >> 371: k = tak->array_klass(ndims, CHECK_NULL); >> 372: k = k->array_klass(ndims, CHECK_NULL); > > this looks puzzling. Why are there two array_klass calls now? Add short comments to explain. I sort of see now but am getting squint lines. It's not important for performance to eliminate a virtual call here. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18652#discussion_r1555847832 From coleenp at openjdk.org Mon Apr 8 13:33:11 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 8 Apr 2024 13:33:11 GMT Subject: RFR: 8329750: Change Universe functions to return more specific Klass* types [v2] In-Reply-To: References: Message-ID: <1AATJlet-Nur9h4V9L8OPk7PDQkGSvCE8P2UAZjFgl8=.48823213-7b50-4b12-a900-ab09236b68ee@github.com> On Mon, 8 Apr 2024 07:16:11 GMT, Stefan Karlsson wrote: >> We have various functions in Universe that returns Klass* where they could be returning TypeArrayKlass* and ObjArrayKlass* instead. If we change these functions we could get rid of some casts in the code. Does this seem like a reasonable change? > > Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: > > Update src/hotspot/share/classfile/systemDictionary.cpp > > Dean's suggestion > > Co-authored-by: Dean Long <17332032+dean-long at users.noreply.github.com> This change just makes it confusing and doesn't add anything. If array_klass changed more, then not sure what to do with these lines. src/hotspot/share/classfile/systemDictionary.cpp line 372: > 370: TypeArrayKlass* tak = Universe::typeArrayKlass(t); > 371: k = tak->array_klass(ndims, CHECK_NULL); > 372: k = k->array_klass(ndims, CHECK_NULL); this looks puzzling. Why are there two array_klass calls now? Add short comments to explain. ------------- Changes requested by coleenp (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18652#pullrequestreview-1986444877 PR Review Comment: https://git.openjdk.org/jdk/pull/18652#discussion_r1555840081 From mbaesken at openjdk.org Mon Apr 8 13:45:29 2024 From: mbaesken at openjdk.org (Matthias Baesken) Date: Mon, 8 Apr 2024 13:45:29 GMT Subject: RFR: JDK-8329605: hs errfile generic events - introduce sections for Frequent/NotFrequent Events [v2] In-Reply-To: <5GN6AKI0ud3DgU7-RX2-12eu87Me8jhzKXA-L8BwR04=.384ddd36-1a8f-40ac-9387-5d8d97c37fe3@github.com> References: <5GN6AKI0ud3DgU7-RX2-12eu87Me8jhzKXA-L8BwR04=.384ddd36-1a8f-40ac-9387-5d8d97c37fe3@github.com> Message-ID: > Currently the 'generic' hs_errfile Events message log (filled by Events::log) is rather flooded by messages for memory protection operations. Those seem to occur quite often and move out other less frequent events, because the number of entries in the log is limited. > It might be better to separate the frequent and less frequent events into 2 sections. The memory protection events would go into the frequent events section. 
> The mentioned memory protection operations related entries look like this : > Event: 0.178 Protecting memory [0x000000016ebf0000,0x000000016ebfc000] with protection modes 0 Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: Introduce separate nmethod flush log ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18626/files - new: https://git.openjdk.org/jdk/pull/18626/files/27a319be..63ccaffb Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18626&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18626&range=00-01 Stats: 17 lines in 3 files changed: 16 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18626.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18626/head:pull/18626 PR: https://git.openjdk.org/jdk/pull/18626 From stefank at openjdk.org Mon Apr 8 13:46:32 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 8 Apr 2024 13:46:32 GMT Subject: RFR: 8329750: Change Universe functions to return more specific Klass* types [v3] In-Reply-To: References: Message-ID: > We have various functions in Universe that returns Klass* where they could be returning TypeArrayKlass* and ObjArrayKlass* instead. If we change these functions we could get rid of some casts in the code. Does this seem like a reasonable change? Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: Revert "Update src/hotspot/share/classfile/systemDictionary.cpp " This reverts commit d36f650dc3bf9729cd8bd138d23bef3dfdb8e4d2. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/18652/files - new: https://git.openjdk.org/jdk/pull/18652/files/d36f650d..36bef547 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18652&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18652&range=01-02 Stats: 2 lines in 1 file changed: 0 ins; 1 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18652.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18652/head:pull/18652 PR: https://git.openjdk.org/jdk/pull/18652 From stefank at openjdk.org Mon Apr 8 13:46:32 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 8 Apr 2024 13:46:32 GMT Subject: RFR: 8329750: Change Universe functions to return more specific Klass* types [v2] In-Reply-To: References: <1AATJlet-Nur9h4V9L8OPk7PDQkGSvCE8P2UAZjFgl8=.48823213-7b50-4b12-a900-ab09236b68ee@github.com> Message-ID: On Mon, 8 Apr 2024 13:28:34 GMT, Coleen Phillimore wrote: >> src/hotspot/share/classfile/systemDictionary.cpp line 372: >> >>> 370: TypeArrayKlass* tak = Universe::typeArrayKlass(t); >>> 371: k = tak->array_klass(ndims, CHECK_NULL); >>> 372: k = k->array_klass(ndims, CHECK_NULL); >> >> this looks puzzling. Why are there two array_klass calls now? Add short comments to explain. > > I sort of see now but am getting squint lines. It's not important for performance to eliminate a virtual call here. Hmm. Having two `array_klass` calls was not intentional. I accepted Dean's suggestion in the GitHub UI, but that didn't remove the old `array_klass` call. I think I'll revert that change given that it is not important to devirtualize this.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18652#discussion_r1555870229 From mbaesken at openjdk.org Mon Apr 8 13:48:09 2024 From: mbaesken at openjdk.org (Matthias Baesken) Date: Mon, 8 Apr 2024 13:48:09 GMT Subject: RFR: JDK-8329605: hs errfile generic events - introduce sections for Frequent/NotFrequent Events In-Reply-To: References: <5GN6AKI0ud3DgU7-RX2-12eu87Me8jhzKXA-L8BwR04=.384ddd36-1a8f-40ac-9387-5d8d97c37fe3@github.com> <6slsaND3GbbRLB78XSC2T8FcTEDpw3y3MQ8QZWRVYC8=.b1a36386-0aff-40d0-b1a5-7f8315122dfb@github.com> Message-ID: On Thu, 4 Apr 2024 13:14:39 GMT, Matthias Baesken wrote: > But on the other hand, if others like this idea, I am fine with it (creating sections for memory protection operations and for nmethod flushing). I asked around in my team and added a separate section for the nmethod flush operations. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18626#issuecomment-2042803730 From coleenp at openjdk.org Mon Apr 8 13:58:11 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 8 Apr 2024 13:58:11 GMT Subject: RFR: 8329488: Move OopStorage code from safepoint cleanup and remove safepoint cleanup code In-Reply-To: References: Message-ID: On Sat, 6 Apr 2024 12:52:10 GMT, Kim Barrett wrote: >> This patch gives the ServiceThread a periodic wakeup (same as GuaranteedSafepointInterval) to check if it needs to clean out OopStorage blocks, and move the triggering of this cleaning out of the safepoint cleanup tasks. Since ICBuffer, StringTable and SymbolTable rehashing have moved, there's nothing that actually triggers the nop safepoint to do cleaning (except SafepointALot), so the OopStorage cleanup won't be triggered. >> >> With moving all of these out of the safepoint cleanup tasks, we can remove the code that sets up multiple threads to do safepoint cleanup. We can also remove the JFR events and logging that times safepoint cleanup, and a logging test. >> >> Tested with tier1-4. 
> > src/hotspot/share/gc/shared/oopStorage.cpp line 895: > >> 893: >> 894: // Time after which a notification can be made. >> 895: static jlong cleanup_permit_time = 0; > > This mechanism no longer involves notification, so comment needs to be updated. Maybe > "Time when ServiceThread is next permitted to do cleanup." I just changed notification to cleanup. We already say the ServiceThread lots of places. > src/hotspot/share/gc/shared/oopStorage.cpp line 897: > >> 895: static jlong cleanup_permit_time = 0; >> 896: >> 897: // Minimum time since last ServiceThread check before cleanup is permitted. > > Maybe "Minimum time between ServiceThread cleanups." that looks better. > src/hotspot/share/gc/shared/oopStorage.cpp line 904: > >> 902: assert_lock_strong(Service_lock); >> 903: >> 904: if (Atomic::load(&needs_cleanup_requested) && os::javaTimeNanos() > cleanup_permit_time) { > > Should be Atomic::load_acquire, matching release_store in record_needs_cleanup. oh yes, that should be load_acquire to match. > src/hotspot/share/gc/shared/oopStorage.cpp line 920: > >> 918: void OopStorage::record_needs_cleanup() { >> 919: // Set local flag first, else ServiceThread could wake up and miss >> 920: // the request. This order may instead (rarely) unnecessarily notify. > > There's no longer any notification involved. However, there is still the (rare) possibility that the ServiceThread > will uselessly run. It might have already been doing cleanup and processed the block just added. If no new > cleanup work gets added before the next ServiceThread cleanup time, it will attempt cleanup (because of the > flag(s) being set), and find nothing to do. That's okay. Or just delete the sentence about unnecessary notify. I deleted the sentence. It wasn't clear what it meant anyway. 
There is a sentence about notification in the comment above the function but I think that's trying to say why this doesn't do a notification like all other ServiceThread cleanups, so is still saying something that might be useful. > src/hotspot/share/gc/shared/oopStorage.cpp line 928: > >> 926: // Service thread might have oopstorage work, but not for this object. >> 927: // Check for deferred updates even though that's not a ServiceThread >> 928: // cleanup; since we're here, we might as well process them. > > That's not what's really going on here. Replace the comment with > "But check for deferred updates, which might provide cleanup work." > Also, in previous unchanged line, s/Service thread/ServiceThread/ Ok, I see. The deferred updates may create an empty block which can then be cleaned up here? The suggested comment seems to make sense. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18375#discussion_r1555866403 PR Review Comment: https://git.openjdk.org/jdk/pull/18375#discussion_r1555867392 PR Review Comment: https://git.openjdk.org/jdk/pull/18375#discussion_r1555868403 PR Review Comment: https://git.openjdk.org/jdk/pull/18375#discussion_r1555874533 PR Review Comment: https://git.openjdk.org/jdk/pull/18375#discussion_r1555876097 From coleenp at openjdk.org Mon Apr 8 13:58:12 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 8 Apr 2024 13:58:12 GMT Subject: RFR: 8329488: Move OopStorage code from safepoint cleanup and remove safepoint cleanup code In-Reply-To: <6d2gjVM61eYbJYoLRsNskCaN87IMXXLi1v6RPEUlGJs=.7ca1d5d0-edc9-4273-872f-f3a37d465541@github.com> References: <6d2gjVM61eYbJYoLRsNskCaN87IMXXLi1v6RPEUlGJs=.7ca1d5d0-edc9-4273-872f-f3a37d465541@github.com> Message-ID: On Sat, 6 Apr 2024 13:33:11 GMT, Kim Barrett wrote: >> src/hotspot/share/gc/shared/oopStorage.cpp line 988: >> >>> 986: // Exceeded work limit or can't delete last block. 
This will >>> 987: // cause the ServiceThread to loop, giving other subtasks an >>> 988: // opportunity to run too. There's no need for a notification, >> >> With the changes to `has_cleanup_work_and_reset` this no longer causes the ServiceThread to loop. >> Instead it requests cleanup at the next scheduled time for the ServiceThread to do so. And there's no >> longer ever any notification, so the final sentence needs some adjustment. > > Hm, with the change to `has_cleanup_work_and_reset` this will result in the service thread deleting > no more than "work limit" blocks per "defer period". Maybe this should reset the "permit time" too, > so that it _does_ cause the ServiceThread to loop. The ServiceThread loops because it has a fixed wait timeout now. It won't process these every time it's notified which can be more frequent. This comment seems to say that there's more work, so do it next time. I guess resetting the permit time to zero would prevent these cleanups from getting behind. Are there tests for this condition? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18375#discussion_r1555888210 From coleenp at openjdk.org Mon Apr 8 14:05:10 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 8 Apr 2024 14:05:10 GMT Subject: RFR: 8329750: Change Universe functions to return more specific Klass* types [v3] In-Reply-To: References: Message-ID: On Mon, 8 Apr 2024 13:46:32 GMT, Stefan Karlsson wrote: >> We have various functions in Universe that returns Klass* where they could be returning TypeArrayKlass* and ObjArrayKlass* instead. If we change these functions we could get rid of some casts in the code. Does this seem like a reasonable change? > > Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: > > Revert "Update src/hotspot/share/classfile/systemDictionary.cpp > " > > This reverts commit d36f650dc3bf9729cd8bd138d23bef3dfdb8e4d2. Marked as reviewed by coleenp (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/18652#pullrequestreview-1986548353 From coleenp at openjdk.org Mon Apr 8 14:05:11 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 8 Apr 2024 14:05:11 GMT Subject: RFR: 8329750: Change Universe functions to return more specific Klass* types [v2] In-Reply-To: References: <1AATJlet-Nur9h4V9L8OPk7PDQkGSvCE8P2UAZjFgl8=.48823213-7b50-4b12-a900-ab09236b68ee@github.com> Message-ID: On Mon, 8 Apr 2024 13:42:02 GMT, Stefan Karlsson wrote: >> I sort of see now but am getting squint lines. It's not important for performance to eliminate a virtual call here. > > Hmm. Having two `array_klass` calls was not intentional. I accepted Dean's suggestion in the GitHub UI, but that didn't remove the old `array_klass`. I think I'll revert that change given that it is not important to devirtualize this. Oh good because I was going to need a lot more coffee to understand why there was a second call. Thanks. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18652#discussion_r1555900867 From pchilanomate at openjdk.org Mon Apr 8 14:16:24 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Mon, 8 Apr 2024 14:16:24 GMT Subject: RFR: 8325469: Freeze/Thaw code can crash in the presence of OSR frames [v2] In-Reply-To: References: Message-ID: <1GerIn3ja6yxyEWI40c7TMcCY3Ghedhiarn-QBwERw8=.e07ce8ec-69f6-44b7-8344-801a75154c57@github.com> > Freeze/thaw code assumes that a compiled frame for a method where num_stack_arg_slots() > 0 will always have the arguments set up above the metadata at the bottom of the frame. But when converting an interpreter frame to a compiled frame during OSR we don't explicitly leave room for the stack arguments after popping the interpreter frame. All parameters needed will be read from the "buf" array and stored inside the frame before calling OSR_migration_end(). > > This mismatch in how the stack looks and what we assume can lead to different crashes.
In particular the issue happens when the OSR conversion happens for the bottom-most frame in the stack. If the OSR frame has a caller in the stack then there is no issue on freezing/thawing. I added more details about this in the bug comments. > > When the OSR conversion happens for the bottom-most frame then a future freeze/thaw can lead to crashes for all cases: freeze_fast/thaw_fast, freeze_fast/thaw_slow, freeze_slow/thaw_slow. When freezing fast, either thawing fast or slow can lead to trying to read past the bottom of the stackChunk or writing below the allocated space in the stack. The freeze slow case is almost okay, except that it uncovered an invalid assert that is triggered if the size of the OSR frame plus all the other frames we freeze takes less space than the size of locals minus parameters of the interpreter frame that was OSR. I also added more details about these in the bug comments. > > I tested different fixes, but I think the most straightforward one is to add _num_stack_arg_slots in the nmethod class and initialize it accordingly depending on whether the nmethod is an OSR one or not. > > The patch includes a new test that exercises all these possible combinations of OSR frame at bottom of stack or not, and then freezing fast/slow and thawing fast/slow. The bottom case where we freeze fast and thaw slow reproduces the originally reported crash. There are actually two different failure modes depending on whether this is a thaw top or return barrier case. The other bottom cases lead to the other crashes described in the bug comments. > The new test uncovers another bug besides the OSR issues, but since it's a different one I filed a separate JBS issue (JDK-8329665) and I made this a dependent PR. > > I tested the current patch with the new test and also ran it through mach5 tiers1-6.
> > Thanks, > Patricio Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: fix comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18637/files - new: https://git.openjdk.org/jdk/pull/18637/files/07a9cb51..b35306f8 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18637&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18637&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18637.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18637/head:pull/18637 PR: https://git.openjdk.org/jdk/pull/18637 From pchilanomate at openjdk.org Mon Apr 8 14:16:24 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Mon, 8 Apr 2024 14:16:24 GMT Subject: RFR: 8325469: Freeze/Thaw code can crash in the presence of OSR frames In-Reply-To: References: Message-ID: On Fri, 5 Apr 2024 23:43:27 GMT, Dean Long wrote: > This looks good, but have you considered computing the value every time instead of caching it in _num_stack_arg_slots and increasing the size of every nmethod? > Since this is used in the thaw fast path too I wanted the avoid the extra load of constMethod if possible, but I think either case is fine. Moving _is_unlinked to where the other booleans are defined actually keeps the size of the nmethod same as before (368 bytes). What do you think? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18637#issuecomment-2042867162 From pchilanomate at openjdk.org Mon Apr 8 14:16:25 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Mon, 8 Apr 2024 14:16:25 GMT Subject: RFR: 8325469: Freeze/Thaw code can crash in the presence of OSR frames [v2] In-Reply-To: <1m_bLGdIy6KV_wwHNjtgC_ERSAn_5XRCffBH8jLOIU0=.b18316c3-1a76-4909-b0f7-7bc7f137e253@github.com> References: <1m_bLGdIy6KV_wwHNjtgC_ERSAn_5XRCffBH8jLOIU0=.b18316c3-1a76-4909-b0f7-7bc7f137e253@github.com> Message-ID: On Fri, 5 Apr 2024 20:39:55 GMT, Ron Pressler wrote: >> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: >> >> fix comment > > src/hotspot/share/code/nmethod.cpp line 805: > >> 803: init_defaults(); >> 804: _entry_bci = entry_bci; >> 805: _num_stack_arg_slots = entry_bci != InvocationEntryBci ? 0 : _method->constMethod()->num_stack_arg_slots(); > > If I understand correctly, is the condition on this line the actual fix? Yes. The point is that _num_stack_arg_slots should not be fixed for a given Method as now but it should depend on the actual nmethod. > test/jdk/jdk/internal/vm/Continuation/OSRTest.java line 77: > >> 75: cont.run(); >> 76: if (freezeFast && !thawFast && fooCallCount == 2) { >> 77: // All frames freezed in last yield should be compiled > > freezed -> frozen Fixed. 
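The condition confirmed above as the actual fix can be illustrated outside HotSpot. The following is only a minimal sketch under stated assumptions: `ConstMethodSketch` and `NMethodSketch` are hypothetical stand-ins, not HotSpot classes, and the sentinel value given to `InvocationEntryBci` is assumed for illustration. Only the names `InvocationEntryBci`, `_entry_bci` and `_num_stack_arg_slots` come from the thread.

```cpp
#include <cassert>

// Sentinel meaning "normal entry, not OSR". The value is an assumption
// made for this sketch.
constexpr int InvocationEntryBci = -1;

// Hypothetical, simplified stand-in for the per-Method metadata.
struct ConstMethodSketch {
  int num_stack_arg_slots;
};

// Hypothetical, simplified stand-in for nmethod: the slot count is cached
// per compiled method instead of being read from the Method each time.
class NMethodSketch {
  int _entry_bci;
  int _num_stack_arg_slots;

 public:
  NMethodSketch(const ConstMethodSketch& cm, int entry_bci)
    : _entry_bci(entry_bci),
      // An OSR entry gets 0; a normal entry takes the Method's count.
      _num_stack_arg_slots(entry_bci != InvocationEntryBci
                               ? 0
                               : cm.num_stack_arg_slots) {}

  bool is_osr_method() const { return _entry_bci != InvocationEntryBci; }
  int num_stack_arg_slots() const { return _num_stack_arg_slots; }
};
```

With this shape, two compiled versions of the same method can report different slot counts, which is the point made in the reply: the value depends on the nmethod, not only on the Method.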
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18637#discussion_r1555916620 PR Review Comment: https://git.openjdk.org/jdk/pull/18637#discussion_r1555916840 From pchilanomate at openjdk.org Mon Apr 8 14:21:13 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Mon, 8 Apr 2024 14:21:13 GMT Subject: RFR: 8325469: Freeze/Thaw code can crash in the presence of OSR frames [v2] In-Reply-To: <1m_bLGdIy6KV_wwHNjtgC_ERSAn_5XRCffBH8jLOIU0=.b18316c3-1a76-4909-b0f7-7bc7f137e253@github.com> References: <1m_bLGdIy6KV_wwHNjtgC_ERSAn_5XRCffBH8jLOIU0=.b18316c3-1a76-4909-b0f7-7bc7f137e253@github.com> Message-ID: On Fri, 5 Apr 2024 20:46:37 GMT, Ron Pressler wrote: >> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: >> >> fix comment > > test/jdk/jdk/internal/vm/Continuation/OSRTest.java line 131: > >> 129: for (int i = 0; i < 500_000 * fooCallCount; i++) { >> 130: } >> 131: fooCallCount++; > > Perhaps use WhiteBox to check if we're OSRed? I'll test using isMethodCompiled(m, true) as another condition to break the loop. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18637#discussion_r1555925286 From lucy at openjdk.org Mon Apr 8 14:35:11 2024 From: lucy at openjdk.org (Lutz Schmidt) Date: Mon, 8 Apr 2024 14:35:11 GMT Subject: RFR: 8310513: [s390x] Intrinsify recursive ObjectMonitor locking In-Reply-To: References: Message-ID: On Fri, 23 Feb 2024 05:23:29 GMT, Amit Kumar wrote: > s390 implementation of [JDK-8277180](https://bugs.openjdk.org/browse/JDK-8277180). PPC implementation for the same: https://github.com/openjdk/jdk/pull/7305 > > I had tested `tier1` on `fastdebug`, `release` vm. 
> > BenchMarking: > > > ./build/linux-s390x-server-release/jdk/bin/java -Xms4g -Xmx4g -jar dacapo-9.12-MR1-bach.jar h2 -s huge -t 1 -n 1 > > without patch: > ===== DaCapo 9.12-MR1 h2 PASSED in 223023 msec ===== > ===== DaCapo 9.12-MR1 h2 PASSED in 225686 msec ===== > ===== DaCapo 9.12-MR1 h2 PASSED in 219824 msec ===== > ===== DaCapo 9.12-MR1 h2 PASSED in 226719 msec ===== > > > > with patch: > ===== DaCapo 9.12-MR1 h2 PASSED in 167816 msec ===== > ===== DaCapo 9.12-MR1 h2 PASSED in 174368 msec ===== > ===== DaCapo 9.12-MR1 h2 PASSED in 170517 msec ===== > ===== DaCapo 9.12-MR1 h2 PASSED in 169349 msec ===== Looks good overall. My change requests have the intention to make identical actions look identical. That helps with understanding the code. src/hotspot/cpu/s390/macroAssembler_s390.cpp line 3199: > 3197: NearLabel done, object_has_monitor; > 3198: > 3199: assert_different_registers(temp1, temp2); If you want to make it fool-proof, assert that all four registers are distinct. src/hotspot/cpu/s390/macroAssembler_s390.cpp line 3302: > 3300: Register temp = temp1; > 3301: > 3302: const int hdr_offset = oopDesc::mark_offset_in_bytes(); Either use this aux. variable in both, lock and unlock, methods or in none. I prefer using the aux. variable. src/hotspot/cpu/s390/macroAssembler_s390.cpp line 3304: > 3302: const int hdr_offset = oopDesc::mark_offset_in_bytes(); > 3303: > 3304: Label done, object_has_monitor, not_recursive; Please insert the same register assert as in the lock method. ------------- Changes requested by lucy (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/17975#pullrequestreview-1986548916 PR Review Comment: https://git.openjdk.org/jdk/pull/17975#discussion_r1555901183 PR Review Comment: https://git.openjdk.org/jdk/pull/17975#discussion_r1555938529 PR Review Comment: https://git.openjdk.org/jdk/pull/17975#discussion_r1555939449 From pchilanomate at openjdk.org Mon Apr 8 15:04:01 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Mon, 8 Apr 2024 15:04:01 GMT Subject: RFR: 8329088: Stack chunk thawing races with concurrent GC stack iteration [v2] In-Reply-To: References: Message-ID: <-GWGR7FPUMDBs4qvaDfnDR6Jfq9QKsQxNorjln_n-Ns=.16653cde-4cc5-4978-a385-41ebcc8e49c2@github.com> On Fri, 5 Apr 2024 09:35:22 GMT, Erik Österlund wrote: >> When we thaw the last frame from a stack chunk, we non-atomically set the stack pointer (sp), and set its argsize to 0. Unfortunately, GC threads may iterate over the frames of the stack chunk concurrently. When initializing their stack frame iterator, they read the sp and argsize racingly. Since there is no synchronization between the threads, we may observe inconsistent pairs of sp and argsize, for example the updated sp with a stale argsize, or the updated argsize with a stale sp. >> >> At the core of the problem, the stack chunks define sp and argsize. The argsize is used to calculate where the bottom of the stack chunk is, which is required to determine if it is empty or not. This patch proposes to switch things around and store the bottom directly in the chunk, instead of argsize. Instead, argsize is calculated from the bottom. By changing the relationship of which property is stored and which property is calculated, we can simplify this code quite a bit. >> >> In the new model, is_empty() is true iff sp and bottom are exactly the same. Bottom is only set during freezing, never during thawing. The bottom is initialized whenever the bottom frame is frozen, and left untouched during thawing.
Unlike thawing, the freeze operation does not race with the GC by design. Hence we have moved one of the racy mutations to the operation that doesn't race with the GC. The GC is now only exposed to changing sp(). It doesn't matter if it observes the old or new sp(), now that we have removed the only source of inconsistency describing said frame (racing argsize). >> >> Testing: tier1-5, manual testing of test/jdk/jdk/internal/vm/Continuation > > Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: > > Nits > In the new model, is_empty() is true iff sp and bottom are exactly the same. Bottom is only set during freezing, never during thawing. The bottom is initialized whenever the bottom frame is frozen, and left untouched during thawing. Unlike thawing, the freeze operation does not race with the GC by design. Hence we have moved one of the racy mutations to the operation that doesn't race with the GC. The GC is now only exposed to changing sp(). It doesn't matter if it observes the old or new sp(), now that we have removed the only source of inconsistency describing said frame (racing argsize). > So if the race happens only when resetting the stackChunk values when thawing the last frame, wouldn't it be enough to avoid clearing the argsize there? Because if we read the new sp when creating the stack frame iterator, regardless of the argsize value read, is_done() will be true so we won't iterate any frame. I'm trying to understand if the new model is needed to fix the race or that is part of a cleanup/refactoring.
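To make the quoted model easier to follow, here is a minimal sketch of "store bottom, derive argsize". It is an illustration only: `StackChunkSketch`, its `end` field, and the arithmetic `argsize = end - bottom` are simplifying assumptions made for this example and do not reproduce the real stackChunkOopDesc layout. The properties taken from the description are that bottom is written only while freezing, thawing changes only sp, and is_empty() holds exactly when sp equals bottom.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical, simplified stand-in for a stack chunk; offsets are in words.
struct StackChunkSketch {
  std::atomic<int> sp;  // changed by thaw; may be read by a concurrent iterator
  int bottom;           // written only while freezing
  int end;              // fixed end of the stack area (assumed name)

  explicit StackChunkSketch(int end_) : sp(end_), bottom(end_), end(end_) {}

  // Freeze is the operation described as not racing with the GC,
  // so bottom is set here and nowhere else.
  void freeze(int new_sp, int argsize) {
    bottom = end - argsize;
    sp.store(new_sp, std::memory_order_relaxed);
  }

  // Thawing the last frame moves sp only; bottom is left untouched.
  void thaw_all() { sp.store(bottom, std::memory_order_relaxed); }

  // argsize is derived from bottom rather than stored.
  int argsize() const { return end - bottom; }

  // Empty exactly when sp and bottom coincide.
  bool is_empty() const {
    return sp.load(std::memory_order_relaxed) == bottom;
  }
};
```

In this sketch a reader that samples sp once sees either the old value (frames remain) or the new value (chunk is empty), and in both cases pairs it with the same bottom, so no stale/updated pair can be observed.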
------------- PR Comment: https://git.openjdk.org/jdk/pull/18643#issuecomment-2042991266 From pchilanomate at openjdk.org Mon Apr 8 15:04:01 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Mon, 8 Apr 2024 15:04:01 GMT Subject: RFR: 8329088: Stack chunk thawing races with concurrent GC stack iteration [v2] In-Reply-To: <-GWGR7FPUMDBs4qvaDfnDR6Jfq9QKsQxNorjln_n-Ns=.16653cde-4cc5-4978-a385-41ebcc8e49c2@github.com> References: <-GWGR7FPUMDBs4qvaDfnDR6Jfq9QKsQxNorjln_n-Ns=.16653cde-4cc5-4978-a385-41ebcc8e49c2@github.com> Message-ID: On Mon, 8 Apr 2024 14:59:47 GMT, Patricio Chilano Mateo wrote: > Unlike thawing, the freeze operation does not race with the GC by design. > Is this with the changes in the allocation code in this patch or even before those there was no race? ------------- PR Comment: https://git.openjdk.org/jdk/pull/18643#issuecomment-2042994418 From iwalulya at openjdk.org Mon Apr 8 15:32:11 2024 From: iwalulya at openjdk.org (Ivan Walulya) Date: Mon, 8 Apr 2024 15:32:11 GMT Subject: RFR: 8329603: G1: Merge G1BlockOffsetTablePart into G1BlockOffsetTable [v3] In-Reply-To: References: Message-ID: On Fri, 5 Apr 2024 10:05:37 GMT, Guoxiong Li wrote: >> Hi all, >> >> This patch merges `G1BlockOffsetTablePart` into `G1BlockOffsetTable`. The previous fields `_reserved` and `_offset_base` of `G1BlockOffsetTable` are marked as `static` so that they can be shared by BOTs of all the heap regions. >> >> The tests `make test-tier1_gc` passed locally. Thanks for taking the time to review. >> >> Best Regards, >> -- Guoxiong > > Guoxiong Li has updated the pull request incrementally with two additional commits since the last revision: > > - Remove unnecessary comments. > - Fix indentation issue. Changes requested by iwalulya (Reviewer). src/hotspot/share/gc/g1/g1HeapRegion.hpp line 77: > 75: HeapWord* volatile _top; > 76: > 77: G1BlockOffsetTable* _bot; I suppose there is no longer a reason for this to be part of the region. 
Any downside to referring to it using `g1h->bot()`? src/hotspot/share/gc/g1/g1HeapRegion.inline.hpp line 107: > 105: } > 106: > 107: inline HeapWord* HeapRegion::block_start(const void* addr, HeapWord* const pb) const { Can we move the assert `assert(addr >= _hr->bottom() && addr < _hr->top(), "invalid address");` here? ------------- PR Review: https://git.openjdk.org/jdk/pull/18634#pullrequestreview-1986757217 PR Review Comment: https://git.openjdk.org/jdk/pull/18634#discussion_r1556042441 PR Review Comment: https://git.openjdk.org/jdk/pull/18634#discussion_r1556022980 From ayang at openjdk.org Mon Apr 8 15:41:09 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 8 Apr 2024 15:41:09 GMT Subject: RFR: 8329603: G1: Merge G1BlockOffsetTablePart into G1BlockOffsetTable [v3] In-Reply-To: References: Message-ID: On Mon, 8 Apr 2024 15:28:46 GMT, Ivan Walulya wrote: >> Guoxiong Li has updated the pull request incrementally with two additional commits since the last revision: >> >> - Remove unnecessary comments. >> - Fix indentation issue. > > src/hotspot/share/gc/g1/g1HeapRegion.hpp line 77: > >> 75: HeapWord* volatile _top; >> 76: >> 77: G1BlockOffsetTable* _bot; > > I suppose there is no longer a reason for this to be part of the region. Any downside to referring to it using `g1h->bot()`? That would require calling `G1CollectedHeap::heap()` every time bot is needed. Actually, maybe some methods, e.g. `update_bot_for_block` should belong to heap, instead of heap-region. (Either way, I feel this decision can/should be made in its own ticket.) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18634#discussion_r1556055383 From gli at openjdk.org Mon Apr 8 16:14:17 2024 From: gli at openjdk.org (Guoxiong Li) Date: Mon, 8 Apr 2024 16:14:17 GMT Subject: RFR: 8329603: G1: Merge G1BlockOffsetTablePart into G1BlockOffsetTable [v4] In-Reply-To: References: Message-ID: > Hi all, > > This patch merges `G1BlockOffsetTablePart` into `G1BlockOffsetTable`. 
The previous fields `_reserved` and `_offset_base` of `G1BlockOffsetTable` are marked as `static` so that they can be shared by BOTs of all the heap regions. > > The tests `make test-tier1_gc` passed locally. Thanks for taking the time to review. > > Best Regards, > -- Guoxiong Guoxiong Li has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: - Merge branch 'master' into G1BlockOffsetTable - Move assert. - Remove unnecessary comments. - Fix indentation issue. - Use a simple/unified BOT. - JDK-8329603 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18634/files - new: https://git.openjdk.org/jdk/pull/18634/files/a8a121bf..7db09f1d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18634&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18634&range=02-03 Stats: 11996 lines in 341 files changed: 4671 ins; 5622 del; 1703 mod Patch: https://git.openjdk.org/jdk/pull/18634.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18634/head:pull/18634 PR: https://git.openjdk.org/jdk/pull/18634 From gli at openjdk.org Mon Apr 8 16:14:17 2024 From: gli at openjdk.org (Guoxiong Li) Date: Mon, 8 Apr 2024 16:14:17 GMT Subject: RFR: 8329603: G1: Merge G1BlockOffsetTablePart into G1BlockOffsetTable [v3] In-Reply-To: References: Message-ID: On Mon, 8 Apr 2024 15:38:33 GMT, Albert Mingkun Yang wrote: > (Either way, I feel this decision can/should be made in its own ticket.) I agree. This patch should focus on merging. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18634#discussion_r1556096105 From gli at openjdk.org Mon Apr 8 16:14:18 2024 From: gli at openjdk.org (Guoxiong Li) Date: Mon, 8 Apr 2024 16:14:18 GMT Subject: RFR: 8329603: G1: Merge G1BlockOffsetTablePart into G1BlockOffsetTable [v3] In-Reply-To: References: Message-ID: On Mon, 8 Apr 2024 15:16:48 GMT, Ivan Walulya wrote: >> Guoxiong Li has updated the pull request incrementally with two additional commits since the last revision: >> >> - Remove unnecessary comments. >> - Fix indentation issue. > > src/hotspot/share/gc/g1/g1HeapRegion.inline.hpp line 107: > >> 105: } >> 106: >> 107: inline HeapWord* HeapRegion::block_start(const void* addr, HeapWord* const pb) const { > > Can we move the assert `assert(addr >= _hr->bottom() && addr < _hr->top(), "invalid address");` here? Moved. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18634#discussion_r1556096013 From amitkumar at openjdk.org Mon Apr 8 16:28:38 2024 From: amitkumar at openjdk.org (Amit Kumar) Date: Mon, 8 Apr 2024 16:28:38 GMT Subject: RFR: 8310513: [s390x] Intrinsify recursive ObjectMonitor locking [v2] In-Reply-To: References: Message-ID: > s390 implementation of [JDK-8277180](https://bugs.openjdk.org/browse/JDK-8277180). PPC implementation for the same: https://github.com/openjdk/jdk/pull/7305 > > I had tested `tier1` on `fastdebug`, `release` vm. 
> > BenchMarking: > > > ./build/linux-s390x-server-release/jdk/bin/java -Xms4g -Xmx4g -jar dacapo-9.12-MR1-bach.jar h2 -s huge -t 1 -n 1 > > without patch: > ===== DaCapo 9.12-MR1 h2 PASSED in 223023 msec ===== > ===== DaCapo 9.12-MR1 h2 PASSED in 225686 msec ===== > ===== DaCapo 9.12-MR1 h2 PASSED in 219824 msec ===== > ===== DaCapo 9.12-MR1 h2 PASSED in 226719 msec ===== > > > > with patch: > ===== DaCapo 9.12-MR1 h2 PASSED in 167816 msec ===== > ===== DaCapo 9.12-MR1 h2 PASSED in 174368 msec ===== > ===== DaCapo 9.12-MR1 h2 PASSED in 170517 msec ===== > ===== DaCapo 9.12-MR1 h2 PASSED in 169349 msec ===== Amit Kumar has updated the pull request incrementally with one additional commit since the last revision: suggestion from Lutz ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17975/files - new: https://git.openjdk.org/jdk/pull/17975/files/13f41304..9f48baa9 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17975&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17975&range=00-01 Stats: 7 lines in 1 file changed: 4 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/17975.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17975/head:pull/17975 PR: https://git.openjdk.org/jdk/pull/17975 From cslucas at openjdk.org Mon Apr 8 16:31:13 2024 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Mon, 8 Apr 2024 16:31:13 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission [v11] In-Reply-To: References: Message-ID: On Mon, 8 Apr 2024 06:09:14 GMT, Boris Ulasevich wrote: > Do you need help understanding the problem? It's hard for me to debug because I don't have direct access to an ARM32. However, I was able to reproduce the problem and I know the reason why it happens (its exactly what you described). I'm working now to find the correct locations to insert the `inst_marks()` _and_ the `clear_inst_marks()`. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16484#issuecomment-2043183990 From amitkumar at openjdk.org Mon Apr 8 16:33:11 2024 From: amitkumar at openjdk.org (Amit Kumar) Date: Mon, 8 Apr 2024 16:33:11 GMT Subject: RFR: 8310513: [s390x] Intrinsify recursive ObjectMonitor locking [v2] In-Reply-To: References: Message-ID: On Mon, 8 Apr 2024 14:02:55 GMT, Lutz Schmidt wrote: >> Amit Kumar has updated the pull request incrementally with one additional commit since the last revision: >> >> suggestion from Lutz > > src/hotspot/cpu/s390/macroAssembler_s390.cpp line 3199: > >> 3197: NearLabel done, object_has_monitor; >> 3198: >> 3199: assert_different_registers(temp1, temp2); > > If you want to make it fool-proof, assert that all four registers are distinct. done :-) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17975#discussion_r1556119867 From coleenp at openjdk.org Mon Apr 8 16:51:58 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 8 Apr 2024 16:51:58 GMT Subject: RFR: 8329488: Move OopStorage code from safepoint cleanup and remove safepoint cleanup code In-Reply-To: References: <6d2gjVM61eYbJYoLRsNskCaN87IMXXLi1v6RPEUlGJs=.7ca1d5d0-edc9-4273-872f-f3a37d465541@github.com> Message-ID: On Mon, 8 Apr 2024 13:54:13 GMT, Coleen Phillimore wrote: >> Hm, with the change to `has_cleanup_work_and_reset` this will result in the service thread deleting >> no more than "work limit" blocks per "defer period". Maybe this should reset the "permit time" too, >> so that it _does_ cause the ServiceThread to loop. > > The ServiceThread loops because it has a fixed wait timeout now. It won't process these every time it's notified which can be more frequent. This comment seems to say that there's more work, so do it next time. I guess resetting the permit time to zero would prevent these cleanups from getting behind. Are there tests for this condition? 
If you can't delete the last block, you don't really want the service thread to try to clean up right away though? Only if you hit the limit of blocks to delete? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18375#discussion_r1555897504 From coleenp at openjdk.org Mon Apr 8 16:51:59 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 8 Apr 2024 16:51:59 GMT Subject: RFR: 8329488: Move OopStorage code from safepoint cleanup and remove safepoint cleanup code In-Reply-To: References: Message-ID: On Sat, 6 Apr 2024 13:21:45 GMT, Kim Barrett wrote: >> This patch gives the ServiceThread a periodic wakeup (same as GuaranteedSafepointInterval) to check if it needs to clean out OopStorage blocks, and move the triggering of this cleaning out of the safepoint cleanup tasks. Since ICBuffer, StringTable and SymbolTable rehashing have moved, there's nothing that actually triggers the nop safepoint to do cleaning (except SafepointALot), so the OopStorage cleanup won't be triggered. >> >> With moving all of these out of the safepoint cleanup tasks, we can remove the code that sets up multiple threads to do safepoint cleanup. We can also remove the JFR events and logging that times safepoint cleanup, and a logging test. >> >> Tested with tier1-4. > > src/hotspot/share/runtime/serviceThread.cpp line 130: > >> 128: ) == 0) { >> 129: // Wait until notified that there is some work to do or timer expires. >> 130: // OopStorage work needs to be done at periodic intervals. > > Rather than calling out OopStorage here, maybe just say some cleanup requests don't notify the > ServiceThread, instead relying on it to run periodically. After this change we might want to audit > other cleanup requests and decide if they actually need to notify the ServiceThread in order to get a > more prompt response, or could just wait for the next periodic wakeup. I reworded the comment. Some might make sense on a periodic timer. Not sure about others. 
The ones that trigger more are the StringTable and SymbolTable. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18375#discussion_r1556143620 From lucy at openjdk.org Mon Apr 8 17:09:13 2024 From: lucy at openjdk.org (Lutz Schmidt) Date: Mon, 8 Apr 2024 17:09:13 GMT Subject: RFR: 8310513: [s390x] Intrinsify recursive ObjectMonitor locking [v2] In-Reply-To: References: Message-ID: On Mon, 8 Apr 2024 16:28:38 GMT, Amit Kumar wrote: >> s390 implementation of [JDK-8277180](https://bugs.openjdk.org/browse/JDK-8277180). PPC implementation for the same: https://github.com/openjdk/jdk/pull/7305 >> >> I had tested `tier1` on `fastdebug`, `release` vm. >> >> BenchMarking: >> >> >> ./build/linux-s390x-server-release/jdk/bin/java -Xms4g -Xmx4g -jar dacapo-9.12-MR1-bach.jar h2 -s huge -t 1 -n 1 >> >> without patch: >> ===== DaCapo 9.12-MR1 h2 PASSED in 223023 msec ===== >> ===== DaCapo 9.12-MR1 h2 PASSED in 225686 msec ===== >> ===== DaCapo 9.12-MR1 h2 PASSED in 219824 msec ===== >> ===== DaCapo 9.12-MR1 h2 PASSED in 226719 msec ===== >> >> >> >> with patch: >> ===== DaCapo 9.12-MR1 h2 PASSED in 167816 msec ===== >> ===== DaCapo 9.12-MR1 h2 PASSED in 174368 msec ===== >> ===== DaCapo 9.12-MR1 h2 PASSED in 170517 msec ===== >> ===== DaCapo 9.12-MR1 h2 PASSED in 169349 msec ===== > > Amit Kumar has updated the pull request incrementally with one additional commit since the last revision: > > suggestion from Lutz LGTM. Reviewed, provided GHA find no errors. ------------- Marked as reviewed by lucy (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/17975#pullrequestreview-1986981203 From amitkumar at openjdk.org Mon Apr 8 17:31:10 2024 From: amitkumar at openjdk.org (Amit Kumar) Date: Mon, 8 Apr 2024 17:31:10 GMT Subject: RFR: 8310513: [s390x] Intrinsify recursive ObjectMonitor locking [v2] In-Reply-To: References: Message-ID: <4mOo43VwMarKmpMTvZQxPg_MvR2Yi2FBIF2pxr8zxok=.cc47be4a-e9d7-462e-87f9-c40466dd88f5@github.com> On Mon, 8 Apr 2024 17:06:42 GMT, Lutz Schmidt wrote: >> Amit Kumar has updated the pull request incrementally with one additional commit since the last revision: >> >> suggestion from Lutz > > LGTM. > Reviewed, provided GHA find no errors. Thanks @RealLucy !!! @TheRealMDoerr would you please review this as well? ------------- PR Comment: https://git.openjdk.org/jdk/pull/17975#issuecomment-2043291590 From aph at openjdk.org Mon Apr 8 17:39:09 2024 From: aph at openjdk.org (Andrew Haley) Date: Mon, 8 Apr 2024 17:39:09 GMT Subject: RFR: 8325821: [REDO] use "dmb.ishst+dmb.ishld" for release barrier [v3] In-Reply-To: References: Message-ID: On Mon, 8 Apr 2024 11:58:22 GMT, kuaiwei wrote: >> The origin patch for https://bugs.openjdk.org/browse/JDK-8324186 has 2 issues: >> 1 It show regression in some platform, like Apple silicon in mac os >> 2 Can not handle instruction sequence like "dmb.ishld; dmb.ishst; dmb.ishld; dmb.ishld" >> >> It can be fixed by: >> 1 Enable AlwaysMergeDMB by default, only disable it in architecture we can see performance improvement (N1 or N2) >> 2 Check the special pattern and merge the subsequent dmb. >> >> It also fix a bug when code buffer is expanding, st/ld/dmb can not be merged. I added unit tests for these. >> >> This patch still has a unhandled case. Insts like "dmb.ishld; dmb.ishst; dmb.ish", it will merge the last 2 instructions and can not merge all three. Because when emitting dmb.ish, if merge all previous dmbs, the code buffer will shrink the size. 
I think it may break some resumption and think it's not a common pattern. >> >> - Update: >> After discussion, I made a new implementation based on finite state machine for merging instruction. The mergeable instruction will be pending in fsm until next unmergeable instruction. > > kuaiwei has updated the pull request incrementally with one additional commit since the last revision: > > Fix cross build error That doesn't look bad. I've made a bunch of simplifications for you to see: have a look. In general, pointer chasing is bad, so it's worth getting rid of indirections that don't help. [foo.zip](https://github.com/openjdk/jdk/files/14908996/foo.zip) ------------- PR Comment: https://git.openjdk.org/jdk/pull/18467#issuecomment-2043305914 From coleenp at openjdk.org Mon Apr 8 17:49:22 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 8 Apr 2024 17:49:22 GMT Subject: RFR: 8329488: Move OopStorage code from safepoint cleanup and remove safepoint cleanup code [v2] In-Reply-To: References: Message-ID: > This patch gives the ServiceThread a periodic wakeup (same as GuaranteedSafepointInterval) to check if it needs to clean out OopStorage blocks, and move the triggering of this cleaning out of the safepoint cleanup tasks. Since ICBuffer, StringTable and SymbolTable rehashing have moved, there's nothing that actually triggers the nop safepoint to do cleaning (except SafepointALot), so the OopStorage cleanup won't be triggered. > > With moving all of these out of the safepoint cleanup tasks, we can remove the code that sets up multiple threads to do safepoint cleanup. We can also remove the JFR events and logging that times safepoint cleanup, and a logging test. > > Tested with tier1-4. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Some comment cleanups from Kim. 
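The request flag and permit time discussed in this review can be sketched as follows. This is a hedged illustration, not the OopStorage code: the names echo the review comments, `cleanup_defer_period` and its value are assumptions, the current time is passed in as a parameter so the example is deterministic, and `std::atomic` stands in for HotSpot's `Atomic::release_store` and `Atomic::load_acquire`.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins; in the real code the permit time is only touched
// while holding Service_lock.
static std::atomic<bool> needs_cleanup_requested{false};
static int64_t cleanup_permit_time = 0;        // time when cleanup is next permitted
constexpr int64_t cleanup_defer_period = 500;  // assumed value, arbitrary units

// Requesting side: publish the request with release semantics.
// No notification is sent; the periodic wakeup picks it up.
void record_needs_cleanup() {
  needs_cleanup_requested.store(true, std::memory_order_release);
}

// ServiceThread side: the acquire load pairs with the release store above.
// Cleanup is reported only if it was requested and the permit time has passed.
bool has_cleanup_work_and_reset(int64_t now) {
  if (needs_cleanup_requested.load(std::memory_order_acquire) &&
      now > cleanup_permit_time) {
    cleanup_permit_time = now + cleanup_defer_period;
    needs_cleanup_requested.store(false, std::memory_order_release);
    return true;
  }
  return false;
}
```

A request made before the permit time has passed is not lost; it stays pending until a later check. That is the behaviour behind the concern raised earlier in the thread that at most one batch of blocks is deleted per defer period.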
------------- Changes: - all: https://git.openjdk.org/jdk/pull/18375/files - new: https://git.openjdk.org/jdk/pull/18375/files/c1816ac8..3ab66efa Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18375&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18375&range=00-01 Stats: 11 lines in 2 files changed: 1 ins; 2 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/18375.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18375/head:pull/18375 PR: https://git.openjdk.org/jdk/pull/18375 From ayang at openjdk.org Mon Apr 8 18:44:35 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 8 Apr 2024 18:44:35 GMT Subject: RFR: 8329878: Reduce public interface of CardTableBarrierSet Message-ID: <-xZL6Kb2xdaMAjrWQR5pJPaxuNtu-XKuIKUcP0NWCpw=.71c465ec-5765-4c19-9566-f544104c4273@github.com> Trivial moving typedef from public to protected. ------------- Commit messages: - trivial Changes: https://git.openjdk.org/jdk/pull/18679/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18679&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8329878 Stats: 4 lines in 1 file changed: 1 ins; 3 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18679.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18679/head:pull/18679 PR: https://git.openjdk.org/jdk/pull/18679 From kvn at openjdk.org Mon Apr 8 18:59:09 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 8 Apr 2024 18:59:09 GMT Subject: RFR: 8329628: Additional changes after JDK-8329332 In-Reply-To: References: Message-ID: On Mon, 8 Apr 2024 07:47:02 GMT, Stefan Karlsson wrote: >> Additional clean up based on comments (mostly Stefan's) during reviews for [JDK-8329332: Remove CompiledMethod and CodeBlobLayout classes](https://bugs.openjdk.org/browse/JDK-8329332). >> - Renamed `CompiledMethod_lock` to `NMethod_lock`. (I decided to not change JVMTI's `CompiledMethod[Load|Unload]` names). >> - Renamed `NMethodIterator::all_blobs` to `NMethodIterator::all`. 
>> - Moved `get_deopt_original_pc()` method from `nmethod` to `frame` class. >> - Reverted `CodeCache::find_nmethod()` to previous functionality to allow return `nullptr` and be consistent with `find_blob()`. >> - Cleanup some `(nmethod*)` casts. >> - Use `for (CodeHeap* heap : *_nmethod_heaps) ` in `CodeCache::nmethod_count()` (it was @stefank suggestion, I don't know how this C++ magic works). I verified it running with `-XX:+PrintNMethodStatistics`. >> >> Testing tier1-3,xcomp,stress > > src/hotspot/share/code/codeCache.cpp line 668: > >> 666: CodeBlob* cb = find_blob(start); >> 667: assert(cb == nullptr || cb->is_nmethod(), "did not find an nmethod"); >> 668: return (nmethod*)cb; > > There's a call to `find_nmethod` in `ZNMethod::load_oop` that now lacks a null-check. Would you mind adding one? I added asserts in all places where there is no nullptr check. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18665#discussion_r1556325619 From sgibbons at openjdk.org Mon Apr 8 19:11:19 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Mon, 8 Apr 2024 19:11:19 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v7] In-Reply-To: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: > This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. > > Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. > > Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). 
> > [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Add movq to locate_operand ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18555/files - new: https://git.openjdk.org/jdk/pull/18555/files/fd6f04f7..f81aaa9f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=05-06 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18555.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18555/head:pull/18555 PR: https://git.openjdk.org/jdk/pull/18555 From kvn at openjdk.org Mon Apr 8 19:25:10 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 8 Apr 2024 19:25:10 GMT Subject: RFR: 8328181: C2: assert(MaxVectorSize >= 32) failed: vector length should be >= 32 [v2] In-Reply-To: References: Message-ID: On Mon, 8 Apr 2024 02:35:33 GMT, Jatin Bhateja wrote: >> This bug fix patch tightens the predication check for small constant length clear array pattern and relaxes associated feature checks. Modified few comments for clarity. >> >> Kindly review and approve. >> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Cleanup predicates. This looks good. You need second review. ------------- Marked as reviewed by kvn (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18464#pullrequestreview-1987253711

From cslucas at openjdk.org Mon Apr 8 19:29:01 2024
From: cslucas at openjdk.org (Cesar Soares Lucas)
Date: Mon, 8 Apr 2024 19:29:01 GMT
Subject: RFR: JDK-8316991: Reduce nullable allocation merges [v10]
In-Reply-To:
References:
Message-ID:

On Thu, 28 Mar 2024 20:20:01 GMT, Cesar Soares Lucas wrote:

>> ### Description
>>
>> Many, if not most, allocation merges (Phis) are nullable because they join object allocations with "NULL", or objects returned from method calls, etc. Please review this Pull Request that improves Reduce Allocation Merge implementation so that it can reduce at least some of these allocation merges.
>>
>> Overall, the improvements are related to 1) making rematerialization of merges able to represent "NULL" objects, and 2) being able to reduce merges used by CmpP/N and CastPP.
>>
>> The approach to reducing CmpP/N and CastPP is pretty similar to that used in the `MemNode::split_through_phi` method: a clone of the node being split is added on each input of the Phi. I make use of `optimize_ptr_compare` and some type information to remove redundant CmpP and CastPP nodes. I added a bunch of ASCII diagrams illustrating what some of the more important methods are doing.
>>
>> ### Benchmarking
>>
>> **Note:** In some of these tests no reduction happens. I left them in to validate that no perf. regression happens in that case.
>> **Note 2:** Margin of error was negligible.
>>
>> | Benchmark                      | No RAM (ms/op) | Yes RAM (ms/op) |
>> |--------------------------------|----------------|-----------------|
>> | TestTrapAfterMerge             | 19.515         | 13.386          |
>> | TestArgEscape                  | 33.165         | 33.254          |
>> | TestCallTwoSide                | 70.547         | 69.427          |
>> | TestCmpAfterMerge              | 16.400         | 2.984           |
>> | TestCmpMergeWithNull_Second    | 27.204         | 27.293          |
>> | TestCmpMergeWithNull           | 8.248          | 4.920           |
>> | TestCondAfterMergeWithAllocate | 12.890         | 5.252           |
>> | TestCondAfterMergeWithNull     | 6.265          | 5.078           |
>> | TestCondLoadAfterMerge         | 12.713         | 5.163           |
>> | TestConsecutiveSimpleMerge     | 30.863         | 4.068           |
>> | TestDoubleIfElseMerge          | 16.069         | 2.444           |
>> | TestEscapeInCallAfterMerge     | 23.111         | 22.924          |
>> | TestGlobalEscape               | 14.459         | 14.425          |
>> | TestIfElseInLoop               | 246.061        | 42.786          |
>> | TestLoadAfterLoopAlias         | 45.808         | 45.812          |
>> ...
>
> Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision:
>
>   Addressing Ivanov's PR feedback.

Can someone please sponsor this PR?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/15825#issuecomment-2043499564

From mdoerr at openjdk.org Mon Apr 8 19:30:10 2024
From: mdoerr at openjdk.org (Martin Doerr)
Date: Mon, 8 Apr 2024 19:30:10 GMT
Subject: RFR: JDK-8329605: hs errfile generic events - introduce sections for Frequent/NotFrequent Events [v2]
In-Reply-To:
References: <5GN6AKI0ud3DgU7-RX2-12eu87Me8jhzKXA-L8BwR04=.384ddd36-1a8f-40ac-9387-5d8d97c37fe3@github.com>
Message-ID:

On Mon, 8 Apr 2024 13:45:29 GMT, Matthias Baesken wrote:

>> Currently the 'generic' hs_errfile Events message log (filled by Events::log) is rather flooded by messages for memory protection operations. Those seem to occur quite often and move out other less frequent events, because the number of entries in the log is limited.
>> It might be better to separate the frequent and less frequent events into 2 sections. The memory protection events would go into the frequent events section.
>> The mentioned memory protection operations related entries look like this : >> Event: 0.178 Protecting memory [0x000000016ebf0000,0x000000016ebfc000] with protection modes 0 > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Introduce separate nmethod flush log src/hotspot/share/utilities/events.cpp line 102: > 100: if (LogEvents) { > 101: _messages = new StringEventLog("Events", "events"); > 102: _nmethod_flush_messages = new StringEventLog("Nmethod flushs", "nmethodflushs"); Should be "flushes" or "flushing events". ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18626#discussion_r1556348203 From kvn at openjdk.org Mon Apr 8 19:57:12 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 8 Apr 2024 19:57:12 GMT Subject: RFR: JDK-8316991: Reduce nullable allocation merges [v10] In-Reply-To: References: Message-ID: On Thu, 28 Mar 2024 20:20:01 GMT, Cesar Soares Lucas wrote: >> ### Description >> >> Many, if not most, allocation merges (Phis) are nullable because they join object allocations with "NULL", or objects returned from method calls, etc. Please review this Pull Request that improves Reduce Allocation Merge implementation so that it can reduce at least some of these allocation merges. >> >> Overall, the improvements are related to 1) making rematerialization of merges able to represent "NULL" objects, and 2) being able to reduce merges used by CmpP/N and CastPP. >> >> The approach to reducing CmpP/N and CastPP is pretty similar to that used in the `MemNode::split_through_phi` method: a clone of the node being split is added on each input of the Phi. I make use of `optimize_ptr_compare` and some type information to remove redundant CmpP and CastPP nodes. I added a bunch of ASCII diagrams illustrating what some of the more important methods are doing. >> >> ### Benchmarking >> >> **Note:** In some of these tests no reduction happens. 
I left them in to validate that no perf. regression happens in that case.
>> **Note 2:** Margin of error was negligible.
>>
>> | Benchmark                      | No RAM (ms/op) | Yes RAM (ms/op) |
>> |--------------------------------|----------------|-----------------|
>> | TestTrapAfterMerge             | 19.515         | 13.386          |
>> | TestArgEscape                  | 33.165         | 33.254          |
>> | TestCallTwoSide                | 70.547         | 69.427          |
>> | TestCmpAfterMerge              | 16.400         | 2.984           |
>> | TestCmpMergeWithNull_Second    | 27.204         | 27.293          |
>> | TestCmpMergeWithNull           | 8.248          | 4.920           |
>> | TestCondAfterMergeWithAllocate | 12.890         | 5.252           |
>> | TestCondAfterMergeWithNull     | 6.265          | 5.078           |
>> | TestCondLoadAfterMerge         | 12.713         | 5.163           |
>> | TestConsecutiveSimpleMerge     | 30.863         | 4.068           |
>> | TestDoubleIfElseMerge          | 16.069         | 2.444           |
>> | TestEscapeInCallAfterMerge     | 23.111         | 22.924          |
>> | TestGlobalEscape               | 14.459         | 14.425          |
>> | TestIfElseInLoop               | 246.061        | 42.786          |
>> | TestLoadAfterLoopAlias         | 45.808         | 45.812          |
>> ...
>
> Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision:
>
>   Addressing Ivanov's PR feedback.

v09 has to be tested before integration. I submitted it.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/15825#issuecomment-2043535409

From eosterlund at openjdk.org Mon Apr 8 20:22:08 2024
From: eosterlund at openjdk.org (Erik Österlund)
Date: Mon, 8 Apr 2024 20:22:08 GMT
Subject: RFR: 8329629: GC interfaces should work directly against nmethod instead of CodeBlob
In-Reply-To:
References:
Message-ID:

On Fri, 5 Apr 2024 12:32:30 GMT, Stefan Karlsson wrote:

> The GCs scan and handle nmethods and ignore CodeBlobs of other kinds. I propose that we stop sending in CodeBlobs to the GCs and make sure to only give them nmethods.
>
> I removed `void CodeCache::blobs_do(CodeBlobClosure* f)` since there's no more usage of that function. Is this OK?
>
> I also opted to skip calling the GC verification code from the iterator code:
>
>     Universe::heap()->verify_nmethod((nmethod*)cb);
>
> IMHO, it is up to the GCs to decide if they want to perform extra nmethod verification. If someone wants to keep this verification in their favorite GC I can add calls to this function where we used to call CodeCache::blobs_do.
>
> I've only done limited testing and will run extensive testing concurrent with the review.

Marked as reviewed by eosterlund (Reviewer).

-------------

PR Review: https://git.openjdk.org/jdk/pull/18653#pullrequestreview-1987338271

From kvn at openjdk.org Mon Apr 8 20:33:34 2024
From: kvn at openjdk.org (Vladimir Kozlov)
Date: Mon, 8 Apr 2024 20:33:34 GMT
Subject: RFR: 8329628: Additional changes after JDK-8329332 [v2]
In-Reply-To:
References:
Message-ID:

> Additional clean up based on comments (mostly Stefan's) during reviews for [JDK-8329332: Remove CompiledMethod and CodeBlobLayout classes](https://bugs.openjdk.org/browse/JDK-8329332).
> - Renamed `CompiledMethod_lock` to `NMethod_lock`. (I decided to not change JVMTI's `CompiledMethod[Load|Unload]` names).
> - Renamed `NMethodIterator::all_blobs` to `NMethodIterator::all`.
> - Moved `get_deopt_original_pc()` method from `nmethod` to `frame` class.
> - Reverted `CodeCache::find_nmethod()` to previous functionality to allow return `nullptr` and be consistent with `find_blob()`.
> - Cleanup some `(nmethod*)` casts.
> - Use `for (CodeHeap* heap : *_nmethod_heaps)` in `CodeCache::nmethod_count()` (it was @stefank's suggestion, I don't know how this C++ magic works). I verified it running with `-XX:+PrintNMethodStatistics`.
> > Testing tier1-3,xcomp,stress Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: Addresse comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18665/files - new: https://git.openjdk.org/jdk/pull/18665/files/5a41bd5b..c33c9a6c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18665&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18665&range=00-01 Stats: 57 lines in 18 files changed: 8 ins; 0 del; 49 mod Patch: https://git.openjdk.org/jdk/pull/18665.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18665/head:pull/18665 PR: https://git.openjdk.org/jdk/pull/18665 From kvn at openjdk.org Mon Apr 8 20:33:34 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 8 Apr 2024 20:33:34 GMT Subject: RFR: 8329628: Additional changes after JDK-8329332 In-Reply-To: References: Message-ID: On Fri, 5 Apr 2024 23:39:49 GMT, Vladimir Kozlov wrote: > Additional clean up based on comments (mostly Stefan's) during reviews for [JDK-8329332: Remove CompiledMethod and CodeBlobLayout classes](https://bugs.openjdk.org/browse/JDK-8329332). > - Renamed `CompiledMethod_lock` to `NMethod_lock`. (I decided to not change JVMTI's `CompiledMethod[Load|Unload]` names). > - Renamed `NMethodIterator::all_blobs` to `NMethodIterator::all`. > - Moved `get_deopt_original_pc()` method from `nmethod` to `frame` class. > - Reverted `CodeCache::find_nmethod()` to previous functionality to allow return `nullptr` and be consistent with `find_blob()`. > - Cleanup some `(nmethod*)` casts. > - Use `for (CodeHeap* heap : *_nmethod_heaps) ` in `CodeCache::nmethod_count()` (it was @stefank suggestion, I don't know how this C++ magic works). I verified it running with `-XX:+PrintNMethodStatistics`. > > Testing tier1-3,xcomp,stress I did all suggested changes. 
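On the range-based `for` over `*_nmethod_heaps` mentioned in the list above: the loop works for any type that exposes `begin()` and `end()`, which is presumably what the HotSpot array type provides. The following is only a minimal sketch with hypothetical stand-in types (`Heap`, `SimpleArray`, `total_nmethods`); none of these names come from the HotSpot sources.

```cpp
#include <cstddef>

// Hypothetical stand-in for a code heap; it only carries a counter.
struct Heap {
  int nmethods;
};

// Hypothetical minimal array wrapper. A range-based for loop only
// requires begin() and end() on the object being iterated.
template <typename T>
class SimpleArray {
  T*          _data;
  std::size_t _len;

 public:
  SimpleArray(T* data, std::size_t len) : _data(data), _len(len) {}
  T* begin() const { return _data; }
  T* end()   const { return _data + _len; }
};

// `for (Heap* heap : *heaps)` expands roughly to:
//   for (auto it = heaps->begin(); it != heaps->end(); ++it) {
//     Heap* heap = *it;
//     ...
//   }
// The `*heaps` dereference is needed because `heaps` is a pointer
// to the container, not the container itself.
int total_nmethods(const SimpleArray<Heap*>* heaps) {
  int total = 0;
  for (Heap* heap : *heaps) {
    total += heap->nmethods;
  }
  return total;
}
```

Calling `total_nmethods` on a `SimpleArray<Heap*>` that wraps two heaps with counts 3 and 4 would sum them the same way an explicit index loop would.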
-------------

PR Comment: https://git.openjdk.org/jdk/pull/18665#issuecomment-2043591930

From matsaave at openjdk.org Mon Apr 8 21:13:13 2024
From: matsaave at openjdk.org (Matias Saavedra Silva)
Date: Mon, 8 Apr 2024 21:13:13 GMT
Subject: RFR: 8329728: Read arbitrarily long lines in ClassListParser
In-Reply-To:
References:
Message-ID:

On Mon, 8 Apr 2024 04:51:31 GMT, Ioi Lam wrote:

> Today the `ClassListParser` has a hard-coded limit of 4096 chars for each line in the CDS class list file. However, it's possible for a line to be much longer than that (64KB for the class name, plus extra information that can include path names, IDs, etc).
>
> I wrote a utility class `LineReader` that automatically allocates a buffer before calling `fgets()`. Hopefully this can be useful for other cases where we call `fgets()` with a fixed buffer size.

Overall this looks good! I had some considerations that we discussed offline which I posted below.

As we talked about, a size limit of `INT_MAX` is probably too large and is difficult to test or could lead to crashes in the case of overflows due to the use of fgets and os::malloc. You mentioned 1MB as a reasonable max size but I think you may want a larger value since the max symbol size is 64k. Is 16 symbols per line enough?

src/hotspot/share/utilities/lineReader.cpp line 51:

> 49: }
> 50:
> 51: char* LineReader::read_line() {

Do you think it's worth adding an assert here to make sure the `LineReader` has been initialized?

test/hotspot/jtreg/runtime/cds/appcds/customLoader/ClassListFormatA.java line 131:

> 129: CDSTestUtils.createArchiveAndCheck(opts)
> 130: .shouldContain("Preload Warning: Cannot find " + longName)
> 131: .shouldContain("Preload Warning: Cannot find No/Such/ClassABCD");

Could you add a test that checks a line of max size to test overflows?
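To make the buffer-growth and size-cap discussion above concrete, here is a minimal sketch of reading arbitrarily long lines with `fgets()` and a doubling buffer. It is not the `LineReader` from the pull request: the class name `GrowingLineReader`, its constructor parameters, and the use of plain `malloc`/`realloc` instead of HotSpot's `os::malloc` are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Hypothetical sketch; names and policy are illustrative only.
class GrowingLineReader {
  FILE*       _file;
  char*       _buffer;
  std::size_t _buffer_len;
  std::size_t _max_len;   // assumed to be <= INT_MAX, since fgets takes an int

 public:
  GrowingLineReader(FILE* file, std::size_t initial_len, std::size_t max_len)
    : _file(file),
      _buffer(static_cast<char*>(std::malloc(initial_len))),
      _buffer_len(initial_len),
      _max_len(max_len) {}

  ~GrowingLineReader() { std::free(_buffer); }

  GrowingLineReader(const GrowingLineReader&) = delete;
  GrowingLineReader& operator=(const GrowingLineReader&) = delete;

  // Returns the next line without its trailing '\n', or nullptr at EOF,
  // on allocation failure, or when a line would need more than _max_len
  // bytes. The returned pointer is only valid until the next call.
  char* read_line() {
    if (_buffer == nullptr || _buffer_len < 2) return nullptr;
    std::size_t used = 0;
    for (;;) {
      int room = static_cast<int>(_buffer_len - used);
      if (std::fgets(_buffer + used, room, _file) == nullptr) {
        // EOF: hand back a partial last line if we already have one.
        return used > 0 ? _buffer : nullptr;
      }
      used += std::strlen(_buffer + used);
      if (used > 0 && _buffer[used - 1] == '\n') {
        _buffer[used - 1] = '\0';   // complete line, strip the newline
        return _buffer;
      }
      if (used + 1 < _buffer_len) {
        return _buffer;             // no newline and buffer not full: last line
      }
      // Buffer is full and the line continues: double it, up to the cap.
      std::size_t new_len = _buffer_len * 2;
      if (new_len > _max_len) return nullptr;
      char* grown = static_cast<char*>(std::realloc(_buffer, new_len));
      if (grown == nullptr) return nullptr;
      _buffer = grown;
      _buffer_len = new_len;
    }
  }
};
```

Starting from a deliberately small initial size (for example 16 bytes) exercises the growth path on almost every line, and the explicit cap gives a test a concrete boundary to probe. This sketch does not handle `\r\n` line endings or embedded NUL bytes.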
------------- PR Review: https://git.openjdk.org/jdk/pull/18669#pullrequestreview-1987399960 PR Review Comment: https://git.openjdk.org/jdk/pull/18669#discussion_r1556429231 PR Review Comment: https://git.openjdk.org/jdk/pull/18669#discussion_r1556428800 From dlong at openjdk.org Mon Apr 8 21:13:16 2024 From: dlong at openjdk.org (Dean Long) Date: Mon, 8 Apr 2024 21:13:16 GMT Subject: RFR: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 [v2] In-Reply-To: References: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> Message-ID: On Mon, 8 Apr 2024 21:10:04 GMT, Dean Long wrote: >> David Holmes has updated the pull request incrementally with two additional commits since the last revision: >> >> - s/lw/lwu to zero extend flags value >> - s/lwz/ld/ > > src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp line 1052: > >> 1050: Label L_no_warn; >> 1051: __ ldr(rscratch1, Address(rthread, JavaThread::jni_monitor_count_offset())); >> 1052: __ cbz(rscratch1, L_no_warn); > > Suggestion: > > __ cbz(rscratch1, L_skip_vthread_code); Same for other ports. If the count is already 0, no need to store 0 below. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18445#discussion_r1556434268 From dlong at openjdk.org Mon Apr 8 21:13:15 2024 From: dlong at openjdk.org (Dean Long) Date: Mon, 8 Apr 2024 21:13:15 GMT Subject: RFR: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 [v2] In-Reply-To: References: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> Message-ID: On Mon, 8 Apr 2024 12:35:44 GMT, David Holmes wrote: >> The crux of the problem here is that the virtual thread code was not keeping the held-monitor-count and jni-monitor-count in sync under all conditions. 
So if a vthread acquired a monitor via JNI but failed to unlock it before terminating, the underlying platform thread's counts were out of sync and if it terminated we would trigger the assertion that checks for such things. >> >> The actual fix is very simple: we zero the platform thread's jni-monitor-count in `continuation_enter_cleanup` the same way we zero the held-monitor-count. In addition we apply the same `CheckJNICalls` check for this unbalanced locking and issue a warning in the virtual thread case. That fact this happens in asm code complicates matters. >> >> The existing `JNIMonitor.java` test is greatly expanded to test these scenarios and check the unified logging output. >> >> Other minor changes involve expanding some of the other assertions relating to the two counts so we can detected a mismatch earlier without a need for the thread to terminate. And the test that original uncovered the problem (`GetOwnedMonitorInfoTest.java`) has some minor adjustments to enhance diagnostics. >> >> I've provided the fix for all architectures that support continuations: x64, aarch64, riscv and ppc. The latter both build okay in GHA but I can't actually test them with the updated test. So some assistance from RISCV folk (@robehn ?) and PPC folk (??) would be appreciated (otherwise any issues will have to be handled as follow up fixes >> >> The changes are structured so that there is no extra code executed in product builds unless `CheckJNICalls` is set. This means that product builds will not keep the JNI count in sync with the held count, unless `CheckJNICalls` is set. This could trip up a future logging entry or explicit check of the JNI count, but it is expected that these counts will be removed once ObjectMonitor usage will not force virtual thread pinning. >> >> Testing: >> - regression test 10x on all x64 and aarch64 platforms >> - tiers 1-4 >> - GHA >> >> >> Thanks to @pchilano for help working out the best form of the fix and the initial asm for x64. 
>> >> Thanks to @fbredber for the Aarch64 and RISCV asm code. >> >> Thanks > > David Holmes has updated the pull request incrementally with two additional commits since the last revision: > > - s/lw/lwu to zero extend flags value > - s/lwz/ld/ src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp line 1052: > 1050: Label L_no_warn; > 1051: __ ldr(rscratch1, Address(rthread, JavaThread::jni_monitor_count_offset())); > 1052: __ cbz(rscratch1, L_no_warn); Suggestion: __ cbz(rscratch1, L_skip_vthread_code); ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18445#discussion_r1556433882 From vlivanov at openjdk.org Mon Apr 8 21:55:10 2024 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Mon, 8 Apr 2024 21:55:10 GMT Subject: RFR: 8328181: C2: assert(MaxVectorSize >= 32) failed: vector length should be >= 32 [v2] In-Reply-To: References: Message-ID: On Mon, 8 Apr 2024 02:35:33 GMT, Jatin Bhateja wrote: >> This bug fix patch tightens the predication check for small constant length clear array pattern and relaxes associated feature checks. Modified few comments for clarity. >> >> Kindly review and approve. >> >> Best Regards, >> Jatin > > Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: > > Cleanup predicates. Looks good. ------------- Marked as reviewed by vlivanov (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18464#pullrequestreview-1987463585 From dlong at openjdk.org Mon Apr 8 21:57:01 2024 From: dlong at openjdk.org (Dean Long) Date: Mon, 8 Apr 2024 21:57:01 GMT Subject: RFR: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 [v2] In-Reply-To: References: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> Message-ID: <-hXoWipz6wZ8nU0eev9FsVk9j4CY2D_TkqwFHxXgz7E=.002161e1-2c58-45ee-9bb3-d435937d1177@github.com> On Mon, 8 Apr 2024 12:35:44 GMT, David Holmes wrote: >> The crux of the problem here is that the virtual thread code was not keeping the held-monitor-count and jni-monitor-count in sync under all conditions. So if a vthread acquired a monitor via JNI but failed to unlock it before terminating, the underlying platform thread's counts were out of sync and if it terminated we would trigger the assertion that checks for such things. >> >> The actual fix is very simple: we zero the platform thread's jni-monitor-count in `continuation_enter_cleanup` the same way we zero the held-monitor-count. In addition we apply the same `CheckJNICalls` check for this unbalanced locking and issue a warning in the virtual thread case. That fact this happens in asm code complicates matters. >> >> The existing `JNIMonitor.java` test is greatly expanded to test these scenarios and check the unified logging output. >> >> Other minor changes involve expanding some of the other assertions relating to the two counts so we can detected a mismatch earlier without a need for the thread to terminate. And the test that original uncovered the problem (`GetOwnedMonitorInfoTest.java`) has some minor adjustments to enhance diagnostics. >> >> I've provided the fix for all architectures that support continuations: x64, aarch64, riscv and ppc. The latter both build okay in GHA but I can't actually test them with the updated test. 
So some assistance from RISCV folk (@robehn ?) and PPC folk (??) would be appreciated (otherwise any issues will have to be handled as follow up fixes >> >> The changes are structured so that there is no extra code executed in product builds unless `CheckJNICalls` is set. This means that product builds will not keep the JNI count in sync with the held count, unless `CheckJNICalls` is set. This could trip up a future logging entry or explicit check of the JNI count, but it is expected that these counts will be removed once ObjectMonitor usage will not force virtual thread pinning. >> >> Testing: >> - regression test 10x on all x64 and aarch64 platforms >> - tiers 1-4 >> - GHA >> >> >> Thanks to @pchilano for help working out the best form of the fix and the initial asm for x64. >> >> Thanks to @fbredber for the Aarch64 and RISCV asm code. >> >> Thanks > > David Holmes has updated the pull request incrementally with two additional commits since the last revision: > > - s/lw/lwu to zero extend flags value > - s/lwz/ld/ Isn't continuation_enter_cleanup() also used by yield()? How does monitorEnter(); Thread.yield(); monitorExit() not result in the JNI lock count going to -1? ------------- PR Comment: https://git.openjdk.org/jdk/pull/18445#issuecomment-2043705770 From ccheung at openjdk.org Mon Apr 8 22:10:09 2024 From: ccheung at openjdk.org (Calvin Cheung) Date: Mon, 8 Apr 2024 22:10:09 GMT Subject: RFR: 8329728: Read arbitrarily long lines in ClassListParser In-Reply-To: References: Message-ID: On Mon, 8 Apr 2024 04:51:31 GMT, Ioi Lam wrote: > Today the `ClassListParser` has a hard-coded limit of 4096 chars for each line in the CDS class list file. However, it's possible for a line to be much longer than than (64KB for the class name, plus extra information that can include path names, IDs, etc). > > I wrote a utility class `LineReader` that automatically allocates a buffer before calling `fgets()`. 
Hopefully this can be useful for other cases where we call `fgets()` with a fixed buffer size. src/hotspot/share/utilities/lineReader.cpp line 44: > 42: void LineReader::init(FILE* file) { > 43: _file = file; > 44: _buffer_len = 16; // start at small size to test expansion logic Maybe set the `_buffer_len` to a larger value (256?) for non-debug build? src/hotspot/share/utilities/lineReader.hpp line 48: > 46: } > 47: > 48: // Return one line from _file, as a NUL-terminated string. The length and contents of this Suggestion: NUL-terminated -> null-terminated ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18669#discussion_r1556473848 PR Review Comment: https://git.openjdk.org/jdk/pull/18669#discussion_r1556475659 From dlong at openjdk.org Mon Apr 8 22:19:12 2024 From: dlong at openjdk.org (Dean Long) Date: Mon, 8 Apr 2024 22:19:12 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v7] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: On Mon, 8 Apr 2024 19:11:19 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. >> >> Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). >> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Add movq to locate_operand Thanks, I see that my ideas have pretty much already been discussed in https://github.com/openjdk/jdk/pull/16760. 
I might have missed it, but has the possibility of always setting the aligned interior region with 8 byte stores been discussed? A literal reading of the javadoc seems to disallow it, but it seems like it should be allowed based on memory coherence. Only the unaligned head and tail would need special treatment. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18555#issuecomment-2043732829 From dholmes at openjdk.org Mon Apr 8 22:42:21 2024 From: dholmes at openjdk.org (David Holmes) Date: Mon, 8 Apr 2024 22:42:21 GMT Subject: RFR: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 [v3] In-Reply-To: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> References: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> Message-ID: > The crux of the problem here is that the virtual thread code was not keeping the held-monitor-count and jni-monitor-count in sync under all conditions. So if a vthread acquired a monitor via JNI but failed to unlock it before terminating, the underlying platform thread's counts were out of sync and if it terminated we would trigger the assertion that checks for such things. > > The actual fix is very simple: we zero the platform thread's jni-monitor-count in `continuation_enter_cleanup` the same way we zero the held-monitor-count. In addition we apply the same `CheckJNICalls` check for this unbalanced locking and issue a warning in the virtual thread case. That fact this happens in asm code complicates matters. > > The existing `JNIMonitor.java` test is greatly expanded to test these scenarios and check the unified logging output. > > Other minor changes involve expanding some of the other assertions relating to the two counts so we can detected a mismatch earlier without a need for the thread to terminate. 
And the test that original uncovered the problem (`GetOwnedMonitorInfoTest.java`) has some minor adjustments to enhance diagnostics. > > I've provided the fix for all architectures that support continuations: x64, aarch64, riscv and ppc. The latter both build okay in GHA but I can't actually test them with the updated test. So some assistance from RISCV folk (@robehn ?) and PPC folk (??) would be appreciated (otherwise any issues will have to be handled as follow up fixes > > The changes are structured so that there is no extra code executed in product builds unless `CheckJNICalls` is set. This means that product builds will not keep the JNI count in sync with the held count, unless `CheckJNICalls` is set. This could trip up a future logging entry or explicit check of the JNI count, but it is expected that these counts will be removed once ObjectMonitor usage will not force virtual thread pinning. > > Testing: > - regression test 10x on all x64 and aarch64 platforms > - tiers 1-4 > - GHA > > > Thanks to @pchilano for help working out the best form of the fix and the initial asm for x64. > > Thanks to @fbredber for the Aarch64 and RISCV asm code. 
> > Thanks David Holmes has updated the pull request incrementally with two additional commits since the last revision: - s/cmpwi/cmpdi for intx value - Restore use of lwz for loading int flags ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18445/files - new: https://git.openjdk.org/jdk/pull/18445/files/bfe62751..59d93389 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18445&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18445&range=01-02 Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/18445.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18445/head:pull/18445 PR: https://git.openjdk.org/jdk/pull/18445 From dholmes at openjdk.org Mon Apr 8 22:42:21 2024 From: dholmes at openjdk.org (David Holmes) Date: Mon, 8 Apr 2024 22:42:21 GMT Subject: RFR: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 [v2] In-Reply-To: References: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> Message-ID: On Mon, 8 Apr 2024 13:13:17 GMT, Fredrik Bredberg wrote: >> David Holmes has updated the pull request incrementally with two additional commits since the last revision: >> >> - s/lw/lwu to zero extend flags value >> - s/lwz/ld/ > > src/hotspot/cpu/ppc/sharedRuntime_ppc.cpp line 1660: > >> 1658: // Check if this is a virtual thread continuation >> 1659: Label L_skip_vthread_code; >> 1660: __ ld(R0, in_bytes(ContinuationEntry::flags_offset()), R1_SP); > > Change this back to `lwz` since `_flags` is of type `int`. Doh! Done. > src/hotspot/cpu/ppc/sharedRuntime_ppc.cpp line 1666: > >> 1664: Label L_no_warn; >> 1665: __ ld(R0, in_bytes(JavaThread::jni_monitor_count_offset()), R16_thread); >> 1666: __ cmpwi(CCR0, R0, 0); > > This should be a `cmpdi` since `_jni_monitor_count` is an `intx`. Good catch! Fixed. 
Thanks > src/hotspot/cpu/ppc/sharedRuntime_ppc.cpp line 1691: > >> 1689: // Check if this is a virtual thread continuation >> 1690: Label L_skip_vthread_code; >> 1691: __ ld(R0, in_bytes(ContinuationEntry::flags_offset()), R1_SP); > > Change this back to `lwz` since `_flags` is of type `int`. Done ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18445#discussion_r1556498829 PR Review Comment: https://git.openjdk.org/jdk/pull/18445#discussion_r1556499870 PR Review Comment: https://git.openjdk.org/jdk/pull/18445#discussion_r1556498928 From dholmes at openjdk.org Mon Apr 8 23:01:11 2024 From: dholmes at openjdk.org (David Holmes) Date: Mon, 8 Apr 2024 23:01:11 GMT Subject: RFR: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 [v2] In-Reply-To: <-hXoWipz6wZ8nU0eev9FsVk9j4CY2D_TkqwFHxXgz7E=.002161e1-2c58-45ee-9bb3-d435937d1177@github.com> References: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> <-hXoWipz6wZ8nU0eev9FsVk9j4CY2D_TkqwFHxXgz7E=.002161e1-2c58-45ee-9bb3-d435937d1177@github.com> Message-ID: On Mon, 8 Apr 2024 21:54:44 GMT, Dean Long wrote: > Isn't continuation_enter_cleanup() also used by yield()? @dean-long In general yes but in this case the virtual thread is pinned so the cleanup is skipped. These monitor counts only exist to force pinning. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18445#issuecomment-2043770884 From dholmes at openjdk.org Mon Apr 8 23:01:11 2024 From: dholmes at openjdk.org (David Holmes) Date: Mon, 8 Apr 2024 23:01:11 GMT Subject: RFR: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 [v2] In-Reply-To: References: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> Message-ID: On Mon, 8 Apr 2024 21:10:34 GMT, Dean Long wrote: > If the count is already 0, no need to store 0 below. 
Good observation. I'm updating x64 and aarch64 and will re-test, then do the other ports. Thanks ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18445#discussion_r1556510408 From kvn at openjdk.org Mon Apr 8 23:16:14 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 8 Apr 2024 23:16:14 GMT Subject: RFR: JDK-8316991: Reduce nullable allocation merges [v10] In-Reply-To: References: Message-ID: On Thu, 28 Mar 2024 20:20:01 GMT, Cesar Soares Lucas wrote: >> ### Description >> >> Many, if not most, allocation merges (Phis) are nullable because they join object allocations with "NULL", or objects returned from method calls, etc. Please review this Pull Request that improves the Reduce Allocation Merge implementation so that it can reduce at least some of these allocation merges. >> >> Overall, the improvements are related to 1) making rematerialization of merges able to represent "NULL" objects, and 2) being able to reduce merges used by CmpP/N and CastPP. >> >> The approach to reducing CmpP/N and CastPP is pretty similar to that used in the `MemNode::split_through_phi` method: a clone of the node being split is added on each input of the Phi. I make use of `optimize_ptr_compare` and some type information to remove redundant CmpP and CastPP nodes. I added a bunch of ASCII diagrams illustrating what some of the more important methods are doing. >> >> ### Benchmarking >> >> **Note:** In some of these tests no reduction happens. I left them in to validate that no perf. regression happens in that case. >> **Note 2:** Margin of error was negligible.
>>
>> | Benchmark                            | No RAM (ms/op)   | Yes RAM (ms/op)   |
>> |--------------------------------------|------------------|-------------------|
>> | TestTrapAfterMerge                   | 19.515           | 13.386            |
>> | TestArgEscape                        | 33.165           | 33.254            |
>> | TestCallTwoSide                      | 70.547           | 69.427            |
>> | TestCmpAfterMerge                    | 16.400           | 2.984             |
>> | TestCmpMergeWithNull_Second          | 27.204           | 27.293            |
>> | TestCmpMergeWithNull                 | 8.248            | 4.920             |
>> | TestCondAfterMergeWithAllocate       | 12.890           | 5.252             |
>> | TestCondAfterMergeWithNull           | 6.265            | 5.078             |
>> | TestCondLoadAfterMerge               | 12.713           | 5.163             |
>> | TestConsecutiveSimpleMerge           | 30.863           | 4.068             |
>> | TestDoubleIfElseMerge                | 16.069           | 2.444             |
>> | TestEscapeInCallAfterMerge           | 23.111           | 22.924            |
>> | TestGlobalEscape                     | 14.459           | 14.425            |
>> | TestIfElseInLoop                     | 246.061          | 42.786            |
>> | TestLoadAfterLoopAlias               | 45.808           | 45.812            |
>> ...
>
> Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision:
>
>   Addressing Ivanov's PR feedback.

My testing passed.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/15825#issuecomment-2043800368

From cslucas at openjdk.org Mon Apr 8 23:16:15 2024
From: cslucas at openjdk.org (Cesar Soares Lucas)
Date: Mon, 8 Apr 2024 23:16:15 GMT
Subject: Integrated: JDK-8316991: Reduce nullable allocation merges
In-Reply-To:
References:
Message-ID:

On Tue, 19 Sep 2023 18:54:34 GMT, Cesar Soares Lucas wrote:

> ### Description
>
> Many, if not most, allocation merges (Phis) are nullable because they join object allocations with "NULL", or objects returned from method calls, etc. Please review this Pull Request that improves Reduce Allocation Merge implementation so that it can reduce at least some of these allocation merges.
>
> Overall, the improvements are related to 1) making rematerialization of merges able to represent "NULL" objects, and 2) being able to reduce merges used by CmpP/N and CastPP.
>
> The approach to reducing CmpP/N and CastPP is pretty similar to that used in the `MemNode::split_through_phi` method: a clone of the node being split is added on each input of the Phi. I make use of `optimize_ptr_compare` and some type information to remove redundant CmpP and CastPP nodes. I added a bunch of ASCII diagrams illustrating what some of the more important methods are doing.
>
> ### Benchmarking
>
> **Note:** In some of these tests no reduction happens. I left them in to validate that no perf. regression happens in that case.
> **Note 2:** Margin of error was negligible.
>
> | Benchmark                            | No RAM (ms/op)   | Yes RAM (ms/op)   |
> |--------------------------------------|------------------|-------------------|
> | TestTrapAfterMerge                   | 19.515           | 13.386            |
> | TestArgEscape                        | 33.165           | 33.254            |
> | TestCallTwoSide                      | 70.547           | 69.427            |
> | TestCmpAfterMerge                    | 16.400           | 2.984             |
> | TestCmpMergeWithNull_Second          | 27.204           | 27.293            |
> | TestCmpMergeWithNull                 | 8.248            | 4.920             |
> | TestCondAfterMergeWithAllocate       | 12.890           | 5.252             |
> | TestCondAfterMergeWithNull           | 6.265            | 5.078             |
> | TestCondLoadAfterMerge               | 12.713           | 5.163             |
> | TestConsecutiveSimpleMerge           | 30.863           | 4.068             |
> | TestDoubleIfElseMerge                | 16.069           | 2.444             |
> | TestEscapeInCallAfterMerge           | 23.111           | 22.924            |
> | TestGlobalEscape                     | 14.459           | 14.425            |
> | TestIfElseInLoop                     | 246.061          | 42.786            |
> | TestLoadAfterLoopAlias               | 45.808           | 45.812            |
> | TestLoadAfterTrap                    | 28.370           | ...

This pull request has now been integrated.
Changeset: a887fd21 Author: Cesar Soares Lucas Committer: Vladimir Kozlov URL: https://git.openjdk.org/jdk/commit/a887fd2144ce067844f18a514afb5078255601ff Stats: 2417 lines in 13 files changed: 2154 ins; 91 del; 172 mod 8316991: Reduce nullable allocation merges Reviewed-by: kvn, vlivanov ------------- PR: https://git.openjdk.org/jdk/pull/15825 From dlong at openjdk.org Mon Apr 8 23:27:11 2024 From: dlong at openjdk.org (Dean Long) Date: Mon, 8 Apr 2024 23:27:11 GMT Subject: RFR: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 [v3] In-Reply-To: References: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> Message-ID: On Mon, 8 Apr 2024 22:42:21 GMT, David Holmes wrote:
> David Holmes has updated the pull request incrementally with two additional commits since the last revision: > > - s/cmpwi/cmpdi for intx value > - Restore use of lwz for loading int flags The tests expect an exit value of 0, but for some reason why I run it they are returning 1. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18445#issuecomment-2043827303 From dlong at openjdk.org Mon Apr 8 23:54:10 2024 From: dlong at openjdk.org (Dean Long) Date: Mon, 8 Apr 2024 23:54:10 GMT Subject: RFR: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 [v3] In-Reply-To: References: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> Message-ID: On Mon, 8 Apr 2024 23:24:30 GMT, Dean Long wrote: > The tests expect an exit value of 0, but for some reason why I run it they are returning 1. Nevermind, user error.
------------- PR Comment: https://git.openjdk.org/jdk/pull/18445#issuecomment-2043882758 From dholmes at openjdk.org Tue Apr 9 00:58:24 2024 From: dholmes at openjdk.org (David Holmes) Date: Tue, 9 Apr 2024 00:58:24 GMT Subject: RFR: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 [v4] In-Reply-To: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> References: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> Message-ID:
David Holmes has updated the pull request incrementally with one additional commit since the last revision: Avoid unnecessary store when count was already zero. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18445/files - new: https://git.openjdk.org/jdk/pull/18445/files/59d93389..5891800b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18445&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18445&range=02-03 Stats: 30 lines in 4 files changed: 13 ins; 17 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18445.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18445/head:pull/18445 PR: https://git.openjdk.org/jdk/pull/18445 From lmesnik at openjdk.org Tue Apr 9 01:12:12 2024 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Tue, 9 Apr 2024 01:12:12 GMT Subject: RFR: 8329491: GetThreadListStackTraces function should use JvmtiHandshake [v2] In-Reply-To: References: <56L6f8XFyrB_cUSPTLWNIVhO0PU4w3PjRnpA5U7y_aI=.906bf099-af40-4192-a205-f84120e99ec8@github.com> Message-ID: On Tue, 2 Apr 2024 23:52:33 GMT, Serguei Spitsyn wrote: >> The internal JVM TI `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes were introduced in the JDK 22 to
unify/simplify the JVM TI functions supporting implementation of the virtual threads. This enhancement is to refactor the JVM TI function `GetThreadListStackTraces` on the base of `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes. >> >> Testing: >> - Ran mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: cleanup - removed temporary logging used for debugging Changes requested by lmesnik (Reviewer). src/hotspot/share/prims/jvmtiEnvBase.cpp line 2070: > 2068: void > 2069: GetSingleStackTraceClosure::do_thread(Thread *target) { > 2070: doit(); I think it makes sense to check that the target is the same as _target_jt. So we don't call it with arbitrary threads. or require parameter to be null if you want. Same for do_vthread. ------------- PR Review: https://git.openjdk.org/jdk/pull/18574#pullrequestreview-1987905071 PR Review Comment: https://git.openjdk.org/jdk/pull/18574#discussion_r1556678646 From lmesnik at openjdk.org Tue Apr 9 01:22:09 2024 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Tue, 9 Apr 2024 01:22:09 GMT Subject: RFR: 8329432: PopFrame and ForceEarlyReturn functions should use JvmtiHandshake In-Reply-To: <5tcPHZX0nNTHbQqZfHRl2riTpJglQyGJ2hRJXyIMZPY=.4de7ac6d-dd84-4943-bab1-5dba67bf5cf0@github.com> References: <5tcPHZX0nNTHbQqZfHRl2riTpJglQyGJ2hRJXyIMZPY=.4de7ac6d-dd84-4943-bab1-5dba67bf5cf0@github.com> Message-ID: On Tue, 2 Apr 2024 00:22:28 GMT, Serguei Spitsyn wrote: > The internal JVM TI `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. This enhancement is to refactor JVM TI functions `PopFrame` and `ForceEarlyReturn` on the base of `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes. > > Testing: > > Ran mach5 tiers 1-6 Changes requested by lmesnik (Reviewer). 
src/hotspot/share/prims/jvmtiEnvBase.hpp line 503: > 501: _value(value), > 502: _tos(tos) {} > 503: void doit(Thread *target, bool self); No need to use self, you might use _self from doit(). src/hotspot/share/prims/jvmtiEnvBase.hpp line 508: > 506: } > 507: void do_vthread(Handle target_h) { > 508: assert(_target_jt != nullptr, "sanity check"); Better to test that target_h is same as _target_jt. ------------- PR Review: https://git.openjdk.org/jdk/pull/18570#pullrequestreview-1987919902 PR Review Comment: https://git.openjdk.org/jdk/pull/18570#discussion_r1556693843 PR Review Comment: https://git.openjdk.org/jdk/pull/18570#discussion_r1556694346 From jbhateja at openjdk.org Tue Apr 9 01:40:14 2024 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 9 Apr 2024 01:40:14 GMT Subject: Integrated: 8328181: C2: assert(MaxVectorSize >= 32) failed: vector length should be >= 32 In-Reply-To: References: Message-ID: On Sun, 24 Mar 2024 09:58:59 GMT, Jatin Bhateja wrote: > This bug fix patch tightens the predication check for small constant length clear array pattern and relaxes associated feature checks. Modified few comments for clarity. > > Kindly review and approve. > > Best Regards, > Jatin This pull request has now been integrated. 
Changeset: fbc1e666 Author: Jatin Bhateja URL: https://git.openjdk.org/jdk/commit/fbc1e6661e26c30a9cf7bc57afd70fde1c642bcb Stats: 19 lines in 5 files changed: 2 ins; 3 del; 14 mod 8328181: C2: assert(MaxVectorSize >= 32) failed: vector length should be >= 32 Reviewed-by: kvn, vlivanov ------------- PR: https://git.openjdk.org/jdk/pull/18464 From aph at openjdk.org Tue Apr 9 06:57:00 2024 From: aph at openjdk.org (Andrew Haley) Date: Tue, 9 Apr 2024 06:57:00 GMT Subject: RFR: 8325821: [REDO] use "dmb.ishst+dmb.ishld" for release barrier [v3] In-Reply-To: References: Message-ID: <84NvFcV1EZq8MU3FR66BPQP-5bVPYict9dbN7XhzWvY=.fe7eae8e-a105-46f3-a1a7-fefd5cddd074@github.com> On Mon, 8 Apr 2024 11:58:22 GMT, kuaiwei wrote: >> The original patch for https://bugs.openjdk.org/browse/JDK-8324186 has 2 issues: >> 1 It shows a regression on some platforms, such as Apple silicon on macOS >> 2 It cannot handle instruction sequences like "dmb.ishld; dmb.ishst; dmb.ishld; dmb.ishld" >> >> These can be fixed by: >> 1 Enabling AlwaysMergeDMB by default, and disabling it only on architectures where we see a performance improvement (N1 or N2) >> 2 Checking for the special pattern and merging the subsequent dmb. >> >> It also fixes a bug where st/ld/dmb cannot be merged while the code buffer is expanding. I added unit tests for these. >> >> This patch still has an unhandled case. For instructions like "dmb.ishld; dmb.ishst; dmb.ish", it merges the last 2 instructions but cannot merge all three, because merging all previous dmbs when emitting dmb.ish would shrink the code buffer. I think that may break some assumptions, and I think it's not a common pattern. >> >> - Update: >> After discussion, I made a new implementation based on a finite state machine for merging instructions. A mergeable instruction stays pending in the FSM until the next unmergeable instruction. > > kuaiwei has updated the pull request incrementally with one additional commit since the last revision: > > Fix cross build error So I had another thought.
I think it'd be easier, and more robust, if you kept the contents of the code buffer valid at all times, and transitioned the state machine to its initial state whenever there is any attempt to get `CodeBuffer::offset()`. That would also minimize the impact of this patch on the rest of the code. You can always back up when you see something like `dmb st; dmb ld; dmb ish`. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18467#issuecomment-2044270621 From stefank at openjdk.org Tue Apr 9 06:59:08 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 9 Apr 2024 06:59:08 GMT Subject: RFR: 8329628: Additional changes after JDK-8329332 [v2] In-Reply-To: References: Message-ID: On Mon, 8 Apr 2024 20:33:34 GMT, Vladimir Kozlov wrote: >> Additional clean up based on comments (mostly Stefan's) during reviews for [JDK-8329332: Remove CompiledMethod and CodeBlobLayout classes](https://bugs.openjdk.org/browse/JDK-8329332). >> - Renamed `CompiledMethod_lock` to `NMethod_lock`. (I decided to not change JVMTI's `CompiledMethod[Load|Unload]` names). >> - Renamed `NMethodIterator::all_blobs` to `NMethodIterator::all`. >> - Moved `get_deopt_original_pc()` method from `nmethod` to `frame` class. >> - Reverted `CodeCache::find_nmethod()` to previous functionality to allow returning `nullptr` and be consistent with `find_blob()`. >> - Cleanup some `(nmethod*)` casts. >> - Use `for (CodeHeap* heap : *_nmethod_heaps) ` in `CodeCache::nmethod_count()` (it was @stefank's suggestion, I don't know how this C++ magic works). I verified it running with `-XX:+PrintNMethodStatistics`. >> >> Testing tier1-3,xcomp,stress > > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Addresse comments Marked as reviewed by stefank (Reviewer).
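
On the range-based `for` mentioned in the description quoted above, a minimal sketch may help. It assumes only that the container exposes `begin()` and `end()`; `Heap`, `HeapList`, and `count_all` are hypothetical stand-ins invented for illustration and are not HotSpot's `CodeHeap`, its heap list type, or `CodeCache::nmethod_count()`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a code heap; not HotSpot's CodeHeap.
struct Heap {
  int nmethods;  // number of compiled methods stored in this heap
};

// Minimal container usable in a range-based for: the compiler only needs
// begin() and end() returning something that behaves like an iterator.
class HeapList {
 public:
  void add(Heap* h) { _heaps.push_back(h); }
  Heap* const* begin() const { return _heaps.data(); }
  Heap* const* end() const { return _heaps.data() + _heaps.size(); }

 private:
  std::vector<Heap*> _heaps;
};

// `list` is a pointer, so it is dereferenced once to obtain the range,
// mirroring the `*_nmethod_heaps` spelling in the quoted description.
int count_all(const HeapList* list) {
  assert(list != nullptr);
  int total = 0;
  for (Heap* heap : *list) {  // expands to a loop from begin() to end()
    total += heap->nmethods;
  }
  return total;
}
```

The loop is shorthand for iterating from `list->begin()` to `list->end()` and dereferencing the iterator on each step, so no explicit index or iterator variable is needed.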
------------- PR Review: https://git.openjdk.org/jdk/pull/18665#pullrequestreview-1988243308 From dholmes at openjdk.org Tue Apr 9 07:00:12 2024 From: dholmes at openjdk.org (David Holmes) Date: Tue, 9 Apr 2024 07:00:12 GMT Subject: RFR: 8329488: Move OopStorage code from safepoint cleanup and remove safepoint cleanup code [v2] In-Reply-To: References: Message-ID: On Mon, 8 Apr 2024 17:49:22 GMT, Coleen Phillimore wrote: >> This patch gives the ServiceThread a periodic wakeup (same as GuaranteedSafepointInterval) to check if it needs to clean out OopStorage blocks, and moves the triggering of this cleaning out of the safepoint cleanup tasks. Since ICBuffer, StringTable and SymbolTable rehashing have moved, there's nothing that actually triggers the nop safepoint to do cleaning (except SafepointALot), so the OopStorage cleanup won't be triggered. >> >> With moving all of these out of the safepoint cleanup tasks, we can remove the code that sets up multiple threads to do safepoint cleanup. We can also remove the JFR events and logging that times safepoint cleanup, and a logging test. >> >> Tested with tier1-4. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Some comment cleanups from Kim. Looks like a good cleanup! Do we have any data on how often and how much oopStorage needs cleaning up? Any time you go to a polling-based approach there are concerns that it may be too frequent or too infrequent. What kind of applications tend to require a lot of oopStorage cleaning? src/hotspot/share/runtime/globals.hpp line 1288: > 1286: "Wake the ServiceThread to do periodic cleanup checks" \ > 1287: "(0 means none)") \ > 1288: range(0, max_jint) \ The time unit needs to be mentioned - I assume it is ms?
------------- PR Review: https://git.openjdk.org/jdk/pull/18375#pullrequestreview-1988237148 PR Review Comment: https://git.openjdk.org/jdk/pull/18375#discussion_r1557031817 From duke at openjdk.org Tue Apr 9 07:15:09 2024 From: duke at openjdk.org (kuaiwei) Date: Tue, 9 Apr 2024 07:15:09 GMT Subject: RFR: 8325821: [REDO] use "dmb.ishst+dmb.ishld" for release barrier [v3] In-Reply-To: References: Message-ID: <5_Ov-FP292Kqwulyyt0jQClOUvvB6icCjheuYH_ZMJQ=.adc896de-57cd-4f53-b653-29a4bc049d97@github.com> On Mon, 8 Apr 2024 17:36:42 GMT, Andrew Haley wrote: > That doesn't look bad. I've made a bunch of simplifications for you to see: have a look. In general, pointer chasing is bad, so it's worth getting rid of indirections that don't help. [foo.zip](https://github.com/openjdk/jdk/files/14908996/foo.zip) Thanks for your enhancement. I see that you removed MergeableInst. My idea is that the finite state machine could support both dmb and ld/st instructions. I created another task to merge more instructions, such as 'ldrs/ldrd/strs/strd ...': https://bugs.openjdk.org/browse/JDK-8329901. The old implementation is not suitable for supporting new instructions: I would need to duplicate it to support other register types, such as float registers. So I want to use MergeableInst to extract instruction information, such as the target register, base register, offset, and so on. That's my draft plan; what do you think about it?
------------- PR Comment: https://git.openjdk.org/jdk/pull/18467#issuecomment-2044294247 From aph at openjdk.org Tue Apr 9 07:22:59 2024 From: aph at openjdk.org (Andrew Haley) Date: Tue, 9 Apr 2024 07:22:59 GMT Subject: RFR: 8325821: [REDO] use "dmb.ishst+dmb.ishld" for release barrier [v3] In-Reply-To: <5_Ov-FP292Kqwulyyt0jQClOUvvB6icCjheuYH_ZMJQ=.adc896de-57cd-4f53-b653-29a4bc049d97@github.com> References: <5_Ov-FP292Kqwulyyt0jQClOUvvB6icCjheuYH_ZMJQ=.adc896de-57cd-4f53-b653-29a4bc049d97@github.com> Message-ID: <32mjjPhg6RoDVsk2V9QBJHD9qWS_7R7efHMgLn967QE=.99b580ea-ff31-4e7b-bcda-f1756feef0c5@github.com> On Tue, 9 Apr 2024 07:12:12 GMT, kuaiwei wrote: > > That doesn't look bad. I've made a bunch of simplifications for you to see: have a look. In general, pointer chasing is bad, so it's worth getting rid of indirections that don't help. [foo.zip](https://github.com/openjdk/jdk/files/14908996/foo.zip) > > Thanks for your enhancement. I found you removed MergeableInst. My idea is the finite state machine could support both dmb and ld/st instructions. I created another task to merge more instruction like 'ldrs/ldrd/strs/strd ...' . https://bugs.openjdk.org/browse/JDK-8329901 . The old implementation is not suitable to support new instruction. I need duplicate to support other register type like float register. So I want to use MergeableInst to extract instruction information. Such as target register, base register, offseet ... . It's my draft plan, how do you think about it? I think you may have scaling problems with this. If you try to handle multiple kinds of instructions in a single state machine, it is likely to explode in size, and be (even more) confusing to the reader. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18467#issuecomment-2044305885 From stefank at openjdk.org Tue Apr 9 07:28:40 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 9 Apr 2024 07:28:40 GMT Subject: RFR: 8329629: GC interfaces should work directly against nmethod instead of CodeBlob [v2] In-Reply-To: References: Message-ID: > The GCs scan and handle nmethods and ignore CodeBlobs of other kinds. So I propose that we stop sending in CodeBlobs to the GCs and make sure to only give them nmethods. > > I removed `void CodeCache::blobs_do(CodeBlobClosure* f)` since there's no more usage of that function. Is this OK? > > I also opted to skip calling the GC verification code from the iterator code: > > Universe::heap()->verify_nmethod((nmethod*)cb); > > IMHO, I think it is up to the GCs to decide if they want to perform extra nmethod verification. If someone wants to keep this verification in their favorite GC I can add calls to this function where we used to call CodeCache::blobs_do. > > I've only done limited testing and will run extensive testing concurrently with the review. Stefan Karlsson has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains two additional commits since the last revision: - Merge remote-tracking branch 'upstream/master' into 8329629_do_code_blob - 8329629: GC interfaces should work directly against nmethod instead of CodeBlob ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18653/files - new: https://git.openjdk.org/jdk/pull/18653/files/e10683e2..1c2bdaea Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18653&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18653&range=00-01 Stats: 5545 lines in 115 files changed: 3743 ins; 1387 del; 415 mod Patch: https://git.openjdk.org/jdk/pull/18653.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18653/head:pull/18653 PR: https://git.openjdk.org/jdk/pull/18653 From stefank at openjdk.org Tue Apr 9 07:28:40 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 9 Apr 2024 07:28:40 GMT Subject: RFR: 8329629: GC interfaces should work directly against nmethod instead of CodeBlob In-Reply-To: References: Message-ID: Thanks for the reviews! I'll let GHA complete before integrating.
------------- PR Comment: https://git.openjdk.org/jdk/pull/18653#issuecomment-2044313453 From mbaesken at openjdk.org Tue Apr 9 07:31:23 2024 From: mbaesken at openjdk.org (Matthias Baesken) Date: Tue, 9 Apr 2024 07:31:23 GMT Subject: RFR: JDK-8329605: hs errfile generic events - introduce sections for Frequent/NotFrequent Events [v3] In-Reply-To: <5GN6AKI0ud3DgU7-RX2-12eu87Me8jhzKXA-L8BwR04=.384ddd36-1a8f-40ac-9387-5d8d97c37fe3@github.com> References: <5GN6AKI0ud3DgU7-RX2-12eu87Me8jhzKXA-L8BwR04=.384ddd36-1a8f-40ac-9387-5d8d97c37fe3@github.com> Message-ID: > Currently the 'generic' hs_errfile Events message log (filled by Events::log) is rather flooded by messages for memory protection operations. Those seem to occur quite often and move out other less frequent events, because the number of entries in the log is limited. > It might be better to separate the frequent and less frequent events into 2 sections. The memory protection events would go into the frequent events section. > The mentioned memory protection operations related entries look like this : > Event: 0.178 Protecting memory [0x000000016ebf0000,0x000000016ebfc000] with protection modes 0 Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: adjust typo ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18626/files - new: https://git.openjdk.org/jdk/pull/18626/files/63ccaffb..7de2f9e5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18626&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18626&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18626.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18626/head:pull/18626 PR: https://git.openjdk.org/jdk/pull/18626 From duke at openjdk.org Tue Apr 9 07:42:01 2024 From: duke at openjdk.org (kuaiwei) Date: Tue, 9 Apr 2024 07:42:01 GMT Subject: RFR: 8325821: [REDO] use "dmb.ishst+dmb.ishld" for release barrier 
[v3] In-Reply-To: <32mjjPhg6RoDVsk2V9QBJHD9qWS_7R7efHMgLn967QE=.99b580ea-ff31-4e7b-bcda-f1756feef0c5@github.com> References: <5_Ov-FP292Kqwulyyt0jQClOUvvB6icCjheuYH_ZMJQ=.adc896de-57cd-4f53-b653-29a4bc049d97@github.com> <32mjjPhg6RoDVsk2V9QBJHD9qWS_7R7efHMgLn967QE=.99b580ea-ff31-4e7b-bcda-f1756feef0c5@github.com> Message-ID: <5rPmI5zcv7NTt30UYdeMpNqtfPMb81r2NOx9IBAIZVw=.a83ab6e6-48cb-4f35-a91d-03b7fb9fd19d@github.com> On Tue, 9 Apr 2024 07:20:28 GMT, Andrew Haley wrote: > > > That doesn't look bad. I've made a bunch of simplifications for you to see: have a look. In general, pointer chasing is bad, so it's worth getting rid of indirections that don't help. [foo.zip](https://github.com/openjdk/jdk/files/14908996/foo.zip) > > > > > > Thanks for your enhancement. I found you removed MergeableInst. My idea is that the finite state machine could support both dmb and ld/st instructions. I created another task to merge more instructions like 'ldrs/ldrd/strs/strd ...': https://bugs.openjdk.org/browse/JDK-8329901. The old implementation is not suitable for supporting new instructions. I would need to duplicate it to support other register types, such as float registers. So I want to use MergeableInst to extract instruction information, such as target register, base register, offset, and so on. It's my draft plan; what do you think about it? > > I think you may have scaling problems with this. If you try to handle multiple kinds of instructions in a single state machine, it is likely to explode in size, and be (even more) confusing to the reader. I'm fine with it. So the state machine is only used for dmb. We can figure out how to simplify other instructions later. About CodeBuffer::offset(), do you mean Assembler::offset()? It's worth a try; we may remove the 'const' from its definition and its derived methods.
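To make the "state machine only for dmb" idea from the exchange above concrete, here is a minimal sketch. It is not the code from the PR: the class `BarrierMerger`, its methods, and the policy of always folding a pending load barrier plus store barrier into a single full `dmb ish` are assumptions chosen for illustration (the full barrier is stronger, so the merge is safe, but whether to merge or keep two instructions is exactly the policy question the PR is about).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Barrier kinds as a bitmask, so that merging two barriers is a bitwise OR.
enum Barrier : unsigned { None = 0, Ld = 1, St = 2, Full = 3 };

// Hypothetical emitter: consecutive barriers are held back and merged;
// any other instruction flushes the pending barrier first.
class BarrierMerger {
  unsigned _pending = None;
  std::vector<std::string> _out;

  // Leave the "pending" state by emitting the merged barrier, if any.
  void flush() {
    switch (_pending) {
      case Ld:   _out.push_back("dmb ishld"); break;
      case St:   _out.push_back("dmb ishst"); break;
      case Full: _out.push_back("dmb ish");   break;
      default:   break;  // nothing pending
    }
    _pending = None;
  }

 public:
  // A barrier request never emits directly; it only updates the state.
  void membar(Barrier b) { _pending |= b; }

  // Any non-barrier instruction ends the mergeable run.
  void emit(const std::string& insn) {
    flush();
    _out.push_back(insn);
  }

  // End of code generation: flush whatever is still pending.
  const std::vector<std::string>& finish() {
    flush();
    return _out;
  }
};
```

Because the state is a single small bitmask, adding further instruction kinds (the ld/st pairs mentioned above) would need separate state, which illustrates the scaling concern raised in the reply.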
------------- PR Comment: https://git.openjdk.org/jdk/pull/18467#issuecomment-2044340305 From aph at openjdk.org Tue Apr 9 07:58:09 2024 From: aph at openjdk.org (Andrew Haley) Date: Tue, 9 Apr 2024 07:58:09 GMT Subject: RFR: 8325821: [REDO] use "dmb.ishst+dmb.ishld" for release barrier [v3] In-Reply-To: <5rPmI5zcv7NTt30UYdeMpNqtfPMb81r2NOx9IBAIZVw=.a83ab6e6-48cb-4f35-a91d-03b7fb9fd19d@github.com> References: <5_Ov-FP292Kqwulyyt0jQClOUvvB6icCjheuYH_ZMJQ=.adc896de-57cd-4f53-b653-29a4bc049d97@github.com> <32mjjPhg6RoDVsk2V9QBJHD9qWS_7R7efHMgLn967QE=.99b580ea-ff31-4e7b-bcda-f1756feef0c5@github.com> <5rPmI5zcv7NTt30UYdeMpNqtfPMb81r2NOx9IBAIZVw=.a83ab6e6-48cb-4f35-a91d-03b7fb9fd19d@github.com> Message-ID: On Tue, 9 Apr 2024 07:39:01 GMT, kuaiwei wrote: > About CodeBuffer::offset(), do you mean Assembler::offset()? It's worth a try; we may remove the 'const' from its definition and its derived methods. I think so, yes. It's a little bit odd that simply reading an offset has a side effect, but it affects only the state machine, not any of the contents of the buffer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18467#issuecomment-2044373508 From mcimadamore at openjdk.org Tue Apr 9 08:17:02 2024 From: mcimadamore at openjdk.org (Maurizio Cimadamore) Date: Tue, 9 Apr 2024 08:17:02 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v4] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: On Fri, 5 Apr 2024 02:40:16 GMT, Dean Long wrote: > That way C2 can do all its usual optimizations, like unrolling, vectorization, and redundant store elimination (if it is an on-heap primitive array that was just allocated, then there is no need to zero the parts that are being "set"). I second that. It is something that came up quite frequently in the discussions around the FFM API.
------------- PR Comment: https://git.openjdk.org/jdk/pull/18555#issuecomment-2044409509 From gcao at openjdk.org Tue Apr 9 09:36:10 2024 From: gcao at openjdk.org (Gui Cao) Date: Tue, 9 Apr 2024 09:36:10 GMT Subject: RFR: 8329823: RISC-V: Need to sync CPU features with related JVM flags In-Reply-To: References: <7CndD_6EjSJlGUiazMobAPHj2ZOTnMZlQFUDOwv7pKw=.708443c8-9218-4ca1-ae66-e00eb6d8dc53@github.com> Message-ID: On Mon, 8 Apr 2024 07:55:00 GMT, Fei Yang wrote: >> Hi, As described by [8329823](https://bugs.openjdk.org/browse/JDK-8329823), currently, "features" string is not accurate in that the RISC-V CPU features/extensions which are disabled by user on the command are still added. We need to synchronize these features with related JVM flags so that "features" string can reflect actual usable CPU features. >> >> >> ### Testing >> - [x] Run tier1 tests on SOPHON SG2042 (release) >> >> Results without specifying any jvm flags(After applying this patch) >> >> >> $ /home/zifeihan/jtreg/bin/jtreg -jdk:/home/zifeihan/jre/jdk /home/zifeihan/jdk/test/lib-test/jdk/test/whitebox/CPUInfoTest.java >> >> ----------System.out:(4/178)---------- >> WB.getCPUFeatures(): "rv64 i m a f d c v zba zbb zbs zvkn" >> CPUInfo.getAdditionalCPUInfo(): "" >> CPUInfo.getFeatures(): [rv64, i, m, a, f, d, c, v, zba, zbb, zbs, zvkn] >> TEST PASSED >> >> >> Results with specifying `-XX:-UseZba`(After applying this patch) >> >> >> $ /home/zifeihan/jtreg/bin/jtreg -javaoption:-XX:-UseZba -jdk:/home/zifeihan/jre/jdk /home/zifeihan/jdk/test/lib-test/jdk/test/whitebox/CPUInfoTest.java >> >> ----------System.out:(4/158)---------- >> ----------System.out:(4/169)---------- >> WB.getCPUFeatures(): "rv64 i m a f d c v zbb zbs zvkn" >> CPUInfo.getAdditionalCPUInfo(): "" >> CPUInfo.getFeatures(): [rv64, i, m, a, f, d, c, v, zbb, zbs, zvkn] >> TEST PASSED >> >> >> Results with specifying `-XX:+UseZba`(After applying this patch) >> >> >> $ /home/zifeihan/jtreg/bin/jtreg -javaoption:-XX:+UseZba 
-jdk:/home/zifeihan/jre/jdk /home/zifeihan/jdk/test/lib-test/jdk/test/whitebox/CPUInfoTest.java >> >> ----------System.out:(4/178)---------- >> WB.getCPUFeatures(): "rv64 i m a f d c v zba zbb zbs zvkn" >> CPUInfo.getAdditionalCPUInfo(): "" >> CPUInfo.getFeatures(): [rv64, i, m, a, f, d, c, v, zba, zbb, zbs, zvkn] >> TEST PASSED > > Looks reasonable to me. @RealFYang @robehn : Thanks for the review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18668#issuecomment-2044563705 From iwalulya at openjdk.org Tue Apr 9 10:39:11 2024 From: iwalulya at openjdk.org (Ivan Walulya) Date: Tue, 9 Apr 2024 10:39:11 GMT Subject: RFR: 8329603: G1: Merge G1BlockOffsetTablePart into G1BlockOffsetTable [v4] In-Reply-To: References: Message-ID: On Mon, 8 Apr 2024 16:14:17 GMT, Guoxiong Li wrote: >> Hi all, >> >> This patch merges `G1BlockOffsetTablePart` into `G1BlockOffsetTable`. The previous fields `_reserved` and `_offset_base` of `G1BlockOffsetTable` are marked as `static` so that they can be shared by BOTs of all the heap regions. >> >> The tests `make test-tier1_gc` passed locally. Thanks for taking the time to review. >> >> Best Regards, >> -- Guoxiong > > Guoxiong Li has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: > > - Merge branch 'master' into G1BlockOffsetTable > - Move assert. > - Remove unnecessary comments. > - Fix indentation issue. > - Use a simple/unified BOT. > - JDK-8329603 Marked as reviewed by iwalulya (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/18634#pullrequestreview-1988704454 From mdoerr at openjdk.org Tue Apr 9 10:41:04 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 9 Apr 2024 10:41:04 GMT Subject: RFR: 8310513: [s390x] Intrinsify recursive ObjectMonitor locking [v2] In-Reply-To: References: Message-ID: On Mon, 8 Apr 2024 16:28:38 GMT, Amit Kumar wrote: >> s390 implementation of [JDK-8277180](https://bugs.openjdk.org/browse/JDK-8277180). PPC implementation for the same: https://github.com/openjdk/jdk/pull/7305 >> >> I had tested `tier1` on `fastdebug`, `release` vm. >> >> BenchMarking: >> >> >> ./build/linux-s390x-server-release/jdk/bin/java -Xms4g -Xmx4g -jar dacapo-9.12-MR1-bach.jar h2 -s huge -t 1 -n 1 >> >> without patch: >> ===== DaCapo 9.12-MR1 h2 PASSED in 223023 msec ===== >> ===== DaCapo 9.12-MR1 h2 PASSED in 225686 msec ===== >> ===== DaCapo 9.12-MR1 h2 PASSED in 219824 msec ===== >> ===== DaCapo 9.12-MR1 h2 PASSED in 226719 msec ===== >> >> >> >> with patch: >> ===== DaCapo 9.12-MR1 h2 PASSED in 167816 msec ===== >> ===== DaCapo 9.12-MR1 h2 PASSED in 174368 msec ===== >> ===== DaCapo 9.12-MR1 h2 PASSED in 170517 msec ===== >> ===== DaCapo 9.12-MR1 h2 PASSED in 169349 msec ===== > > Amit Kumar has updated the pull request incrementally with one additional commit since the last revision: > > suggestion from Lutz I couldn't spot any bugs. src/hotspot/cpu/s390/macroAssembler_s390.cpp line 3274: > 3272: // Otherwise, register zero is filled with the current owner. > 3273: z_lghi(zero, 0); > 3274: z_csg(zero, Z_thread, OM_OFFSET_NO_MONITOR_VALUE_TAG(owner), monitor_tagged); May be a bit confusing that `zero` contains the owner, but I can live with it. src/hotspot/cpu/s390/macroAssembler_s390.cpp line 3289: > 3287: // Current thread already owns the lock. Just increment recursion count. 
> 3288: z_agsi(Address(monitor_tagged, OM_OFFSET_NO_MONITOR_VALUE_TAG(recursions)), 1ll); > 3289: z_cgr(zero, zero); // restore CC `// set the CC to EQUAL` would be better, but ok. ------------- Marked as reviewed by mdoerr (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17975#pullrequestreview-1988706162 PR Review Comment: https://git.openjdk.org/jdk/pull/17975#discussion_r1557414344 PR Review Comment: https://git.openjdk.org/jdk/pull/17975#discussion_r1557414941 From gcao at openjdk.org Tue Apr 9 10:44:14 2024 From: gcao at openjdk.org (Gui Cao) Date: Tue, 9 Apr 2024 10:44:14 GMT Subject: Integrated: 8329823: RISC-V: Need to sync CPU features with related JVM flags In-Reply-To: <7CndD_6EjSJlGUiazMobAPHj2ZOTnMZlQFUDOwv7pKw=.708443c8-9218-4ca1-ae66-e00eb6d8dc53@github.com> References: <7CndD_6EjSJlGUiazMobAPHj2ZOTnMZlQFUDOwv7pKw=.708443c8-9218-4ca1-ae66-e00eb6d8dc53@github.com> Message-ID: On Sun, 7 Apr 2024 08:53:57 GMT, Gui Cao wrote: > Hi, As described by [8329823](https://bugs.openjdk.org/browse/JDK-8329823), currently, "features" string is not accurate in that the RISC-V CPU features/extensions which are disabled by user on the command are still added. We need to synchronize these features with related JVM flags so that "features" string can reflect actual usable CPU features. 
> > > ### Testing > - [x] Run tier1 tests on SOPHON SG2042 (release) > > Results without specifying any jvm flags(After applying this patch) > > > $ /home/zifeihan/jtreg/bin/jtreg -jdk:/home/zifeihan/jre/jdk /home/zifeihan/jdk/test/lib-test/jdk/test/whitebox/CPUInfoTest.java > > ----------System.out:(4/178)---------- > WB.getCPUFeatures(): "rv64 i m a f d c v zba zbb zbs zvkn" > CPUInfo.getAdditionalCPUInfo(): "" > CPUInfo.getFeatures(): [rv64, i, m, a, f, d, c, v, zba, zbb, zbs, zvkn] > TEST PASSED > > > Results with specifying `-XX:-UseZba`(After applying this patch) > > > $ /home/zifeihan/jtreg/bin/jtreg -javaoption:-XX:-UseZba -jdk:/home/zifeihan/jre/jdk /home/zifeihan/jdk/test/lib-test/jdk/test/whitebox/CPUInfoTest.java > > ----------System.out:(4/158)---------- > ----------System.out:(4/169)---------- > WB.getCPUFeatures(): "rv64 i m a f d c v zbb zbs zvkn" > CPUInfo.getAdditionalCPUInfo(): "" > CPUInfo.getFeatures(): [rv64, i, m, a, f, d, c, v, zbb, zbs, zvkn] > TEST PASSED > > > Results with specifying `-XX:+UseZba`(After applying this patch) > > > $ /home/zifeihan/jtreg/bin/jtreg -javaoption:-XX:+UseZba -jdk:/home/zifeihan/jre/jdk /home/zifeihan/jdk/test/lib-test/jdk/test/whitebox/CPUInfoTest.java > > ----------System.out:(4/178)---------- > WB.getCPUFeatures(): "rv64 i m a f d c v zba zbb zbs zvkn" > CPUInfo.getAdditionalCPUInfo(): "" > CPUInfo.getFeatures(): [rv64, i, m, a, f, d, c, v, zba, zbb, zbs, zvkn] > TEST PASSED This pull request has now been integrated. 
Changeset: b9331cd2 Author: Gui Cao Committer: Fei Yang URL: https://git.openjdk.org/jdk/commit/b9331cd25ca88b07ce079405f5e3031cf8c13ea6 Stats: 29 lines in 2 files changed: 18 ins; 2 del; 9 mod 8329823: RISC-V: Need to sync CPU features with related JVM flags Reviewed-by: fyang, rehn ------------- PR: https://git.openjdk.org/jdk/pull/18668 From fbredberg at openjdk.org Tue Apr 9 11:22:11 2024 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Tue, 9 Apr 2024 11:22:11 GMT Subject: RFR: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 [v4] In-Reply-To: References: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> Message-ID: On Tue, 9 Apr 2024 00:58:24 GMT, David Holmes wrote: >> The crux of the problem here is that the virtual thread code was not keeping the held-monitor-count and jni-monitor-count in sync under all conditions. So if a vthread acquired a monitor via JNI but failed to unlock it before terminating, the underlying platform thread's counts were out of sync and if it terminated we would trigger the assertion that checks for such things. >> >> The actual fix is very simple: we zero the platform thread's jni-monitor-count in `continuation_enter_cleanup` the same way we zero the held-monitor-count. In addition we apply the same `CheckJNICalls` check for this unbalanced locking and issue a warning in the virtual thread case. The fact that this happens in asm code complicates matters. >> >> The existing `JNIMonitor.java` test is greatly expanded to test these scenarios and check the unified logging output. >> >> Other minor changes involve expanding some of the other assertions relating to the two counts so we can detect a mismatch earlier without a need for the thread to terminate. And the test that originally uncovered the problem (`GetOwnedMonitorInfoTest.java`) has some minor adjustments to enhance diagnostics.
>> >> I've provided the fix for all architectures that support continuations: x64, aarch64, riscv and ppc. The latter two both build okay in GHA but I can't actually test them with the updated test. So some assistance from RISCV folk (@robehn ?) and PPC folk (??) would be appreciated (otherwise any issues will have to be handled as follow-up fixes). >> >> The changes are structured so that there is no extra code executed in product builds unless `CheckJNICalls` is set. This means that product builds will not keep the JNI count in sync with the held count, unless `CheckJNICalls` is set. This could trip up a future logging entry or explicit check of the JNI count, but it is expected that these counts will be removed once ObjectMonitor usage will not force virtual thread pinning. >> >> Testing: >> - regression test 10x on all x64 and aarch64 platforms >> - tiers 1-4 >> - GHA >> >> >> Thanks to @pchilano for help working out the best form of the fix and the initial asm for x64. >> >> Thanks to @fbredber for the Aarch64 and RISCV asm code. >> >> Thanks > > David Holmes has updated the pull request incrementally with one additional commit since the last revision: > > Avoid unnecessary store when count was already zero. I've done basic smoke testing on PowerPC using QEMU. `JAVA_OPTIONS=-XX:+CheckJNICalls TEST=test/hotspot/jtreg/runtime/vthread/JNIMonitor/JNIMonitor.java` passes ok. But it would be nice if @TheRealMDoerr or @reinrich could take it for a spin on real hardware.
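As a reading aid for the fix described in the quoted PR text, the bookkeeping can be modelled roughly as below. This is a toy model, not the HotSpot code (the real change is in per-architecture assembly): `ThreadCounts` and `continuation_cleanup` are hypothetical names, and the meaning given to the two counters is an assumption based on the description in this thread.

```cpp
#include <cassert>

// Toy model of the two per-thread counters discussed in the PR description.
struct ThreadCounts {
  long held_monitor_count = 0;  // assumed: every monitor the thread holds
  long jni_monitor_count  = 0;  // assumed: the ones acquired via JNI MonitorEnter

  void jni_monitor_enter() { ++held_monitor_count; ++jni_monitor_count; }
  void jni_monitor_exit()  { --held_monitor_count; --jni_monitor_count; }
};

// Hypothetical stand-in for the cleanup run when a virtual thread's
// continuation finishes on its carrier (platform) thread.
// Returns true if the vthread left a JNI-acquired monitor locked, which is
// the situation a CheckJNICalls-style warning would report.
inline bool continuation_cleanup(ThreadCounts& carrier) {
  const bool unbalanced = carrier.jni_monitor_count != 0;
  carrier.held_monitor_count = 0;
  if (unbalanced) {
    // Zeroing this counter as well is the step the fix adds; skipping the
    // store when it is already zero mirrors the latest commit's intent.
    carrier.jni_monitor_count = 0;
  }
  return unbalanced;
}
```

After the cleanup both counters agree again, so a later "held monitor count should be equal to jni" style check on the carrier thread would no longer fire.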
------------- PR Comment: https://git.openjdk.org/jdk/pull/18445#issuecomment-2044790082 From rrich at openjdk.org Tue Apr 9 11:34:00 2024 From: rrich at openjdk.org (Richard Reingruber) Date: Tue, 9 Apr 2024 11:34:00 GMT Subject: RFR: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 [v4] In-Reply-To: References: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> Message-ID: On Tue, 9 Apr 2024 11:19:03 GMT, Fredrik Bredberg wrote: > I've done basic smoke testing on PowerPC using QEMU. `JAVA_OPTIONS=-XX:+CheckJNICalls TEST=test/hotspot/jtreg/runtime/vthread/JNIMonitor/JNIMonitor.java ` passes ok. But it would be nice if @TheRealMDoerr or @reinrich could take it for a spin on real hardware. Thanks for the pin. We will do that. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18445#issuecomment-2044843569 From tschatzl at openjdk.org Tue Apr 9 12:07:10 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 9 Apr 2024 12:07:10 GMT Subject: RFR: 8329878: Reduce public interface of CardTableBarrierSet In-Reply-To: <-xZL6Kb2xdaMAjrWQR5pJPaxuNtu-XKuIKUcP0NWCpw=.71c465ec-5765-4c19-9566-f544104c4273@github.com> References: <-xZL6Kb2xdaMAjrWQR5pJPaxuNtu-XKuIKUcP0NWCpw=.71c465ec-5765-4c19-9566-f544104c4273@github.com> Message-ID: On Mon, 8 Apr 2024 18:40:24 GMT, Albert Mingkun Yang wrote: > Trivial moving typedef from public to protected. lgtm and trivial. ------------- Marked as reviewed by tschatzl (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18679#pullrequestreview-1988863078 From ayang at openjdk.org Tue Apr 9 12:31:14 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 9 Apr 2024 12:31:14 GMT Subject: RFR: 8329878: Reduce public interface of CardTableBarrierSet In-Reply-To: <-xZL6Kb2xdaMAjrWQR5pJPaxuNtu-XKuIKUcP0NWCpw=.71c465ec-5765-4c19-9566-f544104c4273@github.com> References: <-xZL6Kb2xdaMAjrWQR5pJPaxuNtu-XKuIKUcP0NWCpw=.71c465ec-5765-4c19-9566-f544104c4273@github.com> Message-ID: On Mon, 8 Apr 2024 18:40:24 GMT, Albert Mingkun Yang wrote: > Trivial moving typedef from public to protected. Thanks for review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18679#issuecomment-2045060570 From ayang at openjdk.org Tue Apr 9 12:31:14 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 9 Apr 2024 12:31:14 GMT Subject: Integrated: 8329878: Reduce public interface of CardTableBarrierSet In-Reply-To: <-xZL6Kb2xdaMAjrWQR5pJPaxuNtu-XKuIKUcP0NWCpw=.71c465ec-5765-4c19-9566-f544104c4273@github.com> References: <-xZL6Kb2xdaMAjrWQR5pJPaxuNtu-XKuIKUcP0NWCpw=.71c465ec-5765-4c19-9566-f544104c4273@github.com> Message-ID: On Mon, 8 Apr 2024 18:40:24 GMT, Albert Mingkun Yang wrote: > Trivial moving typedef from public to protected. This pull request has now been integrated. 
Changeset: 5ea21c3a Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/5ea21c3a61a7a159d1b88885368741763f42bf04 Stats: 4 lines in 1 file changed: 1 ins; 3 del; 0 mod 8329878: Reduce public interface of CardTableBarrierSet Reviewed-by: tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/18679 From stefank at openjdk.org Tue Apr 9 12:31:21 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 9 Apr 2024 12:31:21 GMT Subject: Integrated: 8329629: GC interfaces should work directly against nmethod instead of CodeBlob In-Reply-To: References: Message-ID: On Fri, 5 Apr 2024 12:32:30 GMT, Stefan Karlsson wrote: > The GCs scan and handle nmethods and ignore CodeBlobs of other kinds. I propose that we stop sending in CodeBlobs to the GCs and make sure to only give them nmethods. > > I removed `void CodeCache::blobs_do(CodeBlobClosure* f)` since there's no more usage of that function. Is this OK? > > I also opted to skip calling the GC verification code from the iterator code: > > Universe::heap()->verify_nmethod((nmethod*)cb); > > IMHO, I think it is up to the GCs to decide if they want to perform extra nmethod verification. If someone wants to keep this verification in their favorite GC I can add calls to this function where we used to call CodeCache::blobs_do. > > I've only done limited testing and will run extensive testing concurrently with the review. This pull request has now been integrated.
Changeset: 87131fb2 Author: Stefan Karlsson URL: https://git.openjdk.org/jdk/commit/87131fb2f77188a483fd0852da5f9228aafd5336 Stats: 850 lines in 74 files changed: 238 ins; 318 del; 294 mod 8329629: GC interfaces should work directly against nmethod instead of CodeBlob Reviewed-by: ayang, eosterlund ------------- PR: https://git.openjdk.org/jdk/pull/18653 From stefank at openjdk.org Tue Apr 9 12:34:11 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 9 Apr 2024 12:34:11 GMT Subject: RFR: 8329750: Change Universe functions to return more specific Klass* types [v3] In-Reply-To: References: Message-ID: On Mon, 8 Apr 2024 13:46:32 GMT, Stefan Karlsson wrote: >> We have various functions in Universe that return Klass* where they could be returning TypeArrayKlass* and ObjArrayKlass* instead. If we change these functions we could get rid of some casts in the code. Does this seem like a reasonable change? > > Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: > > Revert "Update src/hotspot/share/classfile/systemDictionary.cpp > " > > This reverts commit d36f650dc3bf9729cd8bd138d23bef3dfdb8e4d2. Thanks for the reviews! Dean, I reverted the suggestion to go with the typed TypeArrayKlass given that it had no visible effects on inlining. If you still want it I can fix it in a separate commit.
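The motivation in the quoted description (returning the precise Klass subtype lets callers drop casts) can be illustrated with a small sketch. Everything here is a simplified assumption: the toy `Klass` and `TypeArrayKlass` types, the member `element_size()`, and the accessor names `int_array_klass_untyped()` and `int_array_klass()` are hypothetical and do not reproduce the real `Universe` interface.

```cpp
#include <cassert>

// Toy class hierarchy; only the names echo the HotSpot types.
struct Klass {
  virtual ~Klass() = default;
};

struct TypeArrayKlass : Klass {
  // Arbitrary member that exists only on the subclass.
  int element_size() const { return 4; }
};

struct Universe {
  static TypeArrayKlass _int_array;

  // Before: the signature hides the precise type, so callers must cast.
  static Klass* int_array_klass_untyped() { return &_int_array; }

  // After: the precise type is part of the signature; no cast at the call site.
  static TypeArrayKlass* int_array_klass() { return &_int_array; }
};

TypeArrayKlass Universe::_int_array;
```

A caller of the untyped accessor has to write `static_cast<TypeArrayKlass*>(Universe::int_array_klass_untyped())->element_size()`, while the typed accessor allows `Universe::int_array_klass()->element_size()` directly; both yield the same object, only the static type differs.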
------------- PR Comment: https://git.openjdk.org/jdk/pull/18652#issuecomment-2045075360 From stefank at openjdk.org Tue Apr 9 12:34:12 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 9 Apr 2024 12:34:12 GMT Subject: Integrated: 8329750: Change Universe functions to return more specific Klass* types In-Reply-To: References: Message-ID: <3vXkW9dJGWq9yHwYBOV4pfnvWAVoR0M2ClYiOYb6oUY=.a1dfe733-b120-475a-a22d-dcb1341ab653@github.com> On Fri, 5 Apr 2024 11:56:11 GMT, Stefan Karlsson wrote: > We have various functions in Universe that return Klass* where they could be returning TypeArrayKlass* and ObjArrayKlass* instead. If we change these functions we could get rid of some casts in the code. Does this seem like a reasonable change? This pull request has now been integrated. Changeset: 492b954f Author: Stefan Karlsson URL: https://git.openjdk.org/jdk/commit/492b954f81f75cedec50fabc4e6071cabb53acc0 Stats: 43 lines in 6 files changed: 2 ins; 7 del; 34 mod 8329750: Change Universe functions to return more specific Klass* types Reviewed-by: coleenp ------------- PR: https://git.openjdk.org/jdk/pull/18652 From amitkumar at openjdk.org Tue Apr 9 13:14:27 2024 From: amitkumar at openjdk.org (Amit Kumar) Date: Tue, 9 Apr 2024 13:14:27 GMT Subject: RFR: 8310513: [s390x] Intrinsify recursive ObjectMonitor locking [v3] In-Reply-To: References: Message-ID: > s390 implementation of [JDK-8277180](https://bugs.openjdk.org/browse/JDK-8277180). PPC implementation for the same: https://github.com/openjdk/jdk/pull/7305 > > I had tested `tier1` on `fastdebug`, `release` vm.
> > BenchMarking: > > > ./build/linux-s390x-server-release/jdk/bin/java -Xms4g -Xmx4g -jar dacapo-9.12-MR1-bach.jar h2 -s huge -t 1 -n 1 > > without patch: > ===== DaCapo 9.12-MR1 h2 PASSED in 223023 msec ===== > ===== DaCapo 9.12-MR1 h2 PASSED in 225686 msec ===== > ===== DaCapo 9.12-MR1 h2 PASSED in 219824 msec ===== > ===== DaCapo 9.12-MR1 h2 PASSED in 226719 msec ===== > > > > with patch: > ===== DaCapo 9.12-MR1 h2 PASSED in 167816 msec ===== > ===== DaCapo 9.12-MR1 h2 PASSED in 174368 msec ===== > ===== DaCapo 9.12-MR1 h2 PASSED in 170517 msec ===== > ===== DaCapo 9.12-MR1 h2 PASSED in 169349 msec ===== Amit Kumar has updated the pull request incrementally with one additional commit since the last revision: updates the comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17975/files - new: https://git.openjdk.org/jdk/pull/17975/files/9f48baa9..af3334e6 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17975&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17975&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17975.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17975/head:pull/17975 PR: https://git.openjdk.org/jdk/pull/17975 From amitkumar at openjdk.org Tue Apr 9 13:14:27 2024 From: amitkumar at openjdk.org (Amit Kumar) Date: Tue, 9 Apr 2024 13:14:27 GMT Subject: RFR: 8310513: [s390x] Intrinsify recursive ObjectMonitor locking [v2] In-Reply-To: References: Message-ID: On Tue, 9 Apr 2024 10:37:36 GMT, Martin Doerr wrote: >> Amit Kumar has updated the pull request incrementally with one additional commit since the last revision: >> >> suggestion from Lutz > > src/hotspot/cpu/s390/macroAssembler_s390.cpp line 3274: > >> 3272: // Otherwise, register zero is filled with the current owner. 
>> 3273: z_lghi(zero, 0); >> 3274: z_csg(zero, Z_thread, OM_OFFSET_NO_MONITOR_VALUE_TAG(owner), monitor_tagged); > > May be a bit confusing that `zero` contains the owner, but I can live with it. I have updated the comment, but I don't have better name in mind as of now. `zero_or_owner` might make it worse. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17975#discussion_r1557607131 From gli at openjdk.org Tue Apr 9 13:25:14 2024 From: gli at openjdk.org (Guoxiong Li) Date: Tue, 9 Apr 2024 13:25:14 GMT Subject: RFR: 8329603: G1: Merge G1BlockOffsetTablePart into G1BlockOffsetTable [v3] In-Reply-To: References: Message-ID: <2BMCz6Koms0qiSQTIYKMwcXCUhxjvjieCDMqg4aKWC8=.4a9e0574-70fd-47aa-9935-16a7b4995a43@github.com> On Fri, 5 Apr 2024 10:11:18 GMT, Albert Mingkun Yang wrote: >> Guoxiong Li has updated the pull request incrementally with two additional commits since the last revision: >> >> - Remove unnecessary comments. >> - Fix indentation issue. > > Marked as reviewed by ayang (Reviewer). @albertnetymk @walulyai Thanks for your reviews. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18634#issuecomment-2045167755 From gli at openjdk.org Tue Apr 9 13:25:14 2024 From: gli at openjdk.org (Guoxiong Li) Date: Tue, 9 Apr 2024 13:25:14 GMT Subject: Integrated: 8329603: G1: Merge G1BlockOffsetTablePart into G1BlockOffsetTable In-Reply-To: References: Message-ID: On Thu, 4 Apr 2024 16:55:49 GMT, Guoxiong Li wrote: > Hi all, > > This patch merges `G1BlockOffsetTablePart` into `G1BlockOffsetTable`. The previous fields `_reserved` and `_offset_base` of `G1BlockOffsetTable` are marked as `static` so that they can be shared by BOTs of all the heap regions. > > The tests `make test-tier1_gc` passed locally. Thanks for taking the time to review. > > Best Regards, > -- Guoxiong This pull request has now been integrated. 
Changeset: 5fb5e6c8 Author: Guoxiong Li URL: https://git.openjdk.org/jdk/commit/5fb5e6c8f04e325cbb782431d51251edde4c2618 Stats: 158 lines in 9 files changed: 18 ins; 77 del; 63 mod 8329603: G1: Merge G1BlockOffsetTablePart into G1BlockOffsetTable Reviewed-by: ayang, iwalulya ------------- PR: https://git.openjdk.org/jdk/pull/18634 From jsjolen at openjdk.org Tue Apr 9 13:30:26 2024 From: jsjolen at openjdk.org (Johan Sjölen) Date: Tue, 9 Apr 2024 13:30:26 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v28] In-Reply-To: References: Message-ID: > Hi, > > This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix.
> > ## `MemoryFileTracker` > > The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: > > ```c++ > static MemoryFile* make_device(const char* descriptive_name); > static void free_device(MemoryFile* device); > > static void allocate_memory(MemoryFile* device, size_t offset, size_t size, > MEMFLAGS flag, const NativeCallStack& stack); > static void free_memory(MemoryFile* device, size_t offset, size_t size); > ``` > > It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: > > ```c++ > void ZNMT::reserve(zaddress_unsafe start, size_t size) { > MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); > } > void ZNMT::commit(zoffset offset, size_t size) { > MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC); > } > void ZNMT::uncommit(zoffset offset, size_t size) { > MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); > } > > void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { > // NMT doesn't track mappings at the moment. > } > void ZNMT::unmap(zaddress_unsafe addr, size_t size) { > // NMT doesn't track mappings at the moment. > } > ``` > > As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. > > This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: > > 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it.
Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this... Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: Fix faulty refactoring ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18289/files - new: https://git.openjdk.org/jdk/pull/18289/files/ec6d2788..42d58f7f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=27 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=26-27 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/18289.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289 PR: https://git.openjdk.org/jdk/pull/18289 From jsjolen at openjdk.org Tue Apr 9 13:34:25 2024 From: jsjolen at openjdk.org (Johan Sjölen) Date: Tue, 9 Apr 2024 13:34:25 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v29] In-Reply-To: References: Message-ID: > Hi, > > This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix.
> > ## `MemoryFileTracker` > > The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: > > ```c++ > static MemoryFile* make_device(const char* descriptive_name); > static void free_device(MemoryFile* device); > > static void allocate_memory(MemoryFile* device, size_t offset, size_t size, > MEMFLAGS flag, const NativeCallStack& stack); > static void free_memory(MemoryFile* device, size_t offset, size_t size); > ``` > > It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: > > ```c++ > void ZNMT::reserve(zaddress_unsafe start, size_t size) { > MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); > } > void ZNMT::commit(zoffset offset, size_t size) { > MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC); > } > void ZNMT::uncommit(zoffset offset, size_t size) { > MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); > } > > void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { > // NMT doesn't track mappings at the moment. > } > void ZNMT::unmap(zaddress_unsafe addr, size_t size) { > // NMT doesn't track mappings at the moment. > } > ``` > > As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. > > This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: > > 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it.
Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this... Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: Style and copyright fix ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18289/files - new: https://git.openjdk.org/jdk/pull/18289/files/42d58f7f..1f0e0265 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=28 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=27-28 Stats: 4 lines in 2 files changed: 2 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/18289.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289 PR: https://git.openjdk.org/jdk/pull/18289 From stuefe at openjdk.org Tue Apr 9 13:44:05 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 9 Apr 2024 13:44:05 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v29] In-Reply-To: References: Message-ID: On Tue, 9 Apr 2024 13:34:25 GMT, Johan Sjölen wrote: >> Hi, >> >> This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix.
>> >> ## `MemoryFileTracker` >> >> The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: >> >> ```c++ >> static MemoryFile* make_device(const char* descriptive_name); >> static void free_device(MemoryFile* device); >> >> static void allocate_memory(MemoryFile* device, size_t offset, size_t size, >> MEMFLAGS flag, const NativeCallStack& stack); >> static void free_memory(MemoryFile* device, size_t offset, size_t size); >> ``` >> >> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: >> >> ```c++ >> void ZNMT::reserve(zaddress_unsafe start, size_t size) { >> MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); >> } >> void ZNMT::commit(zoffset offset, size_t size) { >> MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC); >> } >> void ZNMT::uncommit(zoffset offset, size_t size) { >> MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); >> } >> >> void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { >> // NMT doesn't track mappings at the moment. >> } >> void ZNMT::unmap(zaddress_unsafe addr, size_t size) { >> // NMT doesn't track mappings at the moment. >> } >> ``` >> >> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. >> >> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: >> >> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it.
Our Treap-based approach in this patch gives a performance bo... > > Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: > > Style and copyright fix > Right, the refactoring to remove the `friend` declaration has completely fumbled the code. I'll probably force a revert on this to the state before that or do a git bisect to find the bugs. Right now the code is basically borked. > > Last good hash: [7445999](https://github.com/openjdk/jdk/commit/7445999ee296872320f91146e1004026ba1133c7) God, sorry. Do as you think is best. I plan to look at this PR, but probably it will not be this week. Love your commit messages btw. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18289#issuecomment-2045208913 From ayang at openjdk.org Tue Apr 9 13:50:35 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 9 Apr 2024 13:50:35 GMT Subject: RFR: 8329962: Remove CardTable::invalidate Message-ID: Simply converting a redundant if-check to an assert. ------------- Commit messages: - cardtable-remove-api Changes: https://git.openjdk.org/jdk/pull/18696/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18696&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8329962 Stats: 13 lines in 3 files changed: 1 ins; 11 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18696.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18696/head:pull/18696 PR: https://git.openjdk.org/jdk/pull/18696 From coleenp at openjdk.org Tue Apr 9 13:56:28 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 9 Apr 2024 13:56:28 GMT Subject: RFR: 8329488: Move OopStorage code from safepoint cleanup and remove safepoint cleanup code [v3] In-Reply-To: References: Message-ID: > This patch gives the ServiceThread a periodic wakeup (same as GuaranteedSafepointInterval) to check if it needs to clean out OopStorage blocks, and move the triggering of this cleaning out of the safepoint cleanup tasks.
Since ICBuffer, StringTable and SymbolTable rehashing have moved, there's nothing that actually triggers the nop safepoint to do cleaning (except SafepointALot), so the OopStorage cleanup won't be triggered. > > With moving all of these out of the safepoint cleanup tasks, we can remove the code that sets up multiple threads to do safepoint cleanup. We can also remove the JFR events and logging that times safepoint cleanup, and a logging test. > > Tested with tier1-4. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Specify meaning of ServiceThreadCleanupInterval. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18375/files - new: https://git.openjdk.org/jdk/pull/18375/files/3ab66efa..44df061f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18375&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18375&range=01-02 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/18375.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18375/head:pull/18375 PR: https://git.openjdk.org/jdk/pull/18375 From coleenp at openjdk.org Tue Apr 9 13:56:28 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 9 Apr 2024 13:56:28 GMT Subject: RFR: 8329488: Move OopStorage code from safepoint cleanup and remove safepoint cleanup code [v2] In-Reply-To: References: Message-ID: On Tue, 9 Apr 2024 06:53:10 GMT, David Holmes wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Some comment cleanups from Kim. > > src/hotspot/share/runtime/globals.hpp line 1288: > >> 1286: "Wake the ServiceThread to do periodic cleanup checks" \ >> 1287: "(0 means none)") \ >> 1288: range(0, max_jint) \ > > The time unit needs to be mentioned - I assume it is ms? Yes it is ms, like guaranteed safepoint interval, actually PlatformMonitor.lock units, which is ms. 
I wonder if this is a useful diagnostic option though. Maybe we shouldn't have it? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18375#discussion_r1557670555 From stefank at openjdk.org Tue Apr 9 14:07:14 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 9 Apr 2024 14:07:14 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v29] In-Reply-To: References: Message-ID: On Tue, 9 Apr 2024 13:34:25 GMT, Johan Sjölen wrote: >> Hi, >> >> This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix. >> >> ## `MemoryFileTracker` >> >> The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: >> >> ```c++ >> static MemoryFile* make_device(const char* descriptive_name); >> static void free_device(MemoryFile* device); >> >> static void allocate_memory(MemoryFile* device, size_t offset, size_t size, >> MEMFLAGS flag, const NativeCallStack& stack); >> static void free_memory(MemoryFile* device, size_t offset, size_t size); >> ``` >> >> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: >> >> ```c++ >> void ZNMT::reserve(zaddress_unsafe start, size_t size) { >> MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); >> } >> void ZNMT::commit(zoffset offset, size_t size) { >> MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC); >> } >> void ZNMT::uncommit(zoffset offset, size_t size) { >> MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); >> } >> >> void ZNMT::map(zaddress_unsafe addr,
size_t size, zoffset offset) { >> // NMT doesn't track mappings at the moment. >> } >> void ZNMT::unmap(zaddress_unsafe addr, size_t size) { >> // NMT doesn't track mappings at the moment. >> } >> ``` >> >> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. >> >> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: >> >> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. Our Treap-based approach in this patch gives a performance bo... > > Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: > > Style and copyright fix The ZGC changes look neat and clean. ------------- Marked as reviewed by stefank (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18289#pullrequestreview-1989143981 From jsjolen at openjdk.org Tue Apr 9 14:37:26 2024 From: jsjolen at openjdk.org (Johan Sjölen) Date: Tue, 9 Apr 2024 14:37:26 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v30] In-Reply-To: References: Message-ID: <-XBpgvDwey1FM2VUiDnM-c3CnCC-9L0-OQzb1aviyjQ=.b0635811-d6ba-4d2a-8b7f-bbc59ddb4cf4@github.com> > Hi, > > This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT.
This is the situation that ZGC is in, and that is what this patch attempts to fix. > > ## `MemoryFileTracker` > > The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: > > ```c++ > static MemoryFile* make_device(const char* descriptive_name); > static void free_device(MemoryFile* device); > > static void allocate_memory(MemoryFile* device, size_t offset, size_t size, > MEMFLAGS flag, const NativeCallStack& stack); > static void free_memory(MemoryFile* device, size_t offset, size_t size); > ``` > > It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: > > ```c++ > void ZNMT::reserve(zaddress_unsafe start, size_t size) { > MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); > } > void ZNMT::commit(zoffset offset, size_t size) { > MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC); > } > void ZNMT::uncommit(zoffset offset, size_t size) { > MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); > } > > void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { > // NMT doesn't track mappings at the moment. > } > void ZNMT::unmap(zaddress_unsafe addr, size_t size) { > // NMT doesn't track mappings at the moment. > } > ``` > > As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. > > This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: > > 1. Implement VMA tracking using a balanced binary tree approach.
Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this... Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: Accidentally switched order of state ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18289/files - new: https://git.openjdk.org/jdk/pull/18289/files/1f0e0265..3793046e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=29 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=28-29 Stats: 2 lines in 1 file changed: 1 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18289.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289 PR: https://git.openjdk.org/jdk/pull/18289 From jsjolen at openjdk.org Tue Apr 9 14:37:26 2024 From: jsjolen at openjdk.org (Johan Sjölen) Date: Tue, 9 Apr 2024 14:37:26 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v29] In-Reply-To: References: Message-ID: On Tue, 9 Apr 2024 13:41:35 GMT, Thomas Stuefe wrote: > > Right, the refactoring to remove the `friend` declaration has completely fumbled the code. I'll probably force a revert on this to the state before that or do a git bisect to find the bugs. Right now the code is basically borked. > > Last good hash: [7445999](https://github.com/openjdk/jdk/commit/7445999ee296872320f91146e1004026ba1133c7) > > God, sorry. Do as you think is best. > > I plan to look at this PR, but probably it will not be this week. > > Love your commit messages btw. Nah, it's alright. It was literally that my getter `right()` returned `_left`, this was of course impossible for me to see after staring at the code for too long. Fixing that did lead to a good commit message, I'm happy you appreciate them :-).
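For illustration only, the kind of slip described here can be sketched with a minimal tree node and a check that would flag swapped accessors. The names `Node`, `left()`, `right()` and `accessors_consistent()` are hypothetical and are not taken from the PR; the actual treap code may look quite different.

```c++
#include <cassert>

// Hypothetical node type, for illustration only (not the PR's treap node).
struct Node {
  int   _key;
  Node* _left  = nullptr;
  Node* _right = nullptr;

  explicit Node(int key) : _key(key) {}

  Node* left()  const { return _left; }
  // The slip described above would correspond to returning _left here.
  Node* right() const { return _right; }
};

// Returns true if each accessor reports the child stored in its own field.
inline bool accessors_consistent() {
  Node parent(2), small(1), large(3);
  parent._left  = &small;
  parent._right = &large;
  return parent.left() == &small && parent.right() == &large;
}
```

A check of this shape only needs distinct left and right children; with the swapped getter, `accessors_consistent()` would return false.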
------------- PR Comment: https://git.openjdk.org/jdk/pull/18289#issuecomment-2045331994 From jsjolen at openjdk.org Tue Apr 9 14:37:26 2024 From: jsjolen at openjdk.org (Johan Sjölen) Date: Tue, 9 Apr 2024 14:37:26 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v29] In-Reply-To: References: Message-ID: On Tue, 9 Apr 2024 13:34:25 GMT, Johan Sjölen wrote: >> Hi, >> >> This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix. >> >> ## `MemoryFileTracker` >> >> The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: >> >> ```c++ >> static MemoryFile* make_device(const char* descriptive_name); >> static void free_device(MemoryFile* device); >> >> static void allocate_memory(MemoryFile* device, size_t offset, size_t size, >> MEMFLAGS flag, const NativeCallStack& stack); >> static void free_memory(MemoryFile* device, size_t offset, size_t size); >> ``` >> >> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: >> >> ```c++ >> void ZNMT::reserve(zaddress_unsafe start, size_t size) { >> MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); >> } >> void ZNMT::commit(zoffset offset, size_t size) { >> MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC); >> } >> void ZNMT::uncommit(zoffset offset, size_t size) { >> MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); >> } >> >> void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { >> // NMT doesn't track mappings at the moment.
>> } >> void ZNMT::unmap(zaddress_unsafe addr, size_t size) { >> // NMT doesn't track mappings at the moment. >> } >> ``` >> >> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. >> >> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: >> >> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. Our Treap-based approach in this patch gives a performance bo... > > Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: > > Style and copyright fix I decided on the more generic `IntervalChange`, `IntervalState` and `StateType` names. This is naming, I'm perfectly fine switching it to something else. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18289#issuecomment-2045338700 From eosterlund at openjdk.org Tue Apr 9 15:27:10 2024 From: eosterlund at openjdk.org (Erik Österlund) Date: Tue, 9 Apr 2024 15:27:10 GMT Subject: RFR: 8329628: Additional changes after JDK-8329332 [v2] In-Reply-To: References: Message-ID: On Mon, 8 Apr 2024 20:33:34 GMT, Vladimir Kozlov wrote: >> Additional clean up based on comments (mostly Stefan's) during reviews for [JDK-8329332: Remove CompiledMethod and CodeBlobLayout classes](https://bugs.openjdk.org/browse/JDK-8329332). >> - Renamed `CompiledMethod_lock` to `NMethod_lock`. (I decided to not change JVMTI's `CompiledMethod[Load|Unload]` names). >> - Renamed `NMethodIterator::all_blobs` to `NMethodIterator::all`.
>> - Moved `get_deopt_original_pc()` method from `nmethod` to `frame` class. >> - Reverted `CodeCache::find_nmethod()` to previous functionality to allow returning `nullptr` and be consistent with `find_blob()`. >> - Cleanup some `(nmethod*)` casts. >> - Use `for (CodeHeap* heap : *_nmethod_heaps) ` in `CodeCache::nmethod_count()` (it was @stefank's suggestion, I don't know how this C++ magic works). I verified it running with `-XX:+PrintNMethodStatistics`. >> >> Testing tier1-3,xcomp,stress > > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Addresse comments Marked as reviewed by eosterlund (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18665#pullrequestreview-1989387186 From mli at openjdk.org Tue Apr 9 15:36:13 2024 From: mli at openjdk.org (Hamlin Li) Date: Tue, 9 Apr 2024 15:36:13 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v2] In-Reply-To: References: <0cUurmXlMJ_B66Wy1umd2n4r9ve7_Q4WOU0ffMd8s5Y=.bbc93b65-382c-4139-aaec-cb835d94a06e@github.com> Message-ID: <9utr-LKgycFDqg9bdjf6zlINHJcnOhn-uLvU7u9Ls9E=.a1679bbe-c7cd-49cb-a168-4fa14852cee5@github.com> On Fri, 5 Apr 2024 12:17:17 GMT, Hamlin Li wrote: >> Hi, >> Can you help to review the patch? >> This pr is based on previous work and discussion in [pr 16234](https://github.com/openjdk/jdk/pull/16234), [pr 18294](https://github.com/openjdk/jdk/pull/18294). >> >> Compared with previous prs, the major change in this pr is to integrate the source of sleef (for the steps, please check `src/jdk.incubator.vector/linux/native/libvectormath/README`), rather than depending on external sleef things (header or lib) at build or run time. >> Besides this change, also modify the previous changes accordingly, e.g. remove some unnecessary files or changes especially in make dir of jdk. >> >> Besides the code changes, one important task is to handle the legal process. >> >> Thanks!
> > Hamlin Li has updated the pull request incrementally with two additional commits since the last revision: > > - disable unused-function warnings; add log msg > - minor Just a quick update, this pr introduces some performance regression compared with previous version (https://github.com/openjdk/jdk/pull/18294) for some math functions (e.g. Double256Vector.COS), and no regression for some others (e.g. Double256Vector.ACOS). I'm investigating. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18605#issuecomment-2045495626 From kvn at openjdk.org Tue Apr 9 15:37:13 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 9 Apr 2024 15:37:13 GMT Subject: RFR: 8329628: Additional changes after JDK-8329332 [v2] In-Reply-To: References: Message-ID: On Tue, 9 Apr 2024 06:56:24 GMT, Stefan Karlsson wrote: >> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Addresse comments > > Marked as reviewed by stefank (Reviewer). Thank you, @stefank and @fisk, for reviews. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18665#issuecomment-2045492029 From kvn at openjdk.org Tue Apr 9 15:37:13 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 9 Apr 2024 15:37:13 GMT Subject: Integrated: 8329628: Additional changes after JDK-8329332 In-Reply-To: References: Message-ID: On Fri, 5 Apr 2024 23:39:49 GMT, Vladimir Kozlov wrote: > Additional clean up based on comments (mostly Stefan's) during reviews for [JDK-8329332: Remove CompiledMethod and CodeBlobLayout classes](https://bugs.openjdk.org/browse/JDK-8329332). > - Renamed `CompiledMethod_lock` to `NMethod_lock`. (I decided to not change JVMTI's `CompiledMethod[Load|Unload]` names). > - Renamed `NMethodIterator::all_blobs` to `NMethodIterator::all`. > - Moved `get_deopt_original_pc()` method from `nmethod` to `frame` class. 
> - Reverted `CodeCache::find_nmethod()` to previous functionality to allow returning `nullptr` and be consistent with `find_blob()`. > - Cleanup some `(nmethod*)` casts. > - Use `for (CodeHeap* heap : *_nmethod_heaps) ` in `CodeCache::nmethod_count()` (it was @stefank's suggestion, I don't know how this C++ magic works). I verified it running with `-XX:+PrintNMethodStatistics`. > > Testing tier1-3,xcomp,stress This pull request has now been integrated. Changeset: 6736792b Author: Vladimir Kozlov URL: https://git.openjdk.org/jdk/commit/6736792b9a711b82b21a5f32cde55f2a3f15ffda Stats: 128 lines in 37 files changed: 23 ins; 21 del; 84 mod 8329628: Additional changes after JDK-8329332 Reviewed-by: stefank, eosterlund ------------- PR: https://git.openjdk.org/jdk/pull/18665 From eosterlund at openjdk.org Tue Apr 9 15:46:00 2024 From: eosterlund at openjdk.org (Erik Österlund) Date: Tue, 9 Apr 2024 15:46:00 GMT Subject: RFR: 8329488: Move OopStorage code from safepoint cleanup and remove safepoint cleanup code [v3] In-Reply-To: References: Message-ID: On Tue, 9 Apr 2024 13:56:28 GMT, Coleen Phillimore wrote: >> This patch gives the ServiceThread a periodic wakeup (same as GuaranteedSafepointInterval) to check if it needs to clean out OopStorage blocks, and move the triggering of this cleaning out of the safepoint cleanup tasks. Since ICBuffer, StringTable and SymbolTable rehashing have moved, there's nothing that actually triggers the nop safepoint to do cleaning (except SafepointALot), so the OopStorage cleanup won't be triggered. >> >> With moving all of these out of the safepoint cleanup tasks, we can remove the code that sets up multiple threads to do safepoint cleanup. We can also remove the JFR events and logging that times safepoint cleanup, and a logging test. >> >> Tested with tier1-4.
> > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Specify meaning of ServiceThreadCleanupInterval. I think I spotted a tiny issue; otherwise this looks great! src/hotspot/share/gc/shared/oopStorage.cpp line 910: > 908: // Set the request flag false and return its old value. > 909: // Needs to be atomic to avoid dropping a concurrent request. > 910: Atomic::release_store(&needs_cleanup_requested, false); The comment above seems to imply that needs_cleanup_requested can be set to true atomically and that we therefore have to use Atomic::xchg to make sure we don't drop a concurrent request for cleanup. Now the code has been changed to a release_store instead. If the comment is still right (I think it is?), then the code should change back to xchg. Otherwise the comment should be updated to describe why we don't need xchg any longer, I think. ------------- Changes requested by eosterlund (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18375#pullrequestreview-1989431987 PR Review Comment: https://git.openjdk.org/jdk/pull/18375#discussion_r1557880835 From kbarrett at openjdk.org Tue Apr 9 16:15:10 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 9 Apr 2024 16:15:10 GMT Subject: RFR: 8329488: Move OopStorage code from safepoint cleanup and remove safepoint cleanup code [v3] In-Reply-To: References: Message-ID: <-D7-AtzguOYwUaGTikkkbMAZxgwJegnd6waGVCRSnbI=.b4b4476e-cec2-49b8-9d4a-c92757e6c432@github.com> On Tue, 9 Apr 2024 15:42:50 GMT, Erik Österlund wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Specify meaning of ServiceThreadCleanupInterval. > > src/hotspot/share/gc/shared/oopStorage.cpp line 910: > >> 908: // Set the request flag false and return its old value. >> 909: // Needs to be atomic to avoid dropping a concurrent request.
>> 910: Atomic::release_store(&needs_cleanup_requested, false); > > The comment above seems to imply that needs_cleanup_requested can be set to true atomically and that we therefore have to use Atomic::xchg to make sure we don't drop a concurrent request for cleanup. Now the code has been changed to a release_store instead. If the comment is still right (I think it is?), then the code should change back to xchg. Otherwise the comment should be updated to describe why we don't need xchg any longer, I think. That comment is a left-over from the old mechanism, and is no longer true. This is the only place that sets it false (and is holding Service_lock so there can't be concurrent setters to false), and knows it is true (from the earlier test). The old protocol with the safepoint cleanup trigger was more complex, and this function didn't have the test of the flag. We probably don't even need a "release" here. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18375#discussion_r1557931603 From lucy at openjdk.org Tue Apr 9 16:20:12 2024 From: lucy at openjdk.org (Lutz Schmidt) Date: Tue, 9 Apr 2024 16:20:12 GMT Subject: RFR: JDK-8329605: hs errfile generic events - introduce sections for Frequent/NotFrequent Events [v3] In-Reply-To: References: <5GN6AKI0ud3DgU7-RX2-12eu87Me8jhzKXA-L8BwR04=.384ddd36-1a8f-40ac-9387-5d8d97c37fe3@github.com> Message-ID: <3oKW4EY92lYqg3_IPS3HmYN-rPr6SH12u8RnAUBiHjo=.8922bd3f-aaa5-42fa-90a9-5362b457f394@github.com> On Tue, 9 Apr 2024 07:31:23 GMT, Matthias Baesken wrote: >> Currently the 'generic' hs_errfile Events message log (filled by Events::log) is rather flooded by messages for memory protection operations. Those seem to occur quite often and move out other less frequent events, because the number of entries in the log is limited. >> It might be better to separate the frequent and less frequent events into 2 sections. The memory protection events would go into the frequent events section.
>> The mentioned memory protection operations related entries look like this: >> Event: 0.178 Protecting memory [0x000000016ebf0000,0x000000016ebfc000] with protection modes 0 > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > adjust typo LGTM - given the scope of this PR. In general, I don't like the event log to be split into multiple streams being printed separately. Yes, separate sections prevent displacement of events by other, too verbose, events. On the other hand, time coherence is lost or has to be manually re-established by the support engineer. Often enough, an issue can only be understood when seeing multiple/all events in timely order. Merging the event sections at print time by timestamp would be a helpful enhancement. ------------- Marked as reviewed by lucy (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18626#pullrequestreview-1989536222 From eosterlund at openjdk.org Tue Apr 9 16:22:09 2024 From: eosterlund at openjdk.org (Erik Österlund) Date: Tue, 9 Apr 2024 16:22:09 GMT Subject: RFR: 8329488: Move OopStorage code from safepoint cleanup and remove safepoint cleanup code [v3] In-Reply-To: <-D7-AtzguOYwUaGTikkkbMAZxgwJegnd6waGVCRSnbI=.b4b4476e-cec2-49b8-9d4a-c92757e6c432@github.com> References: <-D7-AtzguOYwUaGTikkkbMAZxgwJegnd6waGVCRSnbI=.b4b4476e-cec2-49b8-9d4a-c92757e6c432@github.com> Message-ID: On Tue, 9 Apr 2024 16:12:05 GMT, Kim Barrett wrote: >> src/hotspot/share/gc/shared/oopStorage.cpp line 910: >> >>> 908: // Set the request flag false and return its old value. >>> 909: // Needs to be atomic to avoid dropping a concurrent request. >>> 910: Atomic::release_store(&needs_cleanup_requested, false); >> >> The comment above seems to imply that needs_cleanup_requested can be set to true atomically and that we therefore have to use Atomic::xchg to make sure we don't drop a concurrent request for cleanup.
Now the code has been changed to a release_store instead. If the comment is still right (I think it is?), then the code should change back to xchg. Otherwise the comment should be updated to describe why we don't need xchg any longer, I think. > > That comment is a left-over from the old mechanism, and is no longer true. This is the only place > that sets it false (and is holding Service_lock so there can't be concurrent setters to false), and knows > it is true (from the earlier test). The old protocol with the safepoint cleanup trigger was more complex, > and this function didn't have the test of the flag. We probably don't even need a "release" here. I thought the comment about "avoid dropping a concurrent request" was rather talking about the flag being concurrently set from false to true (in record_needs_cleanup), but then immediately getting cleared to false here due to the lack of atomics, leading to the cleanup request being essentially ignored and handled as done, even though nothing was done. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18375#discussion_r1557945427 From kbarrett at openjdk.org Tue Apr 9 16:36:09 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 9 Apr 2024 16:36:09 GMT Subject: RFR: 8329488: Move OopStorage code from safepoint cleanup and remove safepoint cleanup code [v3] In-Reply-To: References: <-D7-AtzguOYwUaGTikkkbMAZxgwJegnd6waGVCRSnbI=.b4b4476e-cec2-49b8-9d4a-c92757e6c432@github.com> Message-ID: On Tue, 9 Apr 2024 16:19:50 GMT, Erik Österlund wrote: >> That comment is a left-over from the old mechanism, and is no longer true. This is the only place >> that sets it false (and is holding Service_lock so there can't be concurrent setters to false), and knows >> it is true (from the earlier test). The old protocol with the safepoint cleanup trigger was more complex, >> and this function didn't have the test of the flag. We probably don't even need a "release" here.
> > I thought the comment about "avoid dropping a concurrent request" was rather talking about the flag being concurrently set from false to true (in record_needs_cleanup), but then immediately getting cleared to false here due to the lack of atomics, leading to the cleanup request being essentially ignored and handled as done, even though nothing was done. If the flag was false at the preceding load-acquire then there wasn't a notification (yet) so we don't attempt to do any work this time around. But there's always next (now at least periodic) time. It was more complicated before the ServiceThread became periodic. (Possibly overly complicated. While discussing these changes with Coleen offline I had trouble understanding some of the old interaction and thought there was a much simpler way to accomplish what it was trying to do. But making the ServiceThread periodic allows even more simplification.) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18375#discussion_r1557969346 From stuefe at openjdk.org Tue Apr 9 17:02:09 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 9 Apr 2024 17:02:09 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: References: Message-ID: On Tue, 2 Apr 2024 10:26:42 GMT, Joachim Kern wrote: >> src/hotspot/os/aix/os_aix.cpp line 314: >> >>> 312: ErrnoPreserver ep; >>> 313: log_trace(os, map)("disclaim failed: " RANGEFMT " errno=(%s)", >>> 314: RANGEFMTARGS(p, (long)maxDisclaimSize), >> >> Wait, why are these casts needed? maxDisclaimSize is size_t, RANGEFMT uses SIZE_FORMAT. That should work without cast. 
> > Hi Thomas, `maxDisclaimSize` is of type `unsigned int`; therefore I get the following warning: > > os/aix/os_aix.cpp:314:42: error: format specifies type 'unsigned long' but the argument has type 'unsigned int' [-Werror,-Wformat] > RANGEFMTARGS(p, maxDisclaimSize), > ^~~~~~~~~~~~~~~ > > Should I keep the casts, or change the type of `maxDisclaimSize, numFullDisclaimsNeeded, lastDisclaimSize` to `const unsigned long`? I would change them to size_t. Thanks for doing this. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1558012122 From stuefe at openjdk.org Tue Apr 9 17:08:11 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 9 Apr 2024 17:08:11 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: References: Message-ID: On Tue, 2 Apr 2024 09:19:16 GMT, Joachim Kern wrote: >> Hi Thomas, >> I would like to get totally rid of this, because as I mentioned IBM already modified the `stdlib.h` header not using `#define malloc vec_malloc` any more (and all the other vec_... defines). We have to ask the adoptium colleagues at IBM if they already have raised their build environment by the 2 SP levels needed. >> In principle we had to do the same workaround for `calloc, free,...` too, but they didn't show up as errors in the logging files. >> These lines where never meant to stay for long. Just to be able to compile until IBM fixes the issue, which is done now. > > @suchismith1993 > Hi Suchi, can you please tell me when you will raise your build environment from AIX 7.2 TL5 SP5 to SP7? > I' am asking you, because I want to get rid of this nasty workaround. Pinging @sxa - what build environment does temurin use for AIX? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1558020493 From stuefe at openjdk.org Tue Apr 9 17:08:10 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 9 Apr 2024 17:08:10 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: References: Message-ID: <60xqHKyBKIqrzMqVisUO5M_lQLCNt7OYZ6XcovISOc0=.f4bc36e0-c3f7-4e40-b6b7-69ed46ca37e8@github.com> On Tue, 2 Apr 2024 16:14:12 GMT, Joachim Kern wrote: >> As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. Thus the old xlc toolchain was removed by [JDK-8327701](https://bugs.openjdk.org/browse/JDK-8327701). >> Now we also switch the HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc, removing the last xlc rudiment. >> This means merging the AIX specific content of utilities/globalDefinitions_xlc.hpp and utilities/compilerWarnings_xlc.hpp into the corresponding gcc files on the on side and removing the defined(TARGET_COMPILER_xlc) blocks in the code, because the defined(TARGET_COMPILER_gcc) blocks work out of the box for the new AIX compiler. 
>> The rest of the changes are needed because of using utilities/compilerWarnings_gcc.hpp the compiler is much more nagging about ill formatted printf > > Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: > > version check not needed anymore src/hotspot/os_cpu/aix_ppc/os_aix_ppc.cpp line 440: > 438: st->print("pc =" INTPTR_FORMAT " ", (unsigned long)uc->uc_mcontext.jmp_context.iar); > 439: st->print("lr =" INTPTR_FORMAT " ", (unsigned long)uc->uc_mcontext.jmp_context.lr); > 440: st->print("ctr=" INTPTR_FORMAT " ", (unsigned long)uc->uc_mcontext.jmp_context.ctr); p2i src/hotspot/os_cpu/aix_ppc/os_aix_ppc.cpp line 443: > 441: st->cr(); > 442: for (int i = 0; i < 32; i++) { > 443: st->print("r%-2d=" INTPTR_FORMAT " ", i, (unsigned long)uc->uc_mcontext.jmp_context.gpr[i]); p2i ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1558017408 PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1558017827 From kbarrett at openjdk.org Tue Apr 9 17:13:08 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 9 Apr 2024 17:13:08 GMT Subject: RFR: 8329488: Move OopStorage code from safepoint cleanup and remove safepoint cleanup code [v3] In-Reply-To: References: <6d2gjVM61eYbJYoLRsNskCaN87IMXXLi1v6RPEUlGJs=.7ca1d5d0-edc9-4273-872f-f3a37d465541@github.com> Message-ID: On Mon, 8 Apr 2024 14:00:19 GMT, Coleen Phillimore wrote: >> The ServiceThread loops because it has a fixed wait timeout now. It won't process these every time it's notified which can be more frequent. This comment seems to say that there's more work, so do it next time. I guess resetting the permit time to zero would prevent these cleanups from getting behind. Are there tests for this condition? > > If you can't delete the last block, you don't really want the service thread to try to clean up right away though? Only if you hit the limit of blocks to delete? Never mind. 
I misremembered how the work limiting operated. It's not a fixed limit on how much work to do. Rather, it's (roughly) process at most the number of blocks as there were in the list on entry. The point is that if other threads are allocating and then emptying blocks while we're working, that can't cause us to keep working for some potentially arbitrary amount of time. Also, the result indicating more work to do is unused by the ServiceThread. It is used by the gtest for delete_empty_blocks, but I think will never be true as used there. We could change ServiceThread to pay attention to the result, but that would make that code a bit more complicated, and because of how the "work limit" works I don't think there's much benefit to that. There would be if the "work limit" were some arbitrary fixed count, like 10 blocks or something, but since it's not... So I think just further updating the comment is sufficient. I think just keep the first sentence ("Exceeded ... last block.") and delete the rest, about making the ServiceThread loop. I might file a new RFE to do something useful with that bool result or eliminate it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18375#discussion_r1558034765 From iklam at openjdk.org Tue Apr 9 17:22:47 2024 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 9 Apr 2024 17:22:47 GMT Subject: RFR: 8329728: Read long lines in ClassListParser [v2] In-Reply-To: References: Message-ID: > Today the `ClassListParser` has a hard-coded limit of 4096 chars for each line in the CDS class list file. However, it's possible for a line to be much longer than that (64KB for the class name, plus extra information that can include path names, IDs, etc). > > I wrote a utility class `LineReader` that automatically allocates a buffer before calling `fgets()`. Hopefully this can be useful for other cases where we call `fgets()` with a fixed buffer size.
> > Max line width is limited to 4M to simplify testing (and avoid running into corner cases when we approach INT_MAX). Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: @matias9927 and @calvinccheung comments - limit line to 4M. Added gtest cases. Test for class names > 64K ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18669/files - new: https://git.openjdk.org/jdk/pull/18669/files/b0b004dd..034c29b8 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18669&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18669&range=00-01 Stats: 205 lines in 6 files changed: 170 ins; 3 del; 32 mod Patch: https://git.openjdk.org/jdk/pull/18669.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18669/head:pull/18669 PR: https://git.openjdk.org/jdk/pull/18669 From iklam at openjdk.org Tue Apr 9 17:27:10 2024 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 9 Apr 2024 17:27:10 GMT Subject: RFR: 8329728: Read long lines in ClassListParser [v2] In-Reply-To: References: Message-ID: On Mon, 8 Apr 2024 22:04:46 GMT, Calvin Cheung wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> @matias9927 and @calvinccheung comments - limit line to 4M. Added gtest cases. Test for class names > 64K > > src/hotspot/share/utilities/lineReader.cpp line 44: > >> 42: void LineReader::init(FILE* file) { >> 43: _file = file; >> 44: _buffer_len = 16; // start at small size to test expansion logic > > Maybe set the `_buffer_len` to a larger value (256?) for non-debug build? I changed to _buffer_len = DEBUG_ONLY(16) NOT_DEBUG(4096); > src/hotspot/share/utilities/lineReader.hpp line 48: > >> 46: } >> 47: >> 48: // Return one line from _file, as a NUL-terminated string. 
The length and contents of this > > Suggestion: NUL-terminated -> null-terminated I changed to `0-terminated` ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18669#discussion_r1558052689 PR Review Comment: https://git.openjdk.org/jdk/pull/18669#discussion_r1558052661 From iklam at openjdk.org Tue Apr 9 17:27:11 2024 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 9 Apr 2024 17:27:11 GMT Subject: RFR: 8329728: Read long lines in ClassListParser [v2] In-Reply-To: References: Message-ID: <27sBvbg3tysR7lqLphAd8AAfGCtLNZn-nx1i9nbdf1Q=.732ad076-5660-41c6-9f48-558c6e22f587@github.com> On Mon, 8 Apr 2024 21:03:52 GMT, Matias Saavedra Silva wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> @matias9927 and @calvinccheung comments - limit line to 4M. Added gtest cases. Test for class names > 64K > > src/hotspot/share/utilities/lineReader.cpp line 51: > >> 49: } >> 50: >> 51: char* LineReader::read_line() { > > Do you think it's worth adding an assert here to make sure the `lineReader `has been initialized? Done. > test/hotspot/jtreg/runtime/cds/appcds/customLoader/ClassListFormatA.java line 131: > >> 129: CDSTestUtils.createArchiveAndCheck(opts) >> 130: .shouldContain("Preload Warning: Cannot find " + longName) >> 131: .shouldContain("Preload Warning: Cannot find No/Such/ClassABCD"); > > Could you add a test that checks a line of max size to test overflows? As we discussed off-line, I limited the max width to 4M chars and added a few gtest cases for it. 
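Editorial sketch, not the code from the PR: the growing-buffer approach described in this thread (start small, call `fgets()`, enlarge the buffer whenever a line does not fit, give up at a fixed cap) could look roughly like the following. The class name `GrowingLineReader`, the use of `malloc`/`realloc`, and the exact growth policy are assumptions made for illustration only; the real `LineReader` in `src/hotspot/share/utilities/lineReader.cpp` may differ in all of these respects.

```cpp
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Hypothetical illustration only; not the LineReader class from PR 18669.
class GrowingLineReader {
  FILE*  _file;
  char*  _buffer;
  size_t _buffer_len;
  // Cap mirroring the 4M limit mentioned in the thread (assumed value).
  static constexpr size_t MAX_LEN = 4 * 1024 * 1024;

public:
  // initial_len must be at least 2 so fgets() can store one char plus the 0.
  explicit GrowingLineReader(FILE* file, size_t initial_len = 16)
    : _file(file),
      _buffer(static_cast<char*>(malloc(initial_len))),
      _buffer_len(initial_len) {}
  ~GrowingLineReader() { free(_buffer); }
  GrowingLineReader(const GrowingLineReader&) = delete;
  GrowingLineReader& operator=(const GrowingLineReader&) = delete;

  // Returns the next line as a 0-terminated string (including the trailing
  // newline, if the file has one), or nullptr at end of file, on allocation
  // failure, or when the line would exceed MAX_LEN. The returned pointer is
  // only valid until the next call.
  char* read_line() {
    if (_buffer == nullptr) return nullptr;
    size_t used = 0;
    for (;;) {
      if (fgets(_buffer + used, static_cast<int>(_buffer_len - used), _file) == nullptr) {
        // EOF (or error): hand back whatever was accumulated, if anything.
        return used > 0 ? _buffer : nullptr;
      }
      used += strlen(_buffer + used);
      if (used > 0 && _buffer[used - 1] == '\n') return _buffer; // complete line
      if (used + 1 < _buffer_len) return _buffer; // last line without newline
      // The buffer is full and the line is not finished yet: grow and continue.
      if (_buffer_len >= MAX_LEN) return nullptr;
      size_t new_len = _buffer_len * 2;
      if (new_len > MAX_LEN) new_len = MAX_LEN;
      char* p = static_cast<char*>(realloc(_buffer, new_len));
      if (p == nullptr) return nullptr;
      _buffer = p;
      _buffer_len = new_len;
    }
  }
};
```

Starting from a deliberately small buffer, as the debug build does in the review comment above, means even short test inputs exercise the expansion path.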
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18669#discussion_r1558051727 PR Review Comment: https://git.openjdk.org/jdk/pull/18669#discussion_r1558051751 From duke at openjdk.org Tue Apr 9 17:28:09 2024 From: duke at openjdk.org (Stewart X Addison) Date: Tue, 9 Apr 2024 17:28:09 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: References: Message-ID: On Tue, 9 Apr 2024 17:01:59 GMT, Thomas Stuefe wrote: >> @suchismith1993 >> Hi Suchi, can you please tell me when you will raise your build environment from AIX 7.2 TL5 SP5 to SP7? >> I' am asking you, because I want to get rid of this nasty workaround. > > Pinging @sxa - what build environment does temurin use for AIX? Currently XLC16 but looking to upgrade to XLC17 on the minimum supported level for it (So it wouldn't be SP7 at present). Thanks for the ping - we have no current plans to increase to SP7. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1558053537 From coleenp at openjdk.org Tue Apr 9 17:35:10 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 9 Apr 2024 17:35:10 GMT Subject: RFR: 8329488: Move OopStorage code from safepoint cleanup and remove safepoint cleanup code [v3] In-Reply-To: References: <-D7-AtzguOYwUaGTikkkbMAZxgwJegnd6waGVCRSnbI=.b4b4476e-cec2-49b8-9d4a-c92757e6c432@github.com> Message-ID: <7vZckLkhEWmlTpRHtBmdIX0EpPaGdDzUD9HHQEoRzQQ=.09e10796-5464-4c3f-bbd5-744a8e4cf7d7@github.com> On Tue, 9 Apr 2024 16:33:48 GMT, Kim Barrett wrote: >> I thought the comment about "avoid dropping a concurrent request" was rather talking about the flag being concurrently set from false to true (in record_needs_cleanup), but then immediately getting cleared to false here due to the lack of atomics, leading to the cleanup request being essentially ignored and handled as done, even though nothing was done. 
> > If the flag was false at the preceding load-acquire then there wasn't a notification (yet) so we don't > attempt to do any work this time around. But there's always next (now at least periodic) time. It was > more complicated before the ServiceThread became periodic. (Possibly overly complicated. While > discussing these changes with Coleen offline I had trouble understanding some of the old interaction > and thought there was a much simpler way to accomplish what it was trying to do. But making the > ServiceThread periodic allows even more simplification.) Here, if needs_cleanup_requested was true and then we set it to false, we still return true so a concurrent thread setting it to true is ok because we still return 'true'. That is, we need to do some work. If we return false here, while another thread is setting it to true, we'll get it on the next periodic timeout. The only reason for the store-release is because we use load-acquire and it's consistent. The comment refers to the old behaviour which was more complicated. I removed the comment to remove the confusion. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18375#discussion_r1558059850 From coleenp at openjdk.org Tue Apr 9 17:35:10 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 9 Apr 2024 17:35:10 GMT Subject: RFR: 8329488: Move OopStorage code from safepoint cleanup and remove safepoint cleanup code [v3] In-Reply-To: <7vZckLkhEWmlTpRHtBmdIX0EpPaGdDzUD9HHQEoRzQQ=.09e10796-5464-4c3f-bbd5-744a8e4cf7d7@github.com> References: <-D7-AtzguOYwUaGTikkkbMAZxgwJegnd6waGVCRSnbI=.b4b4476e-cec2-49b8-9d4a-c92757e6c432@github.com> <7vZckLkhEWmlTpRHtBmdIX0EpPaGdDzUD9HHQEoRzQQ=.09e10796-5464-4c3f-bbd5-744a8e4cf7d7@github.com> Message-ID: On Tue, 9 Apr 2024 17:30:55 GMT, Coleen Phillimore wrote: >> If the flag was false at the preceding load-acquire then there wasn't a notification (yet) so we don't >> attempt to do any work this time around. 
But there's always next (now at least periodic) time. It was >> more complicated before the ServiceThread became periodic. (Possibly overly complicated. While >> discussing these changes with Coleen offline I had trouble understanding some of the old interaction >> and thought there was a much simpler way to accomplish what it was trying to do. But making the >> ServiceThread periodic allows even more simplification.) > > Here, if needs_cleanup_requested was true and then we set it to false, we still return true so a concurrent thread setting it to true is ok because we still return 'true'. That is, we need to do some work. If we return false here, while another thread is setting it to true, we'll get it on the next periodic timeout. > The only reason for the store-release is because we use load-acquire and it's consistent. The comment refers to the old behaviour which was more complicated. I removed the comment to remove the confusion. Ok, I think I restated what Kim said. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18375#discussion_r1558061080 From coleenp at openjdk.org Tue Apr 9 17:40:14 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 9 Apr 2024 17:40:14 GMT Subject: RFR: 8329488: Move OopStorage code from safepoint cleanup and remove safepoint cleanup code [v4] In-Reply-To: References: <6d2gjVM61eYbJYoLRsNskCaN87IMXXLi1v6RPEUlGJs=.7ca1d5d0-edc9-4273-872f-f3a37d465541@github.com> Message-ID: On Tue, 9 Apr 2024 17:10:04 GMT, Kim Barrett wrote: >> If you can't delete the last block, you don't really want the service thread to try to clean up right away though? Only if you hit the limit of blocks to delete? > > Never mind. I misremembered how the work limiting operated. It's not a fixed > limit on how much work to do. Rather, it's (roughly) process at most the > number of blocks as there were in the list on entry. 
The point is that if > other threads are allocating and then emptying blocks while we're working, > that can't cause us to keep working for some potentially arbitrary amount of > time. > > Also, the result indicating more work to do is unused by the ServiceThread. > It is used by the gtest for delete_empty_blocks, but I think will never be > true as used there. > > We could change ServiceThread to pay attention to the result, but that would > make that code a bit more complicated, and because of how the "work limit" > works I don't think there's much benefit to that. There would be if the "work > limit" were some arbitrary fixed count, like 10 blocks or something, but since > it's not... > > So I think just further updating the comment is sufficient. I think just keep > the first sentence ("Exceeded ... last block.") and delete the rest, about > making the ServiceThread loop. > > I might file a new RFE to do something useful with that bool result or > eliminate it. But we still want to have record_needs_cleanup() right? So that the ServiceThread will find work to do on the next iteration, but it's true that it wont *cause* the service thread to loop. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18375#discussion_r1558065125 From coleenp at openjdk.org Tue Apr 9 17:40:14 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 9 Apr 2024 17:40:14 GMT Subject: RFR: 8329488: Move OopStorage code from safepoint cleanup and remove safepoint cleanup code [v4] In-Reply-To: References: Message-ID: > This patch gives the ServiceThread a periodic wakeup (same as GuaranteedSafepointInterval) to check if it needs to clean out OopStorage blocks, and move the triggering of this cleaning out of the safepoint cleanup tasks. Since ICBuffer, StringTable and SymbolTable rehashing have moved, there's nothing that actually triggers the nop safepoint to do cleaning (except SafepointALot), so the OopStorage cleanup won't be triggered. 
> > With moving all of these out of the safepoint cleanup tasks, we can remove the code that sets up multiple threads to do safepoint cleanup. We can also remove the JFR events and logging that times safepoint cleanup, and a logging test. > > Tested with tier1-4. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: More comment updates. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18375/files - new: https://git.openjdk.org/jdk/pull/18375/files/44df061f..4f2c739a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18375&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18375&range=02-03 Stats: 4 lines in 1 file changed: 0 ins; 2 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/18375.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18375/head:pull/18375 PR: https://git.openjdk.org/jdk/pull/18375 From kbarrett at openjdk.org Tue Apr 9 18:12:10 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 9 Apr 2024 18:12:10 GMT Subject: RFR: 8329488: Move OopStorage code from safepoint cleanup and remove safepoint cleanup code [v2] In-Reply-To: References: Message-ID: On Tue, 9 Apr 2024 06:57:10 GMT, David Holmes wrote: > Looks like a good cleanup! Do we have any data on how often and how much oopStorage needs cleaning up? Any time you go to a polling based approach there are concerns that it may be too frequent or too infrequent. What kind of applications tend to require a lot of oopStorage cleaning? It was previously intended to be polling too, via safepoint cleanups with the minimum safepoint cleanup period. Except the latter has apparently not been operative for some time. So in some sense the oopstorage related changes are a bug fix. The chosen rates are (and were) pretty arbitrary. We don't want unused blocks to hang around indefinitely. On the other hand, it's wasteful to free empty blocks only to have the application need to allocate new blocks soon after.
An application that allocates lots of storage entries, uses them for a while, releases them, and is then quiescent (perhaps only for a while), may provide grist for cleaning. String deduplication can have phases like that if some strings are "medium" lifetime. I think graal (and maybe C2?) can have that kind of phasing too. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18375#issuecomment-2045805173 From kbarrett at openjdk.org Tue Apr 9 18:12:10 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 9 Apr 2024 18:12:10 GMT Subject: RFR: 8329488: Move OopStorage code from safepoint cleanup and remove safepoint cleanup code [v4] In-Reply-To: References: <6d2gjVM61eYbJYoLRsNskCaN87IMXXLi1v6RPEUlGJs=.7ca1d5d0-edc9-4273-872f-f3a37d465541@github.com> Message-ID: On Tue, 9 Apr 2024 17:35:33 GMT, Coleen Phillimore wrote: >> Never mind. I misremembered how the work limiting operated. It's not a fixed >> limit on how much work to do. Rather, it's (roughly) process at most the >> number of blocks as there were in the list on entry. The point is that if >> other threads are allocating and then emptying blocks while we're working, >> that can't cause us to keep working for some potentially arbitrary amount of >> time. >> >> Also, the result indicating more work to do is unused by the ServiceThread. >> It is used by the gtest for delete_empty_blocks, but I think will never be >> true as used there. >> >> We could change ServiceThread to pay attention to the result, but that would >> make that code a bit more complicated, and because of how the "work limit" >> works I don't think there's much benefit to that. There would be if the "work >> limit" were some arbitrary fixed count, like 10 blocks or something, but since >> it's not... >> >> So I think just further updating the comment is sufficient. I think just keep >> the first sentence ("Exceeded ... last block.") and delete the rest, about >> making the ServiceThread loop. 
>> >> I might file a new RFE to do something useful with that bool result or >> eliminate it. > > But we still want to have record_needs_cleanup() right? So that the ServiceThread will find work to do on the next iteration, but it's true that it wont *cause* the service thread to loop. Yes, or at least not immediately loop, because of the deferral period. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18375#discussion_r1558101213 From cslucas at openjdk.org Tue Apr 9 19:10:30 2024 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Tue, 9 Apr 2024 19:10:30 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission [v12] In-Reply-To: References: Message-ID: > # Description > > Please review this PR with a patch to re-use the same C2_MacroAssembler object to emit all instructions in the same compilation unit. > > Overall, the change is pretty simple. However, due to the renaming of the variable to access C2_MacroAssembler, from `_masm.` to `masm->`, and also some method prototype changes, the patch became quite large. > > # Help Needed for Testing > > I don't have access to all platforms necessary to test this. I hope some other folks can help with testing on `S390`, `RISC-V` and `PPC`. 
Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: Fix ARM32 AD file ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16484/files - new: https://git.openjdk.org/jdk/pull/16484/files/b4d73c98..693c7ef8 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16484&range=11 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16484&range=10-11 Stats: 21 lines in 1 file changed: 12 ins; 0 del; 9 mod Patch: https://git.openjdk.org/jdk/pull/16484.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16484/head:pull/16484 PR: https://git.openjdk.org/jdk/pull/16484 From cslucas at openjdk.org Tue Apr 9 19:10:33 2024 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Tue, 9 Apr 2024 19:10:33 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission [v11] In-Reply-To: References: Message-ID: On Mon, 8 Apr 2024 06:09:14 GMT, Boris Ulasevich wrote: >> Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 12 commits: >> >> - Merge remote-tracking branch 'origin/master' into reuse-macroasm >> - Fix AArch64 build & improve comment about InstructionMark >> - Catching up with changes in master >> - Catching up with origin/master >> - Catch up with origin/master >> - Merge with origin/master >> - Fix build, copyright dates, m4 files. >> - Fix merge >> - Catch up with master branch. >> >> Merge remote-tracking branch 'origin/master' into reuse-macroasm >> - Some inst_mark fixes; Catch up with master. >> - ... and 2 more: https://git.openjdk.org/jdk/compare/89e0889a...b4d73c98 > > Do you need help understanding the problem? The crash occurred because you removed the line `fprintf(fp, " cbuf.set_insts_mark();\n");` from the generator of AD nodes ::emit() methods. That is why emit_call_reloc finds cbuf.insts->_mark uninitialized.
> > > // Call Runtime Instruction > instruct CallRuntimeDirect(method meth) %{ > match(CallRuntime); > effect(USE meth); > ins_cost(CALL_COST); > format %{ "CALL,runtime" %} > ins_encode( Java_To_Runtime( meth ), > call_epilog ); > ins_pipe(simple_call); > %} > > --> > > void CallRuntimeDirectNode::emit(CodeBuffer& cbuf, PhaseRegAlloc* ra_) const { > cbuf.set_insts_mark(); > // Start at oper_input_base() and count operands > unsigned idx0 = 1; > unsigned idx1 = 1; // > { > #line 1217 "/home/boris/jdk-bulasevich/src/hotspot/cpu/arm/arm.ad" > // CALL directly to the runtime > emit_call_reloc(cbuf, as_MachCall(), opnd_array(1), runtime_call_Relocation::spec()); > #line 999999 > } > { > #line 1213 "/home/boris/jdk-bulasevich/src/hotspot/cpu/arm/arm.ad" > // nothing > #line 999999 > } > } @bulasevich - I just pushed a fix for ARM32. Can you please run your tests again? Thanks! ------------- PR Comment: https://git.openjdk.org/jdk/pull/16484#issuecomment-2045889314 From kbarrett at openjdk.org Tue Apr 9 19:24:11 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 9 Apr 2024 19:24:11 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: References: Message-ID: On Tue, 2 Apr 2024 16:14:12 GMT, Joachim Kern wrote: >> As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. Thus the old xlc toolchain was removed by [JDK-8327701](https://bugs.openjdk.org/browse/JDK-8327701). >> Now we also switch the HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc, removing the last xlc rudiment. 
>> This means merging the AIX specific content of utilities/globalDefinitions_xlc.hpp and utilities/compilerWarnings_xlc.hpp into the corresponding gcc files on the on side and removing the defined(TARGET_COMPILER_xlc) blocks in the code, because the defined(TARGET_COMPILER_gcc) blocks work out of the box for the new AIX compiler. >> The rest of the changes are needed because of using utilities/compilerWarnings_gcc.hpp the compiler is much more nagging about ill formatted printf > > Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: > > version check not needed anymore Changes requested by kbarrett (Reviewer). src/hotspot/share/utilities/byteswap.hpp line 2: > 1: /* > 2: * Copyright (c) 2023, Google and/or its affiliates. All rights reserved. Don't drop the creation year. src/hotspot/share/utilities/globalDefinitions_gcc.hpp line 36: > 34: #if defined(_AIX) > 35: #include > 36: #endif I would much rather see this include added in the few places it was actually needed, rather than being added here. ------------- PR Review: https://git.openjdk.org/jdk/pull/18536#pullrequestreview-1989864573 PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1558124034 PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1558172309 From lmesnik at openjdk.org Tue Apr 9 19:51:00 2024 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Tue, 9 Apr 2024 19:51:00 GMT Subject: RFR: 8329674: JvmtiEnvThreadState::reset_current_location function should use JvmtiHandshake In-Reply-To: References: Message-ID: On Thu, 4 Apr 2024 15:28:41 GMT, Serguei Spitsyn wrote: > The internal JVM TI JvmtiHandshake and JvmtiUnitedHandshakeClosure classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. 
This enhancement is to refactor the JVM TI internal functions JvmtiEnvThreadState::reset_current_location on the base of JvmtiHandshake and JvmtiUnitedHandshakeClosure classes. > > Testing: > - Ran mach5 tiers 1-6 The fix looks good. It would be better to either rename doit methods to something more specific or even to move code into do_thread and do_vthread. And make do_vthread like void do_vthread(Handle target_h) { if (_target_jt != nullptr) { do_thread(_target_jt); } else { < code for unmounted > } } ------------- Marked as reviewed by lmesnik (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18630#pullrequestreview-1990029643 From mikael at openjdk.org Tue Apr 9 20:13:01 2024 From: mikael at openjdk.org (Mikael Vidstedt) Date: Tue, 9 Apr 2024 20:13:01 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v2] In-Reply-To: References: <0cUurmXlMJ_B66Wy1umd2n4r9ve7_Q4WOU0ffMd8s5Y=.bbc93b65-382c-4139-aaec-cb835d94a06e@github.com> Message-ID: On Fri, 5 Apr 2024 12:17:17 GMT, Hamlin Li wrote: >> Hi, >> Can you help to review the patch? >> This pr is based on previous work and discussion in [pr 16234](https://github.com/openjdk/jdk/pull/16234), [pr 18294](https://github.com/openjdk/jdk/pull/18294). >> >> Compared with previous prs, the major change in this pr is to integrate the source of sleef (for the steps, please check `src/jdk.incubator.vector/linux/native/libvectormath/README`), rather than depends on external sleef things (header or lib) at build or run time. >> Besides of this change, also modify the previous changes accordingly, e.g. remove some uncessary files or changes especially in make dir of jdk. >> >> Besides of the code changes, one important task is to handle the legal process. >> >> Thanks! 
> > Hamlin Li has updated the pull request incrementally with two additional commits since the last revision: > > - disable unused-function warnings; add log msg > - minor Thank you for the update and for working on this in general. I've started working on JDK-8329816, preparing the change for the SLEEF specific part of the change. Specifically, I'm currently planning on including the three SLEEF header files, the README and a legal/sleef.md file in that change. Let me know if you have any thoughts/concerns. Also, just for my understanding, would love to understand your thoughts on the future here (I apologize if this was already discussed elsewhere): It seem like SLEEF is (sort of) limited to linux at this point (the SLEEF README mentions that "Due to limited test capacities, SLEEF is currently only officially supported on Linux with gcc or llvm/clang." ). That same README does, however, indicate good test coverage on several architectures in addition to aarch64 (including x86_64, PPC, RISC-V). With that in mind, it looks like we could potentially use SLEEF for other architectures on linux in the future? And potentially additional operating systems as well? ------------- PR Comment: https://git.openjdk.org/jdk/pull/18605#issuecomment-2045972249 From pchilanomate at openjdk.org Tue Apr 9 20:16:28 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 9 Apr 2024 20:16:28 GMT Subject: RFR: 8325469: Freeze/Thaw code can crash in the presence of OSR frames [v3] In-Reply-To: References: Message-ID: > Freeze/thaw code assumes that a compiled frame for a method where num_stack_arg_slots() > 0 will always have the arguments setup above the metadata at the bottom of the frame. But when converting an interpreter frame to a compiled frame during OSR we don't explicitly leave room for the stack arguments after popping the interpreter frame. 
All parameters needed will be read from the "buf" array and stored inside the frame before calling OSR_migration_end(). > > This mismatch in how the stack looks and what we assume can lead to different crashes. In particular the issue happens when the OSR conversion happens for the bottom-most frame in the stack. If the OSR frame has a caller in the stack then there is no issue on freezing/thawing. I added more details about this in the bug comments. > > When the OSR conversion happens for the bottom-most frame then a future freeze/thaw can lead to crashes for all cases: freeze_fast/thaw_fast, freeze_fast/thaw_slow, freeze_slow/thaw_slow. When freezing fast, either thawing fast or slow can lead to trying to read past the bottom of the stackChunk or writing below the allocated space in the stack. The freeze slow case is almost okay, except that it uncovered an invalid assert that is triggered if the size of the OSR frame plus all the other frames we freeze takes less space than the size of locals minus parameters of the interpreter frame that was OSR. I also added more details about these in the bug comments. > > I tested different fixes, but I think the most straightforward one is to add _num_stack_arg_slots in the nmethod class and initialize it accordingly depending on whether the nmethod is an OSR one or not. > > The patch includes a new test that exercises all these possible combinations of OSR frame at bottom of stack or not, and then freezing fast/slow and thawing fast/slow. The bottom case where we freeze fast and thaw slow reproduces the originally reported crash. There are actually two different failure modes depending on whether this is a thaw top or return barrier case. The other bottom cases lead to the other crashes described in the bug comments. > The new test uncovers another bug besides the OSR issues, but since it's a different one I filed a separate JBS issue (JDK-8329665) and I made this a dependent PR.
> > I tested the current patch with the new test and also ran it through mach5 tiers1-6. > > Thanks, > Patricio Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: use WhiteBox to verify OSR compilation ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18637/files - new: https://git.openjdk.org/jdk/pull/18637/files/b35306f8..ab275358 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18637&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18637&range=01-02 Stats: 32 lines in 1 file changed: 21 ins; 0 del; 11 mod Patch: https://git.openjdk.org/jdk/pull/18637.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18637/head:pull/18637 PR: https://git.openjdk.org/jdk/pull/18637 From pchilanomate at openjdk.org Tue Apr 9 20:47:02 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 9 Apr 2024 20:47:02 GMT Subject: RFR: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 [v4] In-Reply-To: References: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> Message-ID: On Tue, 9 Apr 2024 00:58:24 GMT, David Holmes wrote: >> The crux of the problem here is that the virtual thread code was not keeping the held-monitor-count and jni-monitor-count in sync under all conditions. So if a vthread acquired a monitor via JNI but failed to unlock it before terminating, the underlying platform thread's counts were out of sync and if it terminated we would trigger the assertion that checks for such things. >> >> The actual fix is very simple: we zero the platform thread's jni-monitor-count in `continuation_enter_cleanup` the same way we zero the held-monitor-count. In addition we apply the same `CheckJNICalls` check for this unbalanced locking and issue a warning in the virtual thread case. The fact that this happens in asm code complicates matters.
>> >> The existing `JNIMonitor.java` test is greatly expanded to test these scenarios and check the unified logging output. >> >> Other minor changes involve expanding some of the other assertions relating to the two counts so we can detect a mismatch earlier without a need for the thread to terminate. And the test that originally uncovered the problem (`GetOwnedMonitorInfoTest.java`) has some minor adjustments to enhance diagnostics. >> >> I've provided the fix for all architectures that support continuations: x64, aarch64, riscv and ppc. The latter both build okay in GHA but I can't actually test them with the updated test. So some assistance from RISCV folk (@robehn ?) and PPC folk (??) would be appreciated (otherwise any issues will have to be handled as follow-up fixes). >> >> The changes are structured so that there is no extra code executed in product builds unless `CheckJNICalls` is set. This means that product builds will not keep the JNI count in sync with the held count, unless `CheckJNICalls` is set. This could trip up a future logging entry or explicit check of the JNI count, but it is expected that these counts will be removed once ObjectMonitor usage will not force virtual thread pinning. >> >> Testing: >> - regression test 10x on all x64 and aarch64 platforms >> - tiers 1-4 >> - GHA >> >> >> Thanks to @pchilano for help working out the best form of the fix and the initial asm for x64. >> >> Thanks to @fbredber for the Aarch64 and RISCV asm code. >> >> Thanks > > David Holmes has updated the pull request incrementally with one additional commit since the last revision: > > Avoid unnecessary store when count was already zero. Looks good to me, thanks for working this one out David. src/hotspot/share/runtime/javaThread.cpp line 929: > 927: } > 928: > 929: if (CheckJNICalls && jni_monitor_count() > 0) { Now that we have the else branch above, shouldn't we move this conditional there? Since for the detach mode case we have already released all monitors.
test/hotspot/jtreg/runtime/vthread/JNIMonitor/JNIMonitor.java line 117: > 115: // The following is a hack to trick the pool worker threads into terminating > 116: // after one second (default keep-alive / parallelism). > 117: "-Djdk.virtualThreadScheduler.parallelism=30", Leftover from old testing? test/hotspot/jtreg/runtime/vthread/JNIMonitor/JNIMonitor.java line 202: > 200: > 201: // This gives us a way to control the scheduler used for our virtual threads. The test > 202: // only works as intended then the virtual threads run on the same carrier thread (as Nit: s/then/when test/hotspot/jtreg/runtime/vthread/JNIMonitor/JNIMonitor.java line 203: > 201: // This gives us a way to control the scheduler used for our virtual threads. The test > 202: // only works as intended then the virtual threads run on the same carrier thread (as > 203: // that carrier maintains ownership of the monitor if the virtual thread fails to unlock it. Nit: Missing ')'. ------------- Marked as reviewed by pchilanomate (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18445#pullrequestreview-1990087917 PR Review Comment: https://git.openjdk.org/jdk/pull/18445#discussion_r1558247423 PR Review Comment: https://git.openjdk.org/jdk/pull/18445#discussion_r1558250398 PR Review Comment: https://git.openjdk.org/jdk/pull/18445#discussion_r1558251160 PR Review Comment: https://git.openjdk.org/jdk/pull/18445#discussion_r1558251437 From dholmes at openjdk.org Tue Apr 9 21:42:12 2024 From: dholmes at openjdk.org (David Holmes) Date: Tue, 9 Apr 2024 21:42:12 GMT Subject: RFR: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 [v4] In-Reply-To: References: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> Message-ID: On Tue, 9 Apr 2024 20:27:23 GMT, Patricio Chilano Mateo wrote: >> David Holmes has updated the pull request incrementally with one additional commit since the last revision: >> >> Avoid unnecessary store when count was already zero. > > src/hotspot/share/runtime/javaThread.cpp line 929: > >> 927: } >> 928: >> 929: if (CheckJNICalls && jni_monitor_count() > 0) { > > Now that we have the else branch above, shouldn't we move this conditional there? Since for the detach mode case we have already released all monitors. I see what you mean, but I will keep it here just in case something has gone wrong in a release build. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18445#discussion_r1558326399 From dholmes at openjdk.org Tue Apr 9 21:49:17 2024 From: dholmes at openjdk.org (David Holmes) Date: Tue, 9 Apr 2024 21:49:17 GMT Subject: RFR: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 [v4] In-Reply-To: References: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> Message-ID: On Tue, 9 Apr 2024 20:44:53 GMT, Patricio Chilano Mateo wrote: >> David Holmes has updated the pull request incrementally with one additional commit since the last revision: >> >> Avoid unnecessary store when count was already zero. > > Looks good to me, thanks for working this one out David. Thanks for the review @pchilano . > test/hotspot/jtreg/runtime/vthread/JNIMonitor/JNIMonitor.java line 117: > >> 115: // The following is a hack to trick the pool worker threads into terminating >> 116: // after one second (default keep-alive / parallelism). >> 117: "-Djdk.virtualThreadScheduler.parallelism=30", > > Leftover from old testing? Yep - well spotted. Fixed. Also fixed incorrect comment about the add-opens. > test/hotspot/jtreg/runtime/vthread/JNIMonitor/JNIMonitor.java line 202: > >> 200: >> 201: // This gives us a way to control the scheduler used for our virtual threads. The test >> 202: // only works as intended then the virtual threads run on the same carrier thread (as > > Nit: s/then/when Fixed > test/hotspot/jtreg/runtime/vthread/JNIMonitor/JNIMonitor.java line 203: > >> 201: // This gives us a way to control the scheduler used for our virtual threads. The test >> 202: // only works as intended then the virtual threads run on the same carrier thread (as >> 203: // that carrier maintains ownership of the monitor if the virtual thread fails to unlock it. > > Nit: Missing ')'. 
Fixed ------------- PR Comment: https://git.openjdk.org/jdk/pull/18445#issuecomment-2046101799 PR Review Comment: https://git.openjdk.org/jdk/pull/18445#discussion_r1558332614 PR Review Comment: https://git.openjdk.org/jdk/pull/18445#discussion_r1558333784 PR Review Comment: https://git.openjdk.org/jdk/pull/18445#discussion_r1558333911 From dholmes at openjdk.org Tue Apr 9 22:00:22 2024 From: dholmes at openjdk.org (David Holmes) Date: Tue, 9 Apr 2024 22:00:22 GMT Subject: RFR: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 [v5] In-Reply-To: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> References: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> Message-ID: > The crux of the problem here is that the virtual thread code was not keeping the held-monitor-count and jni-monitor-count in sync under all conditions. So if a vthread acquired a monitor via JNI but failed to unlock it before terminating, the underlying platform thread's counts were out of sync and if it terminated we would trigger the assertion that checks for such things. > > The actual fix is very simple: we zero the platform thread's jni-monitor-count in `continuation_enter_cleanup` the same way we zero the held-monitor-count. In addition we apply the same `CheckJNICalls` check for this unbalanced locking and issue a warning in the virtual thread case. The fact that this happens in asm code complicates matters. > > The existing `JNIMonitor.java` test is greatly expanded to test these scenarios and check the unified logging output. > > Other minor changes involve expanding some of the other assertions relating to the two counts so we can detect a mismatch earlier without a need for the thread to terminate. And the test that originally uncovered the problem (`GetOwnedMonitorInfoTest.java`) has some minor adjustments to enhance diagnostics.
> > I've provided the fix for all architectures that support continuations: x64, aarch64, riscv and ppc. The latter both build okay in GHA but I can't actually test them with the updated test. So some assistance from RISCV folk (@robehn ?) and PPC folk (??) would be appreciated (otherwise any issues will have to be handled as follow-up fixes). > > The changes are structured so that there is no extra code executed in product builds unless `CheckJNICalls` is set. This means that product builds will not keep the JNI count in sync with the held count, unless `CheckJNICalls` is set. This could trip up a future logging entry or explicit check of the JNI count, but it is expected that these counts will be removed once ObjectMonitor usage will not force virtual thread pinning. > > Testing: > - regression test 10x on all x64 and aarch64 platforms > - tiers 1-4 > - GHA > > > Thanks to @pchilano for help working out the best form of the fix and the initial asm for x64. > > Thanks to @fbredber for the Aarch64 and RISCV asm code.
> > Thanks David Holmes has updated the pull request incrementally with one additional commit since the last revision: Cleanup test leftovers ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18445/files - new: https://git.openjdk.org/jdk/pull/18445/files/5891800b..70f43301 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18445&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18445&range=03-04 Stats: 6 lines in 1 file changed: 0 ins; 3 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/18445.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18445/head:pull/18445 PR: https://git.openjdk.org/jdk/pull/18445 From rpressler at openjdk.org Tue Apr 9 22:22:10 2024 From: rpressler at openjdk.org (Ron Pressler) Date: Tue, 9 Apr 2024 22:22:10 GMT Subject: RFR: 8325469: Freeze/Thaw code can crash in the presence of OSR frames [v3] In-Reply-To: References: Message-ID: On Tue, 9 Apr 2024 20:16:28 GMT, Patricio Chilano Mateo wrote: >> Freeze/thaw code assumes that a compiled frame for a method where num_stack_arg_slots() > 0 will always have the arguments setup above the metadata at the bottom of the frame. But when converting an interpreter frame to a compiled frame during OSR we don't explicitly leave room for the stack arguments after popping the interpreter frame. All parameters needed will be read from the "buf" array and stored inside the frame before calling OSR_migration_end(). >> >> This mismatch in how the stack looks and what we assume can lead to different crashes. In particular the issue happens when the OSR conversion happens for the bottom-most frame in the stack. If the OSR frame has a caller in the stack then there is no issue on freezing/thawing. I added more details about this in the bug comments. >> >> When the OSR conversion happens for the bottom-most frame then a future freeze/thaw can lead to crashes for all cases: freeze_fast/thaw_fast, freeze_fast/thaw_slow, freeze_slow/thaw_slow.
When freezing fast, either thawing fast or slow can lead to trying to read past the bottom of the stackChunk or writing below the allocated space in the stack. The freeze slow case is almost okay, except that it uncovered an invalid assert that is triggered if the size of the OSR frame plus all the other frames we freeze takes less space than the size of locals minus parameters of the interpreter frame that was OSR. I also added more details about these in the bug comments. >> >> I tested different fixes, but I think the most straightforward one is to add _num_stack_arg_slots in the nmethod class and initialize it accordingly depending on whether the nmethod is an OSR one or not. >> >> The patch includes a new test that exercises all these possible combinations of OSR frame at bottom of stack or not, and then freezing fast/slow and thawing fast/slow. The bottom case where we freeze fast and thaw slow reproduces the originally reported crash. There are actually two different failure modes depending on whether this is a thaw top or return barrier case. The other bottom cases lead to the other crashes described in the bug comments. >> The new test uncovers another bug besides the OSR issues, but since it's a different one I filed a separate JBS issue (JDK-8329665) and I made this a dependent PR. >> >> I tested the current patch with the new test and also ran it through mach5 tiers1-6. >> >> Thanks, >> Patricio > > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > use WhiteBox to verify OSR compilation Marked as reviewed by rpressler (Committer).
------------- PR Review: https://git.openjdk.org/jdk/pull/18637#pullrequestreview-1990243343 From sspitsyn at openjdk.org Tue Apr 9 22:26:08 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 9 Apr 2024 22:26:08 GMT Subject: RFR: 8329432: PopFrame and ForceEarlyReturn functions should use JvmtiHandshake In-Reply-To: References: <5tcPHZX0nNTHbQqZfHRl2riTpJglQyGJ2hRJXyIMZPY=.4de7ac6d-dd84-4943-bab1-5dba67bf5cf0@github.com> Message-ID: On Tue, 9 Apr 2024 01:18:35 GMT, Leonid Mesnik wrote: >> The internal JVM TI `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. This enhancement is to refactor JVM TI functions `PopFrame` and `ForceEarlyReturn` on the base of `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes. >> >> Testing: >> >> Ran mach5 tiers 1-6 > > src/hotspot/share/prims/jvmtiEnvBase.hpp line 503: > >> 501: _value(value), >> 502: _tos(tos) {} >> 503: void doit(Thread *target, bool self); > > No need to use self, you might use _self from doit(). Good suggestion, thanks. The UpdateForPopTopFrameClosure::doit has the same issue. Fixed both now. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18570#discussion_r1558367607 From sspitsyn at openjdk.org Tue Apr 9 22:31:10 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 9 Apr 2024 22:31:10 GMT Subject: RFR: 8329432: PopFrame and ForceEarlyReturn functions should use JvmtiHandshake In-Reply-To: References: <5tcPHZX0nNTHbQqZfHRl2riTpJglQyGJ2hRJXyIMZPY=.4de7ac6d-dd84-4943-bab1-5dba67bf5cf0@github.com> Message-ID: On Tue, 9 Apr 2024 01:19:18 GMT, Leonid Mesnik wrote: >> The internal JVM TI `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. 
This enhancement is to refactor JVM TI functions `PopFrame` and `ForceEarlyReturn` on the base of `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes. >> >> Testing: >> >> Ran mach5 tiers 1-6 > > src/hotspot/share/prims/jvmtiEnvBase.hpp line 508: > >> 506: } >> 507: void do_vthread(Handle target_h) { >> 508: assert(_target_jt != nullptr, "sanity check"); > > Better to test that target_h is the same as _target_jt. Thanks. The `_target_jt` is a `JavaThread*` while the `target_h` is a handle of thread oop. Added the following assert: `assert(_target_jt->vthread() == target_h(), "sanity check");` ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18570#discussion_r1558372501 From sgibbons at openjdk.org Tue Apr 9 23:48:09 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Tue, 9 Apr 2024 23:48:09 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v6] In-Reply-To: <2X2qG_TCmbIfhM4CCepi7PHttQGFuMXlLgea1Yq15uc=.3d4bdee1-2eed-4df9-bcb4-f08bf8060119@github.com> References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> <9BH6kkaQU5kSjlJUnNenUeWBK2EdahCuks8qEUjDlv0=.b8979589-32df-4fa3-b5a6-f56dad76c58d@github.com> <2X2qG_TCmbIfhM4CCepi7PHttQGFuMXlLgea1Yq15uc=.3d4bdee1-2eed-4df9-bcb4-f08bf8060119@github.com> Message-ID: On Sun, 7 Apr 2024 05:14:08 GMT, Francesco Nigro wrote: >> I went ahead and tried a pure-Java implementation, and it is faster for small sizes (up to 8) and only about 1.5x slower for larger sizes, so that might make for an interesting fallback if there is no customized assembler implementation available or if the size is known to be small. >> >> Ideally, I think we would want C2 to be more aware of setMemory stores, so that it can remove redundant stores, like it does with InitializeNode. > > @dean-long in my old PR I have done the same, choosing a (not yet) configurable cutoff value.
> > See https://github.com/openjdk/jdk/pull/16760 As an experiment I added the Java code that @franz1981 supplied and ran performance vs. the intrinsic stub. I used 128 bytes as the cutoff value as in that code. I saw about 0.75 to 1 ns improvement for sizes of 1 or 2 bytes only. Anything larger and the stub performed better. @mcimadamore Is there any way to disable some of the optimizations C2 will attempt on the IR? We need to maintain atomicity, so vectorization shouldn't occur, for instance. This seems like a rat-hole that would need constant maintenance as C2 optimizations get better. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18555#issuecomment-2046208254 From kbarrett at openjdk.org Wed Apr 10 00:54:17 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 10 Apr 2024 00:54:17 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: References: Message-ID: On Tue, 9 Apr 2024 19:20:22 GMT, Kim Barrett wrote: >> Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: >> >> version check not needed anymore > > src/hotspot/share/utilities/globalDefinitions_gcc.hpp line 36: > >> 34: #if defined(_AIX) >> 35: #include >> 36: #endif > > I would much rather see this include added in the few places it was actually needed, rather than being > added here. Do we even need to include ? From the Linux man page for alloca: By necessity, alloca() is a compiler built-in, also known as __builtin_alloca(). By default, modern compilers automatically translate all uses of alloca() into the built-in, but this is forbidden if standards conformance is requested (-ansi, -std=c*), in which case is required, lest a symbol dependency be emitted. There are uses of it in shared code where there isn't an applicable include, other than from globalDefinitions_xlc.hpp.
So it appears all other supported compilers do treat it as a built-in with the options we are providing, and don't need the include. Maybe that's true for the new xlc compiler too? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1558565268 From dlong at openjdk.org Wed Apr 10 01:52:09 2024 From: dlong at openjdk.org (Dean Long) Date: Wed, 10 Apr 2024 01:52:09 GMT Subject: RFR: 8329750: Change Universe functions to return more specific Klass* types [v2] In-Reply-To: References: <1AATJlet-Nur9h4V9L8OPk7PDQkGSvCE8P2UAZjFgl8=.48823213-7b50-4b12-a900-ab09236b68ee@github.com> Message-ID: On Mon, 8 Apr 2024 14:02:42 GMT, Coleen Phillimore wrote: >> Hmm. Having two `array_klass` calls were not intentional. I accepted Dean's suggestion in the GitHub UI, but that didn't remove the old `array_klass`. I think I'll revert that change given that it is not important to devirtualize this. > > Oh good because I was going to need a lot more coffee to understand why there was a second call. Thanks. Why devirtualize elsehwere but not here? Maybe it's not a big deal. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18652#discussion_r1558671690 From dlong at openjdk.org Wed Apr 10 01:59:10 2024 From: dlong at openjdk.org (Dean Long) Date: Wed, 10 Apr 2024 01:59:10 GMT Subject: RFR: 8329750: Change Universe functions to return more specific Klass* types [v3] In-Reply-To: References: Message-ID: On Mon, 8 Apr 2024 13:46:32 GMT, Stefan Karlsson wrote: >> We have various functions in Universe that returns Klass* where they could be returning TypeArrayKlass* and ObjArrayKlass* instead. If we change these functions we could get rid of some casts in the code. Does this seem like a reasonable change? 
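[Editorial sketch, not part of the original mail.] The change proposed in the quoted description above, returning a more specific type from an accessor so that callers no longer need a cast, can be illustrated roughly as follows. This is a minimal sketch under stated assumptions: `DemoKlass`, `DemoTypeArrayKlass`, and the accessor names are hypothetical stand-ins and are not the real HotSpot `Universe` or `Klass` declarations.

```cpp
#include <cassert>

// Hypothetical stand-ins for a Klass-like hierarchy; NOT the HotSpot classes.
struct DemoKlass {
  virtual ~DemoKlass() = default;
  virtual bool is_type_array() const { return false; }
};

struct DemoTypeArrayKlass : DemoKlass {
  int element_size;
  explicit DemoTypeArrayKlass(int size) : element_size(size) {}
  bool is_type_array() const override { return true; }

  // Checked downcast helper (hypothetical), needed only when the caller
  // holds a base-class pointer.
  static DemoTypeArrayKlass* cast(DemoKlass* k) {
    assert(k->is_type_array() && "not a type array klass");
    return static_cast<DemoTypeArrayKlass*>(k);
  }
};

static DemoTypeArrayKlass the_int_array_klass(4);

// Before: the accessor returns the base type, so callers must cast.
DemoKlass* int_array_klass_untyped() { return &the_int_array_klass; }

// After: the accessor returns the specific type, so no cast is needed.
DemoTypeArrayKlass* int_array_klass_typed() { return &the_int_array_klass; }

int element_size_before() {
  return DemoTypeArrayKlass::cast(int_array_klass_untyped())->element_size;
}

int element_size_after() {
  return int_array_klass_typed()->element_size;
}
```

Both helpers return the same value; the second simply avoids the cast at the call site, which is the kind of cleanup the description asks about.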
> > Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: > > Revert "Update src/hotspot/share/classfile/systemDictionary.cpp > " > > This reverts commit d36f650dc3bf9729cd8bd138d23bef3dfdb8e4d2. > Thanks for the reviews! Dean, I reverted the suggestion to go with the typed TypeArrayKlass given that it had no visible effects on inlining. If you still want it I can fix it in separate commit. /integrate If there's no benefit, then it would just be for consistency with the other changes. It's not a big deal though. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18652#issuecomment-2046318919 From dlong at openjdk.org Wed Apr 10 01:59:11 2024 From: dlong at openjdk.org (Dean Long) Date: Wed, 10 Apr 2024 01:59:11 GMT Subject: RFR: 8329750: Change Universe functions to return more specific Klass* types [v3] In-Reply-To: References: Message-ID: On Mon, 8 Apr 2024 07:32:58 GMT, Stefan Karlsson wrote: >> src/hotspot/share/classfile/systemDictionary.cpp line 371: >> >>> 369: } else { >>> 370: k = Universe::typeArrayKlass(t); >>> 371: k = k->array_klass(ndims, CHECK_NULL); >> >> I assume the cast was an attempt to de-virtualize the array_klass() call, so it is better not to use Klass* here. > > My experience is that these type of casts doesn't make the compiler devirtualize the calls. I tried it now and verified that both with and without the cast we still get the virtual call. You typically need to tell the compiler what function it should be using. (I played around a lot with this when writing the devirtualization layer for the oop_iterate/OopIterateClosure code.) > > I tested writing the code above as `TypeArrayKlass::cast(k)->TypeArrayKlass::array_klass(ndims, CHECK_NULL)` and that gets rid of the virtual call. However, the compiler still can't inline the code ArrayKlass::array_klass code because it is inside a .cpp file and not an .inline.hpp, so this results in a direct call instead of inlined code. 
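[Editorial sketch, not part of the original mail.] The point made in the quoted comment above, that casting the pointer alone does not remove virtual dispatch while a class-qualified call does, is standard C++ behaviour and can be sketched as below. The class and function names (`DemoArrayKlass`, `DemoTypedArrayKlass`, `DemoSubKlass`, `call_virtual`, `call_qualified`) are hypothetical and only mirror the shape of the discussion; they are not HotSpot code.

```cpp
#include <cassert>

// Hypothetical hierarchy used only to show dispatch behaviour.
struct DemoArrayKlass {
  virtual ~DemoArrayKlass() = default;
  virtual int array_klass(int ndims) { return ndims; }
};

struct DemoTypedArrayKlass : DemoArrayKlass {
  int array_klass(int ndims) override { return 100 + ndims; }
};

// A further subclass: its existence is why the compiler generally cannot
// assume which override runs when all it sees is a DemoTypedArrayKlass*.
struct DemoSubKlass : DemoTypedArrayKlass {
  int array_klass(int ndims) override { return 200 + ndims; }
};

// Ordinary call through a typed pointer: still dispatched virtually,
// so the dynamic type of *k decides which override runs.
int call_virtual(DemoTypedArrayKlass* k, int ndims) {
  return k->array_klass(ndims);
}

// Class-qualified call: virtual dispatch is suppressed and
// DemoTypedArrayKlass::array_klass is called directly.
int call_qualified(DemoTypedArrayKlass* k, int ndims) {
  return k->DemoTypedArrayKlass::array_klass(ndims);
}
```

Given a `DemoSubKlass` object, `call_virtual(&obj, 1)` yields 201 while `call_qualified(&obj, 1)` yields 101, which shows that the qualified form names the function to call rather than relying on the pointer's static type. Whether the direct call is then inlined depends on the definition being visible to the compiler, as the quoted comment notes.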
OK, I guess the compiler needs to be conservative in case TypeArrayKlass has a subclass. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18652#discussion_r1558677084 From dlong at openjdk.org Wed Apr 10 01:59:11 2024 From: dlong at openjdk.org (Dean Long) Date: Wed, 10 Apr 2024 01:59:11 GMT Subject: RFR: 8329750: Change Universe functions to return more specific Klass* types [v2] In-Reply-To: References: <1AATJlet-Nur9h4V9L8OPk7PDQkGSvCE8P2UAZjFgl8=.48823213-7b50-4b12-a900-ab09236b68ee@github.com> Message-ID: On Wed, 10 Apr 2024 01:49:08 GMT, Dean Long wrote: >> Oh good because I was going to need a lot more coffee to understand why there was a second call. Thanks. > > Why devirtualize elsehwere but not here? Maybe it's not a big deal. Or to put it another way, what's the advantage of using `TypeArrayKlass* tak` in similar situations below? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18652#discussion_r1558679785 From sspitsyn at openjdk.org Wed Apr 10 02:34:37 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 10 Apr 2024 02:34:37 GMT Subject: RFR: 8329432: PopFrame and ForceEarlyReturn functions should use JvmtiHandshake [v2] In-Reply-To: <5tcPHZX0nNTHbQqZfHRl2riTpJglQyGJ2hRJXyIMZPY=.4de7ac6d-dd84-4943-bab1-5dba67bf5cf0@github.com> References: <5tcPHZX0nNTHbQqZfHRl2riTpJglQyGJ2hRJXyIMZPY=.4de7ac6d-dd84-4943-bab1-5dba67bf5cf0@github.com> Message-ID: <4pX8bccxXZCq1XNpmOpjY4fRQ6G9TZiv_BYTlw6hIxU=.9c9fda52-ce07-40d3-9528-37c140986fe1@github.com> > The internal JVM TI `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. This enhancement is to refactor JVM TI functions `PopFrame` and `ForceEarlyReturn` on the base of `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes. 
> > Testing: > > Ran mach5 tiers 1-6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: remove self from args; add asserts ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18570/files - new: https://git.openjdk.org/jdk/pull/18570/files/9ca1ea20..d46813da Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18570&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18570&range=00-01 Stats: 10 lines in 2 files changed: 2 ins; 0 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/18570.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18570/head:pull/18570 PR: https://git.openjdk.org/jdk/pull/18570 From dholmes at openjdk.org Wed Apr 10 02:44:10 2024 From: dholmes at openjdk.org (David Holmes) Date: Wed, 10 Apr 2024 02:44:10 GMT Subject: RFR: 8329488: Move OopStorage code from safepoint cleanup and remove safepoint cleanup code [v3] In-Reply-To: References: <-D7-AtzguOYwUaGTikkkbMAZxgwJegnd6waGVCRSnbI=.b4b4476e-cec2-49b8-9d4a-c92757e6c432@github.com> <7vZckLkhEWmlTpRHtBmdIX0EpPaGdDzUD9HHQEoRzQQ=.09e10796-5464-4c3f-bbd5-744a8e4cf7d7@github.com> Message-ID: On Tue, 9 Apr 2024 17:32:07 GMT, Coleen Phillimore wrote: >> Here, if needs_cleanup_requested was true and then we set it to false, we still return true so a concurrent thread setting it to true is ok because we still return 'true'. That is, we need to do some work. If we return false here, while another thread is setting it to true, we'll get it on the next periodic timeout. >> The only reason for the store-release is because we use load-acquire and it's consistent. The comment refers to the old behaviour which was more complicated. I removed the comment to remove the confusion. > > Ok, I think I restated what Kim said. > The only reason for the store-release is because we use load-acquire and it's consistent. So why do we use load-acquire? That is typically to pair with a store-release. 
We only need release semantics if someone seeing this store must see previous stores as well. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18375#discussion_r1558740137 From sspitsyn at openjdk.org Wed Apr 10 02:48:09 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 10 Apr 2024 02:48:09 GMT Subject: RFR: 8329491: GetThreadListStackTraces function should use JvmtiHandshake [v2] In-Reply-To: References: <56L6f8XFyrB_cUSPTLWNIVhO0PU4w3PjRnpA5U7y_aI=.906bf099-af40-4192-a205-f84120e99ec8@github.com> Message-ID: On Tue, 9 Apr 2024 00:56:16 GMT, Leonid Mesnik wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> review: cleanup - removed temporary logging used for debugging > > src/hotspot/share/prims/jvmtiEnvBase.cpp line 2070: > >> 2068: void >> 2069: GetSingleStackTraceClosure::do_thread(Thread *target) { >> 2070: doit(); > > I think it makes sense to check that the target is the same as _target_jt. So we don't call it with arbitrary threads. > or require parameter to be null if you want. > Same for do_vthread. Thank you for suggestion. Added asserts. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18574#discussion_r1558748415 From sspitsyn at openjdk.org Wed Apr 10 03:17:32 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 10 Apr 2024 03:17:32 GMT Subject: RFR: 8329491: GetThreadListStackTraces function should use JvmtiHandshake [v3] In-Reply-To: <56L6f8XFyrB_cUSPTLWNIVhO0PU4w3PjRnpA5U7y_aI=.906bf099-af40-4192-a205-f84120e99ec8@github.com> References: <56L6f8XFyrB_cUSPTLWNIVhO0PU4w3PjRnpA5U7y_aI=.906bf099-af40-4192-a205-f84120e99ec8@github.com> Message-ID: > The internal JVM TI `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. 
This enhancement is to refactor the JVM TI function `GetThreadListStackTraces` on the base of `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes. > > Testing: > - Ran mach5 tiers 1-6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: add some asserts ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18574/files - new: https://git.openjdk.org/jdk/pull/18574/files/8f048d34..3c555b84 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18574&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18574&range=01-02 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18574.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18574/head:pull/18574 PR: https://git.openjdk.org/jdk/pull/18574 From amitkumar at openjdk.org Wed Apr 10 03:36:13 2024 From: amitkumar at openjdk.org (Amit Kumar) Date: Wed, 10 Apr 2024 03:36:13 GMT Subject: RFR: 8310513: [s390x] Intrinsify recursive ObjectMonitor locking [v2] In-Reply-To: References: Message-ID: On Tue, 9 Apr 2024 10:38:24 GMT, Martin Doerr wrote: >> Amit Kumar has updated the pull request incrementally with one additional commit since the last revision: >> >> suggestion from Lutz > > I couldn't spot any bugs. Thanks @TheRealMDoerr and @RealLucy for Review. I ran the test again with `-XX:-TieredCompilation` flag and do not see any new failure there as well. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17975#issuecomment-2046482058 From amitkumar at openjdk.org Wed Apr 10 03:36:13 2024 From: amitkumar at openjdk.org (Amit Kumar) Date: Wed, 10 Apr 2024 03:36:13 GMT Subject: Integrated: 8310513: [s390x] Intrinsify recursive ObjectMonitor locking In-Reply-To: References: Message-ID: On Fri, 23 Feb 2024 05:23:29 GMT, Amit Kumar wrote: > s390 implementation of [JDK-8277180](https://bugs.openjdk.org/browse/JDK-8277180). 
PPC implementation for the same: https://github.com/openjdk/jdk/pull/7305 > > I had tested `tier1` on `fastdebug`, `release` vm. > > BenchMarking: > > > ./build/linux-s390x-server-release/jdk/bin/java -Xms4g -Xmx4g -jar dacapo-9.12-MR1-bach.jar h2 -s huge -t 1 -n 1 > > without patch: > ===== DaCapo 9.12-MR1 h2 PASSED in 223023 msec ===== > ===== DaCapo 9.12-MR1 h2 PASSED in 225686 msec ===== > ===== DaCapo 9.12-MR1 h2 PASSED in 219824 msec ===== > ===== DaCapo 9.12-MR1 h2 PASSED in 226719 msec ===== > > > > with patch: > ===== DaCapo 9.12-MR1 h2 PASSED in 167816 msec ===== > ===== DaCapo 9.12-MR1 h2 PASSED in 174368 msec ===== > ===== DaCapo 9.12-MR1 h2 PASSED in 170517 msec ===== > ===== DaCapo 9.12-MR1 h2 PASSED in 169349 msec ===== This pull request has now been integrated. Changeset: 47df1459 Author: Amit Kumar URL: https://git.openjdk.org/jdk/commit/47df14590c003ccb1607ec0edfe999fcf2aebd86 Stats: 78 lines in 1 file changed: 32 ins; 14 del; 32 mod 8310513: [s390x] Intrinsify recursive ObjectMonitor locking Reviewed-by: lucy, mdoerr ------------- PR: https://git.openjdk.org/jdk/pull/17975 From sspitsyn at openjdk.org Wed Apr 10 03:43:08 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 10 Apr 2024 03:43:08 GMT Subject: RFR: 8329674: JvmtiEnvThreadState::reset_current_location function should use JvmtiHandshake In-Reply-To: References: Message-ID: On Thu, 4 Apr 2024 15:28:41 GMT, Serguei Spitsyn wrote: > The internal JVM TI JvmtiHandshake and JvmtiUnitedHandshakeClosure classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. This enhancement is to refactor the JVM TI internal functions JvmtiEnvThreadState::reset_current_location on the base of JvmtiHandshake and JvmtiUnitedHandshakeClosure classes. > > Testing: > - Ran mach5 tiers 1-6 Thank you for review, Leonid! Refactored code as you suggested. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18630#issuecomment-2046487125 From iklam at openjdk.org Wed Apr 10 04:02:52 2024 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 10 Apr 2024 04:02:52 GMT Subject: RFR: 8329728: Read long lines in ClassListParser [v3] In-Reply-To: References: Message-ID: > Today the `ClassListParser` has a hard-coded limit of 4096 chars for each line in the CDS class list file. However, it's possible for a line to be much longer than that (64KB for the class name, plus extra information that can include path names, IDs, etc). > > I wrote a utility class `LineReader` that automatically allocates a buffer before calling `fgets()`. Hopefully this can be useful for other cases where we call `fgets()` with a fixed buffer size. > > Max line width is limited to 4M to simplify testing (and avoid running into corner cases when we approach INT_MAX). Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: Check class name for valid UTF8 encoding ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18669/files - new: https://git.openjdk.org/jdk/pull/18669/files/034c29b8..05afb6ed Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18669&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18669&range=01-02 Stats: 32 lines in 3 files changed: 26 ins; 0 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/18669.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18669/head:pull/18669 PR: https://git.openjdk.org/jdk/pull/18669 From sspitsyn at openjdk.org Wed Apr 10 04:21:23 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 10 Apr 2024 04:21:23 GMT Subject: RFR: 8329674: JvmtiEnvThreadState::reset_current_location function should use JvmtiHandshake [v2] In-Reply-To: References: Message-ID: > The internal JVM TI JvmtiHandshake and JvmtiUnitedHandshakeClosure classes were introduced in JDK 22 to unify/simplify the JVM TI functions supporting
implementation of the virtual threads. This enhancement is to refactor the JVM TI internal functions JvmtiEnvThreadState::reset_current_location on the base of JvmtiHandshake and JvmtiUnitedHandshakeClosure classes. > > Testing: > - Ran mach5 tiers 1-6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: refactored to get rid of overloaded doit functions ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18630/files - new: https://git.openjdk.org/jdk/pull/18630/files/6071446f..39717f37 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18630&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18630&range=00-01 Stats: 17 lines in 1 file changed: 5 ins; 10 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/18630.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18630/head:pull/18630 PR: https://git.openjdk.org/jdk/pull/18630 From dholmes at openjdk.org Wed Apr 10 05:19:09 2024 From: dholmes at openjdk.org (David Holmes) Date: Wed, 10 Apr 2024 05:19:09 GMT Subject: RFR: 8329728: Read long lines in ClassListParser [v3] In-Reply-To: References: Message-ID: On Wed, 10 Apr 2024 04:02:52 GMT, Ioi Lam wrote: >> Today the `ClassListParser` has a hard-coded limit of 4096 chars for each line in the CDS class list file. However, it's possible for a line to be much longer than than (64KB for the class name, plus extra information that can include path names, IDs, etc). >> >> I wrote a utility class `LineReader` that automatically allocates a buffer before calling `fgets()`. Hopefully this can be useful for other cases where we call `fgets()` with a fixed buffer size. >> >> Max line width is limited to 4M to simplify testing (and avoid running into corner cases when we approach INT_MAX). 
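For readers of this archive, the growing-buffer idea in the description quoted above might look roughly like the sketch below. It is not the `LineReader` from the PR: the class name `GrowingLineReader`, the initial size and the doubling policy are all assumptions made for illustration. The one detail it tries to show is the size check being done before the doubling, so that the arithmetic never relies on signed overflow.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for the LineReader described above; NOT the HotSpot class.
class GrowingLineReader {
  FILE* _file;
  char* _buffer;
  int   _buffer_len;

  // Doubles the buffer, clamped to MAX_LEN. The comparison happens before the
  // multiplication, so the doubling itself can never overflow an int.
  bool grow() {
    if (_buffer_len >= MAX_LEN) return false;              // line too long
    int new_len = (_buffer_len > MAX_LEN / 2) ? MAX_LEN : _buffer_len * 2;
    char* p = static_cast<char*>(realloc(_buffer, static_cast<size_t>(new_len)));
    if (p == nullptr) return false;                        // out of memory
    _buffer = p;
    _buffer_len = new_len;
    return true;
  }

 public:
  static constexpr int INITIAL_LEN = 16;                   // assumed, kept small for testing
  static constexpr int MAX_LEN = 4 * 1024 * 1024;          // the 4M limit from the description

  explicit GrowingLineReader(FILE* file)
    : _file(file), _buffer(nullptr), _buffer_len(0) {}
  GrowingLineReader(const GrowingLineReader&) = delete;
  GrowingLineReader& operator=(const GrowingLineReader&) = delete;
  ~GrowingLineReader() { free(_buffer); }

  // Returns the next line without its trailing '\n', or nullptr on EOF,
  // allocation failure, or a line longer than MAX_LEN. The returned string
  // is valid until the next call to read_line().
  char* read_line() {
    if (_buffer == nullptr) {
      _buffer = static_cast<char*>(malloc(INITIAL_LEN));
      if (_buffer == nullptr) return nullptr;
      _buffer_len = INITIAL_LEN;
    }
    int line_len = 0;
    while (fgets(_buffer + line_len, _buffer_len - line_len, _file) != nullptr) {
      line_len += static_cast<int>(strlen(_buffer + line_len));
      if (line_len > 0 && _buffer[line_len - 1] == '\n') {
        _buffer[line_len - 1] = '\0';                      // complete line
        return _buffer;
      }
      if (line_len < _buffer_len - 1) {
        return _buffer;                                    // EOF inside an unterminated last line
      }
      if (!grow()) return nullptr;                         // buffer full: enlarge and keep reading
    }
    return line_len > 0 ? _buffer : nullptr;
  }
};
```

A caller would loop on `read_line()` until it returns `nullptr`; whether an over-long line should be reported as an error, as the PR does, is left out of this sketch.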
> > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > Check class name for valid UTF8 encoding A few suggestions and a few overflow issues that need fixing. Thanks. src/hotspot/share/cds/classListParser.cpp line 179: > 177: _token = nullptr; > 178: _line_len = 0; > 179: error("Input line too long"); // will exit JVM Can you print the line number and/or line length? src/hotspot/share/cds/classListParser.cpp line 465: > 463: err = "class name too long"; > 464: } else { > 465: int len = (int)strlen(class_name); Suggestion: save the length first so you don't have to calculate it twice. src/hotspot/share/utilities/lineReader.cpp line 41: > 39: _buffer_len = 0; > 40: init(file); > 41: } Style nit: Shouldn't we be using initializer lists rather than the constructor body for these simple initializations? src/hotspot/share/utilities/lineReader.cpp line 90: > 88: // _buffer_len will stop at MAX_LEN, so we will never be able to read more than > 89: // MAX_LEN chars for a single input line. > 90: assert(line_len >= 0 && new_len >= 0 && (line_len + new_len) >= 0, "no int overflow"); The overflow test is relying on UB you need to check subtraction from INT_MAX src/hotspot/share/utilities/lineReader.cpp line 110: > 108: return _buffer; > 109: } > 110: int new_len = _buffer_len * 2; Again UB on the overflow check. MAX_LEN should be set so that doubling of the current size will always hit max_len so you can't overflow. src/hotspot/share/utilities/lineReader.hpp line 36: > 34: // MAX_LEN is currently 4M. This should be enough for any practical use > 35: // of text-based input files for HotSpot. Don't use LineReader if it's > 36: // possible for valid lines to be longer than this limit. Should be easy enough to make this configurable if needed in the future. src/hotspot/share/utilities/lineReader.hpp line 37: > 35: // of text-based input files for HotSpot. 
Don't use LineReader if it's > 36: // possible for valid lines to be longer than this limit. > 37: class LineReader : public StackObj { Suggestion: LineReader should also track the line count. src/hotspot/share/utilities/lineReader.hpp line 61: > 59: // When successful, a non-null value is returned. The caller is free to read or modify this > 60: // string (up to the terminating \0 character) until the next call to read_line(), or until the > 61: // LineReader is destructed. s/destructed/destroyed/ ------------- Changes requested by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18669#pullrequestreview-1990789505 PR Review Comment: https://git.openjdk.org/jdk/pull/18669#discussion_r1558838077 PR Review Comment: https://git.openjdk.org/jdk/pull/18669#discussion_r1558839884 PR Review Comment: https://git.openjdk.org/jdk/pull/18669#discussion_r1558841905 PR Review Comment: https://git.openjdk.org/jdk/pull/18669#discussion_r1558849350 PR Review Comment: https://git.openjdk.org/jdk/pull/18669#discussion_r1558851648 PR Review Comment: https://git.openjdk.org/jdk/pull/18669#discussion_r1558853079 PR Review Comment: https://git.openjdk.org/jdk/pull/18669#discussion_r1558853376 PR Review Comment: https://git.openjdk.org/jdk/pull/18669#discussion_r1558853942 From eosterlund at openjdk.org Wed Apr 10 05:22:00 2024 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Wed, 10 Apr 2024 05:22:00 GMT Subject: RFR: 8329488: Move OopStorage code from safepoint cleanup and remove safepoint cleanup code [v3] In-Reply-To: References: <-D7-AtzguOYwUaGTikkkbMAZxgwJegnd6waGVCRSnbI=.b4b4476e-cec2-49b8-9d4a-c92757e6c432@github.com> <7vZckLkhEWmlTpRHtBmdIX0EpPaGdDzUD9HHQEoRzQQ=.09e10796-5464-4c3f-bbd5-744a8e4cf7d7@github.com> Message-ID: <26iDzUx03Fbd_daAG42xdf32HuHIGidWU5RqM_aumQ4=.d8475739-1e00-4a51-919f-87888100e957@github.com> On Wed, 10 Apr 2024 02:41:01 GMT, David Holmes wrote: >> Ok, I think I restated what Kim said. 
> >> The only reason for the store-release is because we use load-acquire and it's consistent. > > So why do we use load-acquire? That is typically to pair with a store-release. We only need release semantics if someone seeing this store must see previous stores as well. Ah, okay. Then this seems fine. Maybe tweak or remove the comment to reflect this conversation. It seems to be saying the opposite of what it is doing. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18375#discussion_r1558857602 From stefank at openjdk.org Wed Apr 10 05:34:09 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 10 Apr 2024 05:34:09 GMT Subject: RFR: JDK-8329605: hs errfile generic events - introduce sections for Frequent/NotFrequent Events [v3] In-Reply-To: <3oKW4EY92lYqg3_IPS3HmYN-rPr6SH12u8RnAUBiHjo=.8922bd3f-aaa5-42fa-90a9-5362b457f394@github.com> References: <5GN6AKI0ud3DgU7-RX2-12eu87Me8jhzKXA-L8BwR04=.384ddd36-1a8f-40ac-9387-5d8d97c37fe3@github.com> <3oKW4EY92lYqg3_IPS3HmYN-rPr6SH12u8RnAUBiHjo=.8922bd3f-aaa5-42fa-90a9-5362b457f394@github.com> Message-ID: On Tue, 9 Apr 2024 16:17:33 GMT, Lutz Schmidt wrote: > LGTM - given the scope of this PR. In general, I don't like the event log to be split into multiple streams being printed separately. Yes, separate sections prevent displacement of events by other, too verbose, events. On the other hand, time coherence is lost or has to be manually re-established by the support engineer. Often enough, an issue can only be understood when seeing multiple/all events in timely order. > > Merging the event sections at print time by timestamp would be a helpful enhancement. FWIW, I think I am of the opposite opinion. I find it very helpful to have the events separated into distinct sections and wouldn't want them all combined into one big section. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18626#issuecomment-2046567845 From ccheung at openjdk.org Wed Apr 10 05:40:09 2024 From: ccheung at openjdk.org (Calvin Cheung) Date: Wed, 10 Apr 2024 05:40:09 GMT Subject: RFR: 8329728: Read long lines in ClassListParser [v3] In-Reply-To: References: Message-ID: On Wed, 10 Apr 2024 04:02:52 GMT, Ioi Lam wrote: >> Today the `ClassListParser` has a hard-coded limit of 4096 chars for each line in the CDS class list file. However, it's possible for a line to be much longer than than (64KB for the class name, plus extra information that can include path names, IDs, etc). >> >> I wrote a utility class `LineReader` that automatically allocates a buffer before calling `fgets()`. Hopefully this can be useful for other cases where we call `fgets()` with a fixed buffer size. >> >> Max line width is limited to 4M to simplify testing (and avoid running into corner cases when we approach INT_MAX). > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > Check class name for valid UTF8 encoding Updates look good. One suggestion below. ------------- Marked as reviewed by ccheung (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18669#pullrequestreview-1990799317 From ccheung at openjdk.org Wed Apr 10 05:40:09 2024 From: ccheung at openjdk.org (Calvin Cheung) Date: Wed, 10 Apr 2024 05:40:09 GMT Subject: RFR: 8329728: Read long lines in ClassListParser [v2] In-Reply-To: References: Message-ID: On Tue, 9 Apr 2024 17:22:47 GMT, Ioi Lam wrote: >> Today the `ClassListParser` has a hard-coded limit of 4096 chars for each line in the CDS class list file. However, it's possible for a line to be much longer than than (64KB for the class name, plus extra information that can include path names, IDs, etc). >> >> I wrote a utility class `LineReader` that automatically allocates a buffer before calling `fgets()`. 
Hopefully this can be useful for other cases where we call `fgets()` with a fixed buffer size. >> >> Max line width is limited to 4M to simplify testing (and avoid running into corner cases when we approach INT_MAX). > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > @matias9927 and @calvinccheung comments - limit line to 4M. Added gtest cases. Test for class names > 64K src/hotspot/share/utilities/lineReader.hpp line 43: > 41: bool _is_oom; > 42: public: > 43: static const int MAX_LEN = 4 * 1024 * 1024; How about `4 * M` instead of `4 * 1024 * 1024`? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18669#discussion_r1558847183 From bulasevich at openjdk.org Wed Apr 10 06:32:13 2024 From: bulasevich at openjdk.org (Boris Ulasevich) Date: Wed, 10 Apr 2024 06:32:13 GMT Subject: RFR: 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments [v12] In-Reply-To: References: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> Message-ID: On Mon, 8 Apr 2024 06:46:28 GMT, Tobias Hartmann wrote: > Looks good to me. Thank you! ------------- PR Comment: https://git.openjdk.org/jdk/pull/17244#issuecomment-2046623358 From bulasevich at openjdk.org Wed Apr 10 06:32:13 2024 From: bulasevich at openjdk.org (Boris Ulasevich) Date: Wed, 10 Apr 2024 06:32:13 GMT Subject: Integrated: 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments In-Reply-To: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> References: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> Message-ID: On Wed, 3 Jan 2024 14:19:10 GMT, Boris Ulasevich wrote: > These changes clean up the logic and the code of allocating codecache segments and add more testing of it, to open a door for further optimization of code cache segmentation. 
The goal was to keep the behavior as close to the existing behavior as possible, even if it's not quite logical. > > Also, these changes better account for alignment - PrintFlagsFinal shows the final aligned segment sizes, and the segments fill the ReservedCodeCacheSize without gaps caused by alignment. This pull request has now been integrated. Changeset: d037a597 Author: Boris Ulasevich URL: https://git.openjdk.org/jdk/commit/d037a597a94edf6e716098b88f42f2b15518e2bd Stats: 325 lines in 5 files changed: 175 ins; 99 del; 51 mod 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments Reviewed-by: thartmann ------------- PR: https://git.openjdk.org/jdk/pull/17244 From dlong at openjdk.org Wed Apr 10 06:34:09 2024 From: dlong at openjdk.org (Dean Long) Date: Wed, 10 Apr 2024 06:34:09 GMT Subject: RFR: 8325469: Freeze/Thaw code can crash in the presence of OSR frames In-Reply-To: References: Message-ID: <1uoXTvN0rXR3T_34lROg1IL_hbhkcnew0gqMm0N0D1U=.f78a51ee-aeae-45ce-adcd-4365e89357ea@github.com> On Mon, 8 Apr 2024 14:12:45 GMT, Patricio Chilano Mateo wrote: > Since this is used in the thaw fast path too I wanted the avoid the extra load of constMethod if possible, but I think either case is fine. Moving _is_unlinked to where the other booleans are defined actually keeps the size of the nmethod same as before (368 bytes). What do you think? Can you do a performance measurement to see if the extra load actually makes a difference. I think @vnkozlov is also doing nmethod field reordering/compaction, so the relative overhead of an extra field might not remain 0. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18637#issuecomment-2046627106 From mbaesken at openjdk.org Wed Apr 10 07:14:05 2024 From: mbaesken at openjdk.org (Matthias Baesken) Date: Wed, 10 Apr 2024 07:14:05 GMT Subject: RFR: JDK-8329605: hs errfile generic events - introduce sections for Frequent/NotFrequent Events [v3] In-Reply-To: References: <5GN6AKI0ud3DgU7-RX2-12eu87Me8jhzKXA-L8BwR04=.384ddd36-1a8f-40ac-9387-5d8d97c37fe3@github.com> Message-ID: On Tue, 9 Apr 2024 07:31:23 GMT, Matthias Baesken wrote: >> Currently the 'generic' hs_errfile Events message log (filled by Events::log) is rather flooded by messages for memory protection operations. Those seem to occur quite often and move out other less frequent events, because the number of entries in the log is limited. >> It might be better to separate the frequent and less frequent events into 2 sections. The memory protection events would go into the frequent events section. >> The mentioned memory protection operations related entries look like this : >> Event: 0.178 Protecting memory [0x000000016ebf0000,0x000000016ebfc000] with protection modes 0 > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > adjust typo Hi Lucy, thanks for the review ! May I have a second one please ? > FWIW, I think I am of the opposite opinion. I find it very helpful to have the events separated into distinct sections and wouldn't > want them all combined into one big section. Yeah, both approaches (one single big log, or multiple ones) have pros and cons. But having multiple logs is a long established HS approach so I do not think that it is in scope of this PR to change the established approach. 
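To make the trade-off in this sub-thread concrete, here is a small sketch, assuming two bounded logs in which the oldest entry is dropped when a log is full. The names `BoundedLog` and `merged_by_time` are hypothetical and have nothing to do with the actual `Events` code in HotSpot. It shows both points raised above: separate sections keep a rare event from being displaced by frequent ones, and the sections can still be merged by timestamp at print time if a single time-ordered view is wanted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical event record: a timestamp in seconds plus a message.
struct Event {
  double time;
  std::string msg;
};

// Hypothetical bounded log: once full, the oldest entry is dropped.
class BoundedLog {
  size_t _capacity;
  std::deque<Event> _events;
 public:
  explicit BoundedLog(size_t capacity) : _capacity(capacity) {}

  void log(double time, const std::string& msg) {
    if (_capacity == 0) return;                 // nothing can be stored
    if (_events.size() == _capacity) {
      _events.pop_front();                      // displace the oldest entry
    }
    _events.push_back({time, msg});
  }

  const std::deque<Event>& events() const { return _events; }
};

// Merges two logs into one time-ordered list at "print time".
// Assumes each log was filled in increasing timestamp order.
std::vector<Event> merged_by_time(const BoundedLog& a, const BoundedLog& b) {
  std::vector<Event> out;
  std::merge(a.events().begin(), a.events().end(),
             b.events().begin(), b.events().end(),
             std::back_inserter(out),
             [](const Event& x, const Event& y) { return x.time < y.time; });
  return out;
}
```

With a single shared log of the same capacity, the one rare event would have been pushed out by the frequent ones; with two logs it survives, at the cost of an extra merge step for anyone who wants one timeline.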
------------- PR Comment: https://git.openjdk.org/jdk/pull/18626#issuecomment-2046684451 From rehn at openjdk.org Wed Apr 10 07:28:10 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Wed, 10 Apr 2024 07:28:10 GMT Subject: RFR: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 [v5] In-Reply-To: References: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> Message-ID: <5Wl7pdAZc7DCkDifVTrrMuPIrRJ1uoCgqdaM2HJXZlQ=.7e43d9bf-8f37-4d2b-99c7-bae522cfaeab@github.com> On Tue, 9 Apr 2024 22:00:22 GMT, David Holmes wrote: >> The crux of the problem here is that the virtual thread code was not keeping the held-monitor-count and jni-monitor-count in sync under all conditions. So if a vthread acquired a monitor via JNI but failed to unlock it before terminating, the underlying platform thread's counts were out of sync and if it terminated we would trigger the assertion that checks for such things. >> >> The actual fix is very simple: we zero the platform thread's jni-monitor-count in `continuation_enter_cleanup` the same way we zero the held-monitor-count. In addition we apply the same `CheckJNICalls` check for this unbalanced locking and issue a warning in the virtual thread case. The fact that this happens in asm code complicates matters. >> >> The existing `JNIMonitor.java` test is greatly expanded to test these scenarios and check the unified logging output. >> >> Other minor changes involve expanding some of the other assertions relating to the two counts so we can detect a mismatch earlier without a need for the thread to terminate. And the test that originally uncovered the problem (`GetOwnedMonitorInfoTest.java`) has some minor adjustments to enhance diagnostics. >> >> I've provided the fix for all architectures that support continuations: x64, aarch64, riscv and ppc. The latter both build okay in GHA but I can't actually test them with the updated test. So some assistance from RISCV folk (@robehn ?) and PPC folk (??) would be appreciated (otherwise any issues will have to be handled as follow-up fixes).
So some assistance from RISCV folk (@robehn ?) and PPC folk (??) would be appreciated (otherwise any issues will have to be handled as follow up fixes >> >> The changes are structured so that there is no extra code executed in product builds unless `CheckJNICalls` is set. This means that product builds will not keep the JNI count in sync with the held count, unless `CheckJNICalls` is set. This could trip up a future logging entry or explicit check of the JNI count, but it is expected that these counts will be removed once ObjectMonitor usage will not force virtual thread pinning. >> >> Testing: >> - regression test 10x on all x64 and aarch64 platforms >> - tiers 1-4 >> - GHA >> >> >> Thanks to @pchilano for help working out the best form of the fix and the initial asm for x64. >> >> Thanks to @fbredber for the Aarch64 and RISCV asm code. >> >> Thanks > > David Holmes has updated the pull request incrementally with one additional commit since the last revision: > > Cleanup test leftovers Marked as reviewed by rehn (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18445#pullrequestreview-1990970659 From aboldtch at openjdk.org Wed Apr 10 07:59:19 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Wed, 10 Apr 2024 07:59:19 GMT Subject: RFR: 8326957: Implementation of JEP 474: ZGC: Generational Mode by Default Message-ID: This is the implementation task for `JEP 474: ZGC: Generational Mode by Default`. See the JEP for details. 
[JDK-8326667](https://bugs.openjdk.org/browse/JDK-8326667) ------------- Commit messages: - Merge tag 'jdk-23+17' into JDK-8326957 - Merge tag 'jdk-23+16' into JDK-8326957 - Update VMDeprecatedOptions.java test - 8326957: Implementation of Deprecate Non-Generational ZGC Changes: https://git.openjdk.org/jdk/pull/18393/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18393&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8326957 Stats: 107 lines in 7 files changed: 105 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/18393.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18393/head:pull/18393 PR: https://git.openjdk.org/jdk/pull/18393 From duke at openjdk.org Wed Apr 10 08:36:36 2024 From: duke at openjdk.org (kuaiwei) Date: Wed, 10 Apr 2024 08:36:36 GMT Subject: RFR: 8325821: [REDO] use "dmb.ishst+dmb.ishld" for release barrier [v4] In-Reply-To: References: Message-ID: > The original patch for https://bugs.openjdk.org/browse/JDK-8324186 has 2 issues: > 1 It shows a regression on some platforms, such as Apple silicon on macOS > 2 It cannot handle instruction sequences like "dmb.ishld; dmb.ishst; dmb.ishld; dmb.ishld" > > They can be fixed by: > 1 Enabling AlwaysMergeDMB by default, and disabling it only on architectures where we can see a performance improvement (N1 or N2) > 2 Checking for the special pattern and merging the subsequent dmb. > > It also fixes a bug where st/ld/dmb cannot be merged while the code buffer is expanding. I added unit tests for these. > > This patch still has an unhandled case. For instructions like "dmb.ishld; dmb.ishst; dmb.ish", it will merge the last 2 instructions but cannot merge all three, because merging all previous dmbs when emitting dmb.ish would shrink the code buffer. I think that may break some assumptions, and I think it's not a common pattern. > > - Update: > After discussion, I made a new implementation based on a finite state machine for merging instructions. A mergeable instruction stays pending in the FSM until the next unmergeable instruction.
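The "pending until the next unmergeable instruction" idea from the update above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the code in the PR: `BarrierMerger`, the `Barrier` flags and the string output are all hypothetical, and the real assembler works on encoded instructions in a code buffer rather than on strings. The model collapses repeated barriers and lets a pending full `dmb ish` subsume the weaker ones; it deliberately does not promote `ishld` plus `ishst` to `ish`, which the real patch controls through `AlwaysMergeDMB`.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical barrier kinds as bit flags, so that "merging" is a bitwise OR.
enum Barrier : uint8_t { NONE = 0, ISHLD = 1, ISHST = 2, ISH = 4 };

// Toy model: barriers stay pending until some other instruction is emitted.
class BarrierMerger {
  uint8_t _pending = NONE;
  std::vector<std::string> _out;

  // Emits the minimal set of barriers covering everything that is pending.
  void flush() {
    if (_pending == NONE) return;
    if (_pending & ISH) {
      _out.push_back("dmb ish");               // full barrier subsumes ishld/ishst
    } else {
      if (_pending & ISHLD) _out.push_back("dmb ishld");
      if (_pending & ISHST) _out.push_back("dmb ishst");
    }
    _pending = NONE;
  }

 public:
  // Mergeable instruction: only recorded, nothing is emitted yet.
  void emit_barrier(Barrier b) {
    _pending = static_cast<uint8_t>(_pending | b);
  }

  // Unmergeable instruction: pending barriers are emitted first.
  void emit_other(const std::string& insn) {
    flush();
    _out.push_back(insn);
  }

  // End of the instruction stream: flush whatever is still pending.
  const std::vector<std::string>& finish() {
    flush();
    return _out;
  }
};
```

Because nothing is written until the barriers are flushed, this model never has to shrink already-emitted code, which is the difficulty the description mentions for the "dmb.ishld; dmb.ishst; dmb.ish" sequence.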
kuaiwei has updated the pull request incrementally with one additional commit since the last revision: Simplify code ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18467/files - new: https://git.openjdk.org/jdk/pull/18467/files/fe4f4f20..1a49c60c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18467&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18467&range=02-03 Stats: 161 lines in 14 files changed: 12 ins; 82 del; 67 mod Patch: https://git.openjdk.org/jdk/pull/18467.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18467/head:pull/18467 PR: https://git.openjdk.org/jdk/pull/18467 From rrich at openjdk.org Wed Apr 10 09:04:04 2024 From: rrich at openjdk.org (Richard Reingruber) Date: Wed, 10 Apr 2024 09:04:04 GMT Subject: RFR: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 [v5] In-Reply-To: References: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> Message-ID: On Tue, 9 Apr 2024 22:00:22 GMT, David Holmes wrote: >> The crux of the problem here is that the virtual thread code was not keeping the held-monitor-count and jni-monitor-count in sync under all conditions. So if a vthread acquired a monitor via JNI but failed to unlock it before terminating, the underlying platform thread's counts were out of sync and if it terminated we would trigger the assertion that checks for such things. >> >> The actual fix is very simple: we zero the platform thread's jni-monitor-count in `continuation_enter_cleanup` the same way we zero the held-monitor-count. In addition we apply the same `CheckJNICalls` check for this unbalanced locking and issue a warning in the virtual thread case. The fact that this happens in asm code complicates matters. >> >> The existing `JNIMonitor.java` test is greatly expanded to test these scenarios and check the unified logging output.
>> >> Other minor changes involve expanding some of the other assertions relating to the two counts so we can detect a mismatch earlier without a need for the thread to terminate. And the test that originally uncovered the problem (`GetOwnedMonitorInfoTest.java`) has some minor adjustments to enhance diagnostics. >> >> I've provided the fix for all architectures that support continuations: x64, aarch64, riscv and ppc. The latter both build okay in GHA but I can't actually test them with the updated test. So some assistance from RISCV folk (@robehn ?) and PPC folk (??) would be appreciated (otherwise any issues will have to be handled as follow-up fixes). >> >> The changes are structured so that there is no extra code executed in product builds unless `CheckJNICalls` is set. This means that product builds will not keep the JNI count in sync with the held count, unless `CheckJNICalls` is set. This could trip up a future logging entry or explicit check of the JNI count, but it is expected that these counts will be removed once ObjectMonitor usage will not force virtual thread pinning. >> >> Testing: >> - regression test 10x on all x64 and aarch64 platforms >> - tiers 1-4 >> - GHA >> >> >> Thanks to @pchilano for help working out the best form of the fix and the initial asm for x64. >> >> Thanks to @fbredber for the Aarch64 and RISCV asm code. >> >> Thanks > > David Holmes has updated the pull request incrementally with one additional commit since the last revision: > > Cleanup test leftovers Testing didn't bring up any issues. I've only got a few minor comments. Thanks, Richard. src/hotspot/cpu/ppc/sharedRuntime_ppc.cpp line 1: > 1: /* The copyright header needs an update. src/hotspot/cpu/ppc/sharedRuntime_ppc.cpp line 1673: > 1671: // Save return value potentially containing the exception oop > 1672: Register ex_oop = R15_esp; // nonvolatile register > 1673: __ mr(ex_oop, R3_RET); Please add `R15_esp` to the Kills section above in the header comment.
src/hotspot/cpu/ppc/sharedRuntime_ppc.cpp line 1675: > 1673: __ mr(ex_oop, R3_RET); > 1674: __ call_VM_leaf(CAST_FROM_FN_PTR(address, SharedRuntime::log_jni_monitor_still_held)); > 1675: // Restore potentional return value Nit Suggestion: // Restore potential return value src/hotspot/cpu/ppc/sharedRuntime_ppc.cpp line 1680: > 1678: // For vthreads we have to explicitly zero the JNI monitor count of the carrier > 1679: // on termination. The held count is implicitly zeroed below when we restore from > 1680: // the parent held count (which has to be zero). This comment is not quite correct or a little imprecise, I found. > the parent held count (which has to be zero) I think technically the held count (_held_monitor_count) could be non-zero. Of course it would likely be bad if it was (holding a monitor while suspended is usually not good). I thought it was like this: The JNI monitor count of the carrier thread is required to be zero when switching to the virtual thread. Here we are switching back to the carrier. We have to restore its JNI monitor count of zero. 
------------- PR Review: https://git.openjdk.org/jdk/pull/18445#pullrequestreview-1991008592 PR Review Comment: https://git.openjdk.org/jdk/pull/18445#discussion_r1559081058 PR Review Comment: https://git.openjdk.org/jdk/pull/18445#discussion_r1559078219 PR Review Comment: https://git.openjdk.org/jdk/pull/18445#discussion_r1559002331 PR Review Comment: https://git.openjdk.org/jdk/pull/18445#discussion_r1559059557 From jkern at openjdk.org Wed Apr 10 09:20:12 2024 From: jkern at openjdk.org (Joachim Kern) Date: Wed, 10 Apr 2024 09:20:12 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: References: Message-ID: On Tue, 9 Apr 2024 16:59:39 GMT, Thomas Stuefe wrote: >> Hi Thomas, `maxDisclaimSize` is of type `unsigned int`; therefore I get the following warning: >> >> os/aix/os_aix.cpp:314:42: error: format specifies type 'unsigned long' but the argument has type 'unsigned int' [-Werror,-Wformat] >> RANGEFMTARGS(p, maxDisclaimSize), >> ^~~~~~~~~~~~~~~ >> >> Should I keep the casts, or change the type of `maxDisclaimSize, numFullDisclaimsNeeded, lastDisclaimSize` to `const unsigned long`? > > I would change them to size_t. Thanks for doing this. Done ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1559121239 From lucy at openjdk.org Wed Apr 10 09:26:10 2024 From: lucy at openjdk.org (Lutz Schmidt) Date: Wed, 10 Apr 2024 09:26:10 GMT Subject: RFR: JDK-8329605: hs errfile generic events - introduce sections for Frequent/NotFrequent Events [v3] In-Reply-To: References: <5GN6AKI0ud3DgU7-RX2-12eu87Me8jhzKXA-L8BwR04=.384ddd36-1a8f-40ac-9387-5d8d97c37fe3@github.com> Message-ID: <7xYVisoX6rnGkOTHxbtz93HIwc_b2FA9ShFrsYY-YH4=.5c217550-946a-4eee-ba97-a4c3dc1ff6ff@github.com> On Wed, 10 Apr 2024 07:10:55 GMT, Matthias Baesken wrote: > ... I do not think that it is in scope of this PR ... Oh no, I didn't want to suggest to do such a change in the scope of this PR. 
And yes, I agree, there are reasons why you would want separate streams. As always, a solution that suits all needs is hard to find. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18626#issuecomment-2047007769 From jkern at openjdk.org Wed Apr 10 09:26:14 2024 From: jkern at openjdk.org (Joachim Kern) Date: Wed, 10 Apr 2024 09:26:14 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: <60xqHKyBKIqrzMqVisUO5M_lQLCNt7OYZ6XcovISOc0=.f4bc36e0-c3f7-4e40-b6b7-69ed46ca37e8@github.com> References: <60xqHKyBKIqrzMqVisUO5M_lQLCNt7OYZ6XcovISOc0=.f4bc36e0-c3f7-4e40-b6b7-69ed46ca37e8@github.com> Message-ID: On Tue, 9 Apr 2024 17:00:56 GMT, Thomas Stuefe wrote: >> Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: >> >> version check not needed anymore > > src/hotspot/os_cpu/aix_ppc/os_aix_ppc.cpp line 440: > >> 438: st->print("pc =" INTPTR_FORMAT " ", (unsigned long)uc->uc_mcontext.jmp_context.iar); >> 439: st->print("lr =" INTPTR_FORMAT " ", (unsigned long)uc->uc_mcontext.jmp_context.lr); >> 440: st->print("ctr=" INTPTR_FORMAT " ", (unsigned long)uc->uc_mcontext.jmp_context.ctr); > > p2i I had tried this, but got following error: .../src/hotspot/os_cpu/aix_ppc/os_aix_ppc.cpp:438:40: error: no matching function for call to 'p2i' st->print("pc =" INTPTR_FORMAT " ", p2i(uc->uc_mcontext.jmp_context.iar)); ^~~ .../src/hotspot/share/utilities/globalDefinitions.hpp:179:17: note: candidate function not viable: no known conversion from 'const unsigned long long' to 'const volatile void *' for 1st argument; take the address of the argument with & inline intptr_t p2i(const volatile void* p) { ^ .../src/hotspot/share/oops/oopsHierarchy.hpp:169:17: note: candidate function not viable: no known conversion from 'const unsigned long long' to 'narrowOop' for 1st argument inline intptr_t p2i(narrowOop o) { ^ ------------- PR Review Comment: 
https://git.openjdk.org/jdk/pull/18536#discussion_r1559128609 From mli at openjdk.org Wed Apr 10 09:29:05 2024 From: mli at openjdk.org (Hamlin Li) Date: Wed, 10 Apr 2024 09:29:05 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v2] In-Reply-To: References: <0cUurmXlMJ_B66Wy1umd2n4r9ve7_Q4WOU0ffMd8s5Y=.bbc93b65-382c-4139-aaec-cb835d94a06e@github.com> Message-ID: On Tue, 9 Apr 2024 20:10:36 GMT, Mikael Vidstedt wrote: > Thank you for the update and for working on this in general. > > I've started working on JDK-8329816, preparing the change for the SLEEF specific part of the change. Specifically, I'm currently planning on including the three SLEEF header files, the README and a legal/sleef.md file in that change. Let me know if you have any thoughts/concerns. > Thanks a lot, that's great news. Please go ahead and integrate the files via JDK-8329816. :) Apart from the performance issue found so far, I have no other concerns. > Also, just for my understanding, would love to understand your thoughts on the future here (I apologize if this was already discussed elsewhere): > > It seems like SLEEF is (sort of) limited to linux at this point (the SLEEF README mentions that "Due to limited test capacities, SLEEF is currently only officially supported on Linux with gcc or llvm/clang." ). That same README does, however, indicate good test coverage on several architectures in addition to aarch64 (including x86_64, PPC, RISC-V). With that in mind, it looks like we could potentially use SLEEF for other architectures on linux in the future? And potentially additional operating systems as well? There is more information at https://sleef.org/compile.xhtml. It seems other platforms could be formally supported in the future, but I'm not sure about it. Maybe others who have more information could comment here. Thanks!
------------- PR Comment: https://git.openjdk.org/jdk/pull/18605#issuecomment-2047008550 From jkern at openjdk.org Wed Apr 10 09:36:10 2024 From: jkern at openjdk.org (Joachim Kern) Date: Wed, 10 Apr 2024 09:36:10 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: References: Message-ID: On Tue, 9 Apr 2024 18:32:04 GMT, Kim Barrett wrote: >> Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: >> >> version check not needed anymore > > src/hotspot/share/utilities/byteswap.hpp line 2: > >> 1: /* >> 2: * Copyright (c) 2023, Google and/or its affiliates. All rights reserved. > > Don't drop the creation year. Done ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1559142128 From jkern at openjdk.org Wed Apr 10 09:43:11 2024 From: jkern at openjdk.org (Joachim Kern) Date: Wed, 10 Apr 2024 09:43:11 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: References: Message-ID: On Wed, 10 Apr 2024 00:51:22 GMT, Kim Barrett wrote: >> src/hotspot/share/utilities/globalDefinitions_gcc.hpp line 36: >> >>> 34: #if defined(_AIX) >>> 35: #include <alloca.h> >>> 36: #endif >> >> I would much rather see this include added in the few places it was actually needed, rather than being >> added here. > > Do we even need to include <alloca.h>? > > From the Linux man page for alloca: > > By necessity, alloca() is a compiler built-in, also known as > __builtin_alloca(). By default, modern compilers automatically > translate all uses of alloca() into the built-in, but this is > forbidden if standards conformance is requested (-ansi, -std=c*), > in which case <alloca.h> is required, lest a symbol dependency be > emitted. > > There are uses of it in shared code where there isn't an applicable include, > other than from globalDefinitions_xlc.hpp.
So it appears all other supported > compilers do treat it as a built-in with the options we are providing, and > don't need the include. Maybe that's true for the new xlc compiler too? If I omit this #include <alloca.h> I get compiler errors of the following kind .../src/hotspot/share/runtime/javaThread.cpp:2222:24: error: use of undeclared identifier 'alloca' char* p1 = (char*) alloca(1); ^ Of course I can do this include in every nagging file, but I thought it is simpler to keep it in the central header. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1559150964 From rehn at openjdk.org Wed Apr 10 09:48:16 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Wed, 10 Apr 2024 09:48:16 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v2] In-Reply-To: References: <0cUurmXlMJ_B66Wy1umd2n4r9ve7_Q4WOU0ffMd8s5Y=.bbc93b65-382c-4139-aaec-cb835d94a06e@github.com> Message-ID: <33a-yu0_4j-ciPcJ_NtgbjHIrganBjKktC3REG8nyOc=.ace97c4f-b738-4049-a92b-60d1e8584e49@github.com> On Wed, 10 Apr 2024 09:24:09 GMT, Hamlin Li wrote: > With that in mind, it looks like we could potentially use SLEEF for other architectures on linux in the future? And potentially additional operating systems as well? Hi Mikael(@vidmik ) ! :) Thanks for looking into the legal stuff! We are pushing for this as we can leverage these changes when adding sleef to risc-v. Cross-fingers about legal!
/Robbin ------------- PR Comment: https://git.openjdk.org/jdk/pull/18605#issuecomment-2047049521 From jkern at openjdk.org Wed Apr 10 09:55:23 2024 From: jkern at openjdk.org (Joachim Kern) Date: Wed, 10 Apr 2024 09:55:23 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v4] In-Reply-To: References: Message-ID: <8JetMf0mf_mYybUtWzB0YwfDi76Qul9d6x8ge58zklc=.207771b4-e02c-4a5c-85dd-f9cc97942761@github.com> > As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. Thus the old xlc toolchain was removed by [JDK-8327701](https://bugs.openjdk.org/browse/JDK-8327701). > Now we also switch the HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc, removing the last xlc rudiment. > This means merging the AIX specific content of utilities/globalDefinitions_xlc.hpp and utilities/compilerWarnings_xlc.hpp into the corresponding gcc files on the on side and removing the defined(TARGET_COMPILER_xlc) blocks in the code, because the defined(TARGET_COMPILER_gcc) blocks work out of the box for the new AIX compiler. 
> The rest of the changes are needed because of using utilities/compilerWarnings_gcc.hpp the compiler is much more nagging about ill formatted printf Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: cosmetic changes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18536/files - new: https://git.openjdk.org/jdk/pull/18536/files/ac1335e5..815974f5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18536&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18536&range=02-03 Stats: 6 lines in 2 files changed: 0 ins; 0 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/18536.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18536/head:pull/18536 PR: https://git.openjdk.org/jdk/pull/18536 From mdoerr at openjdk.org Wed Apr 10 10:03:12 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 10 Apr 2024 10:03:12 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: References: Message-ID: On Wed, 10 Apr 2024 09:40:16 GMT, Joachim Kern wrote: >> Do we even need to include <alloca.h>? >> >> From the Linux man page for alloca: >> >> By necessity, alloca() is a compiler built-in, also known as >> __builtin_alloca(). By default, modern compilers automatically >> translate all uses of alloca() into the built-in, but this is >> forbidden if standards conformance is requested (-ansi, -std=c*), >> in which case <alloca.h> is required, lest a symbol dependency be >> emitted. >> >> There are uses of it in shared code where there isn't an applicable include, >> other than from globalDefinitions_xlc.hpp. So it appears all other supported >> compilers do treat it as a built-in with the options we are providing, and >> don't need the include. Maybe that's true for the new xlc compiler too?
> > If I omit this #include <alloca.h> > I get compiler errors of the following kind > > .../src/hotspot/share/runtime/javaThread.cpp:2222:24: error: use of undeclared identifier 'alloca' > char* p1 = (char*) alloca(1); > ^ > > > Of course I can do this include in every nagging file, but I thought it is simpler to keep it in the central header. Is the comment in front of https://github.com/openjdk/jdk/blob/51ed69a586105b707ae616f9eba898449bf9fba7/src/hotspot/os/aix/os_aix.cpp#L28 still correct? Seems like it isn't followed everywhere. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1559175426 From mdoerr at openjdk.org Wed Apr 10 10:10:13 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 10 Apr 2024 10:10:13 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: References: Message-ID: <-XeYeJ0OEmauTYsEoSXxzRmQXSKMOLw87GSpqDnEmug=.5cb7e71f-fea6-4a84-8260-5f515d3d3810@github.com> On Wed, 10 Apr 2024 10:00:02 GMT, Martin Doerr wrote: >> If I omit this #include <alloca.h> >> I get compiler errors of the following kind >> >> .../src/hotspot/share/runtime/javaThread.cpp:2222:24: error: use of undeclared identifier 'alloca' >> char* p1 = (char*) alloca(1); >> ^ >> >> >> Of course I can do this include in every nagging file, but I thought it is simpler to keep it in the central header. > > Is the comment in front of https://github.com/openjdk/jdk/blob/51ed69a586105b707ae616f9eba898449bf9fba7/src/hotspot/os/aix/os_aix.cpp#L28 still correct? Seems like it should get replaced. See https://www.ibm.com/docs/en/openxl-c-and-cpp-aix/17.1.1?topic=pragmas-pragma-alloca-c-only Can `-Dalloca=__builtin_alloca` replace `#include <alloca.h>`?
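To make the question concrete, here is a small sketch of what mapping `alloca` onto the built-in looks like in source form. It assumes a GCC- or Clang-compatible compiler (Open XL 17 is described in this thread as clang-based), where `__builtin_alloca` is available. `stack_copy_length` is a hypothetical helper for illustration only, and the `#define` is meant to have the same effect as passing `-Dalloca=__builtin_alloca` on the command line.

```cpp
#include <cstddef>
#include <cstring>

// Assumption: GCC/Clang-compatible compiler. If no header has already
// provided alloca, map it to the built-in instead of including a header.
#ifndef alloca
#define alloca __builtin_alloca
#endif

// Hypothetical helper: copies a string into stack memory obtained from
// alloca() and returns the length of the copy. The stack buffer is released
// automatically when the function returns.
inline std::size_t stack_copy_length(const char* s) {
  const std::size_t n = std::strlen(s) + 1;
  char* buf = static_cast<char*>(alloca(n));
  std::memcpy(buf, s, n);
  return std::strlen(buf);
}
```

Whether the define is preferable to the include in the central header is a build-policy question; the sketch only shows that the call sites themselves need no change.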
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1559183757 From jkern at openjdk.org Wed Apr 10 10:16:10 2024 From: jkern at openjdk.org (Joachim Kern) Date: Wed, 10 Apr 2024 10:16:10 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: <-XeYeJ0OEmauTYsEoSXxzRmQXSKMOLw87GSpqDnEmug=.5cb7e71f-fea6-4a84-8260-5f515d3d3810@github.com> References: <-XeYeJ0OEmauTYsEoSXxzRmQXSKMOLw87GSpqDnEmug=.5cb7e71f-fea6-4a84-8260-5f515d3d3810@github.com> Message-ID: On Wed, 10 Apr 2024 10:07:02 GMT, Martin Doerr wrote: >> Is the comment in front of https://github.com/openjdk/jdk/blob/51ed69a586105b707ae616f9eba898449bf9fba7/src/hotspot/os/aix/os_aix.cpp#L28 still correct? Seems like it should get replaced. See https://www.ibm.com/docs/en/openxl-c-and-cpp-aix/17.1.1?topic=pragmas-pragma-alloca-c-only > > Can `-Dalloca=__builtin_alloca` replace `#include <alloca.h>`? Yes I believe. I will remove the `#pragma alloca` everywhere, I will remove the `#include <alloca.h>` everywhere and I will add `-Dalloca=__builtin_alloca` to the compile commands. If it works I will update the PR. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1559191851 From jkern at openjdk.org Wed Apr 10 10:46:35 2024 From: jkern at openjdk.org (Joachim Kern) Date: Wed, 10 Apr 2024 10:46:35 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v5] In-Reply-To: References: Message-ID: > As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. Thus the old xlc toolchain was removed by [JDK-8327701](https://bugs.openjdk.org/browse/JDK-8327701). > Now we also switch the HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc, removing the last xlc rudiment.
> This means merging the AIX specific content of utilities/globalDefinitions_xlc.hpp and utilities/compilerWarnings_xlc.hpp into the corresponding gcc files on the on side and removing the defined(TARGET_COMPILER_xlc) blocks in the code, because the defined(TARGET_COMPILER_gcc) blocks work out of the box for the new AIX compiler. > The rest of the changes are needed because of using utilities/compilerWarnings_gcc.hpp the compiler is much more nagging about ill formatted printf Joachim Kern has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: - Merge master - cosmetic changes - version check not needed anymore - Followed the proposals - JDK-8329257 ------------- Changes: https://git.openjdk.org/jdk/pull/18536/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18536&range=04 Stats: 257 lines in 14 files changed: 11 ins; 208 del; 38 mod Patch: https://git.openjdk.org/jdk/pull/18536.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18536/head:pull/18536 PR: https://git.openjdk.org/jdk/pull/18536 From mcimadamore at openjdk.org Wed Apr 10 11:34:10 2024 From: mcimadamore at openjdk.org (Maurizio Cimadamore) Date: Wed, 10 Apr 2024 11:34:10 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v6] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> <9BH6kkaQU5kSjlJUnNenUeWBK2EdahCuks8qEUjDlv0=.b8979589-32df-4fa3-b5a6-f56dad76c58d@github.com> <2X2qG_TCmbIfhM4CCepi7PHttQGFuMXlLgea1Yq15uc=.3d4bdee1-2eed-4df9-bcb4-f08bf8060119@github.com> Message-ID: On Tue, 9 Apr 2024 23:45:39 GMT, Scott Gibbons wrote: > Is there any way to disable some of the optimizations C2 will attempt on the IR? We need to maintain atomicity, so vectorization shouldn't occur, for instance. This seems like a rat-hole that would need constant maintenance as C2 optimizations get better. Sorry, I do not know that (I'm not a C2 engineer :-) ). 
One small observation: how important is atomicity in the "full off-heap case" ? E.g. if a `setMemory` is occurring at a location that is provably off-heap (and we should have ways to detect that, we do that also for other unsafe memory access routines), then perhaps the atomicity requirement can go (as I suppose that requirement is there for the Java Memory Model) ? So, perhaps, while we might not be able to fully optimize for on-heap access, we might be able to do so for off-heap access (which is an important case for FFM). ------------- PR Comment: https://git.openjdk.org/jdk/pull/18555#issuecomment-2047288498 From tschatzl at openjdk.org Wed Apr 10 11:35:08 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 10 Apr 2024 11:35:08 GMT Subject: RFR: 8329962: Remove CardTable::invalidate In-Reply-To: References: Message-ID: <6njIMLgexbNu1OjNt9t2CgIkaWQptT7JzhaaJ-lMse4=.ba9ac3f0-0089-4c34-8b2f-79635a253364@github.com> On Tue, 9 Apr 2024 13:44:22 GMT, Albert Mingkun Yang wrote: > Simple converting redundant if-check to assert. lgtm. ------------- Marked as reviewed by tschatzl (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18696#pullrequestreview-1991446943 From jkern at openjdk.org Wed Apr 10 11:45:25 2024 From: jkern at openjdk.org (Joachim Kern) Date: Wed, 10 Apr 2024 11:45:25 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v6] In-Reply-To: References: Message-ID: <58IY2j850mbxA8NTbzOfjHfYX5_n9aib_e8amY7MVNo=.673fa8c2-ad22-4384-aef7-f5954e110a94@github.com> > As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. Thus the old xlc toolchain was removed by [JDK-8327701](https://bugs.openjdk.org/browse/JDK-8327701). > Now we also switch the HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc, removing the last xlc rudiment. 
> This means merging the AIX specific content of utilities/globalDefinitions_xlc.hpp and utilities/compilerWarnings_xlc.hpp into the corresponding gcc files on the on side and removing the defined(TARGET_COMPILER_xlc) blocks in the code, because the defined(TARGET_COMPILER_gcc) blocks work out of the box for the new AIX compiler. > The rest of the changes are needed because of using utilities/compilerWarnings_gcc.hpp the compiler is much more nagging about ill formatted printf Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: replaced pragma alloca and include alloca by compiler define ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18536/files - new: https://git.openjdk.org/jdk/pull/18536/files/302ea6a7..801cfb54 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18536&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18536&range=04-05 Stats: 8 lines in 3 files changed: 0 ins; 7 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18536.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18536/head:pull/18536 PR: https://git.openjdk.org/jdk/pull/18536 From fbredberg at openjdk.org Wed Apr 10 12:10:13 2024 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Wed, 10 Apr 2024 12:10:13 GMT Subject: RFR: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 [v5] In-Reply-To: References: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> Message-ID: <9jGVA_1nsHA24Yq-WEVWpF4jie0gFLQQCqcUf2xXW3M=.68e82b07-2c4e-4d48-9806-fac3c6f19cf7@github.com> On Tue, 9 Apr 2024 22:00:22 GMT, David Holmes wrote: >> The crux of the problem here is that the virtual thread code was not keeping the held-monitor-count and jni-monitor-count in sync under all conditions. 
So if a vthread acquired a monitor via JNI but failed to unlock it before terminating, the underlying platform thread's counts were out of sync and if it terminated we would trigger the assertion that checks for such things. >> >> The actual fix is very simple: we zero the platform thread's jni-monitor-count in `continuation_enter_cleanup` the same way we zero the held-monitor-count. In addition we apply the same `CheckJNICalls` check for this unbalanced locking and issue a warning in the virtual thread case. That fact this happens in asm code complicates matters. >> >> The existing `JNIMonitor.java` test is greatly expanded to test these scenarios and check the unified logging output. >> >> Other minor changes involve expanding some of the other assertions relating to the two counts so we can detected a mismatch earlier without a need for the thread to terminate. And the test that original uncovered the problem (`GetOwnedMonitorInfoTest.java`) has some minor adjustments to enhance diagnostics. >> >> I've provided the fix for all architectures that support continuations: x64, aarch64, riscv and ppc. The latter both build okay in GHA but I can't actually test them with the updated test. So some assistance from RISCV folk (@robehn ?) and PPC folk (??) would be appreciated (otherwise any issues will have to be handled as follow up fixes >> >> The changes are structured so that there is no extra code executed in product builds unless `CheckJNICalls` is set. This means that product builds will not keep the JNI count in sync with the held count, unless `CheckJNICalls` is set. This could trip up a future logging entry or explicit check of the JNI count, but it is expected that these counts will be removed once ObjectMonitor usage will not force virtual thread pinning. >> >> Testing: >> - regression test 10x on all x64 and aarch64 platforms >> - tiers 1-4 >> - GHA >> >> >> Thanks to @pchilano for help working out the best form of the fix and the initial asm for x64. 
>> >> Thanks to @fbredber for the Aarch64 and RISCV asm code. >> >> Thanks > > David Holmes has updated the pull request incrementally with one additional commit since the last revision: > > Cleanup test leftovers src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp line 1056: > 1054: __ cbz(rscratch1, L_skip_vthread_code); > 1055: > 1056: // Save return value potentially containing the exception oop in callee-saved R19 . Suggestion: // Save return value potentially containing the exception oop in callee-saved R19. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18445#discussion_r1559322690 From jkern at openjdk.org Wed Apr 10 12:15:34 2024 From: jkern at openjdk.org (Joachim Kern) Date: Wed, 10 Apr 2024 12:15:34 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v7] In-Reply-To: References: Message-ID: > As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. Thus the old xlc toolchain was removed by [JDK-8327701](https://bugs.openjdk.org/browse/JDK-8327701). > Now we also switch the HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc, removing the last xlc rudiment. > This means merging the AIX specific content of utilities/globalDefinitions_xlc.hpp and utilities/compilerWarnings_xlc.hpp into the corresponding gcc files on the on side and removing the defined(TARGET_COMPILER_xlc) blocks in the code, because the defined(TARGET_COMPILER_gcc) blocks work out of the box for the new AIX compiler. 
> The rest of the changes are needed because of using utilities/compilerWarnings_gcc.hpp the compiler is much more nagging about ill formatted printf Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: saver solution ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18536/files - new: https://git.openjdk.org/jdk/pull/18536/files/801cfb54..a8d85924 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18536&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18536&range=05-06 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18536.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18536/head:pull/18536 PR: https://git.openjdk.org/jdk/pull/18536 From aboldtch at openjdk.org Wed Apr 10 12:16:21 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Wed, 10 Apr 2024 12:16:21 GMT Subject: RFR: 8329757: Crash with fatal error: DEBUG MESSAGE: Fast Unlock lock on stack Message-ID: `Deoptimization::relock_objects` may reorder locks within the `LockStack` which are added inside the same vframe. This can be handled by the interpreter but if OSR has occurred C2 may observe this invalid order in the `LockStack`, which breaks its assumption leading to incorrect behaviour. This patch functionally makes sure that the LockStack is always consistent by always inflating eliminated locks when `Deoptimization::relock_objects` is called. It also adds verification code which checks that the LockStack is consistent with the lock order observed inside the deoptimized vframes. Note: for leaf deoptimizations we have enough information to recreate a correct top of the LockStack with minimal inflations, however that should be a separate RFE. This only inflates eliminated locks so the worth of solving that may be minimal or even detrimental. Tests still running. Tier 1-3 done, Tier4-5 almost done, Tier 6-7 yet to be run.
------------- Commit messages: - Remove whitespace - 8329757: Crash with fatal error: DEBUG MESSAGE: Fast Unlock lock on stack Changes: https://git.openjdk.org/jdk/pull/18715/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18715&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8329757 Stats: 146 lines in 4 files changed: 145 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18715.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18715/head:pull/18715 PR: https://git.openjdk.org/jdk/pull/18715 From jesper.wilhelmsson at oracle.com Wed Apr 10 12:24:26 2024 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Wed, 10 Apr 2024 12:24:26 +0000 Subject: CFV: New HotSpot Group Member: Afshin Zafari Message-ID: <5088FFE6-F5E5-4B57-8FB9-B5F6672C7D7F@oracle.com> I hereby nominate Afshin Zafari (azafari) to Membership in the HotSpot Group. Afshin is a Committer in the JDK project, and a member of the Oracle JVM Runtime team. He has fixed 42 issues including several significant changes in various parts of the JVM runtime and has lately focused on NMT improvements. Votes are due by April 24, 2024. Only current Members of the HotSpot Group [1] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. For Lazy Consensus voting instructions, see [2]. Thanks, /Jesper [1] https://openjdk.org/census [2] https://openjdk.org/groups/#member-vote From jesper.wilhelmsson at oracle.com Wed Apr 10 12:24:27 2024 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Wed, 10 Apr 2024 12:24:27 +0000 Subject: CFV: New HotSpot Group Member: Fredrik Bredberg Message-ID: <0291F74B-D724-4B97-B9D0-5FC57FA0F302@oracle.com> I hereby nominate Fredrik Bredberg (fbredberg) to Membership in the HotSpot Group.
Fredrik is a Committer in the JDK project, and a member of the Oracle JVM Runtime team. Fredrik has mainly focused his efforts in the Loom area and is frequently helping out with platform specific (including assembler) code for other areas as well. Votes are due by April 24, 2024. Only current Members of the HotSpot Group [1] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. For Lazy Consensus voting instructions, see [2]. Thanks, /Jesper [1] https://openjdk.org/census [2] https://openjdk.org/groups/#member-vote From thomas.stuefe at gmail.com Wed Apr 10 12:27:15 2024 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Wed, 10 Apr 2024 14:27:15 +0200 Subject: CFV: New HotSpot Group Member: Afshin Zafari In-Reply-To: <5088FFE6-F5E5-4B57-8FB9-B5F6672C7D7F@oracle.com> References: <5088FFE6-F5E5-4B57-8FB9-B5F6672C7D7F@oracle.com> Message-ID: Vote: yes On Wed, Apr 10, 2024 at 2:24 PM Jesper Wilhelmsson < jesper.wilhelmsson at oracle.com> wrote: > I hereby nominate Afshin Zafari (azafari) to Membership in the HotSpot > Group. > > Afshin is a Committer in the JDK project, and a member of the Oracle JVM > Runtime team. He has fixed 42 issues including several significant changes > in various parts of the JVM runtime and has lately focused on NMT > improvements. > > Votes are due by April 24, 2024. > > Only current Members of the HotSpot Group [1] are eligible to vote on this > nomination. Votes must be cast in the open by replying to this mailing > list. > > For Lazy Consensus voting instructions, see [2]. > > Thanks, > /Jesper > > [1] https://openjdk.org/census > [2] https://openjdk.org/groups/#member-vote
From coleenp at openjdk.org Wed Apr 10 13:02:10 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 10 Apr 2024 13:02:10 GMT Subject: RFR: 8329488: Move OopStorage code from safepoint cleanup and remove safepoint cleanup code [v3] In-Reply-To: <26iDzUx03Fbd_daAG42xdf32HuHIGidWU5RqM_aumQ4=.d8475739-1e00-4a51-919f-87888100e957@github.com> References: <-D7-AtzguOYwUaGTikkkbMAZxgwJegnd6waGVCRSnbI=.b4b4476e-cec2-49b8-9d4a-c92757e6c432@github.com> <7vZckLkhEWmlTpRHtBmdIX0EpPaGdDzUD9HHQEoRzQQ=.09e10796-5464-4c3f-bbd5-744a8e4cf7d7@github.com> <26iDzUx03Fbd_daAG42xdf32HuHIGidWU5RqM_aumQ4=.d8475739-1e00-4a51-919f-87888100e957@github.com> Message-ID: On Wed, 10 Apr 2024 05:19:24 GMT, Erik Österlund wrote: >>> The only reason for the store-release is because we use load-acquire and it's consistent. >> >> So why do we use load-acquire? That is typically to pair with a store-release. We only need release semantics if someone seeing this store must see previous stores as well. > > Ah, okay. Then this seems fine. Maybe tweak or remove the comment to reflect this conversation. It seems to be saying the opposite of what it is doing. The reason we have load_acquire is to match the release_store here. This can be running concurrently with the ServiceThread checking whether cleanup is needed. This first flag says this OopStorage needs cleanup, the second is the global. It seems like the order is important. Therefore, we use all the load_acquire/release_store operations for safety and consistency, and to avoid future head scratching. // Record that cleanup is needed, without notifying the Service thread, because // we can't lock the Service_lock. Used by release(). void OopStorage::record_needs_cleanup() { // Set local flag first, else ServiceThread could wake up and miss // the request.
Atomic::release_store(&_needs_cleanup, true); Atomic::release_store_fence(&needs_cleanup_requested, true); } ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18375#discussion_r1559390909 From coleenp at openjdk.org Wed Apr 10 13:02:10 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 10 Apr 2024 13:02:10 GMT Subject: RFR: 8329488: Move OopStorage code from safepoint cleanup and remove safepoint cleanup code [v3] In-Reply-To: References: <-D7-AtzguOYwUaGTikkkbMAZxgwJegnd6waGVCRSnbI=.b4b4476e-cec2-49b8-9d4a-c92757e6c432@github.com> <7vZckLkhEWmlTpRHtBmdIX0EpPaGdDzUD9HHQEoRzQQ=.09e10796-5464-4c3f-bbd5-744a8e4cf7d7@github.com> <26iDzUx03Fbd_daAG42xdf32HuHIGidWU5RqM_aumQ4=.d8475739-1e00-4a51-919f-87888100e957@github.com> Message-ID: On Wed, 10 Apr 2024 12:59:31 GMT, Coleen Phillimore wrote: >> Ah, okay. Then this seems fine. Maybe tweak or remove the comment to reflect this conversation. It seems to be saying the opposite of what it is doing. > > The reason we have load_acquire is to match the release_store here. This can be running concurrently with the ServiceThread checking whether cleanup is needed. This first flag says this OopStorage needs cleanup, the second is the global. It seems like the order is important. Therefore, we use all the load_acquire/release_store operations for safety and consistency, and to avoid future head scratching. > > // Record that cleanup is needed, without notifying the Service thread, because > // we can't lock the Service_lock. Used by release(). > void OopStorage::record_needs_cleanup() { > // Set local flag first, else ServiceThread could wake up and miss > // the request. > Atomic::release_store(&_needs_cleanup, true); > Atomic::release_store_fence(&needs_cleanup_requested, true); > } I removed the comment that used to describe the CAS. 
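For readers following the ordering argument above, here is a minimal sketch in standard C++ rather than HotSpot's `Atomic` API. It is an illustration under stated assumptions, not the code in the PR: `std::atomic` release stores and acquire loads stand in for `Atomic::release_store` and `Atomic::load_acquire`, a sequentially consistent fence stands in for the trailing fence of `release_store_fence`, and the names `local_needs_cleanup`, `global_cleanup_requested` and `service_thread_sees_request` are hypothetical.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-ins for the per-storage flag and the global request flag.
std::atomic<bool> local_needs_cleanup{false};
std::atomic<bool> global_cleanup_requested{false};

// Writer side: set the local flag first, then publish the global request.
void record_needs_cleanup() {
  local_needs_cleanup.store(true, std::memory_order_release);
  global_cleanup_requested.store(true, std::memory_order_release);
  // Assumption: models the fence that follows the second store.
  std::atomic_thread_fence(std::memory_order_seq_cst);
}

// Reader side: returns true only if the global request and the local flag
// are both observed.
bool service_thread_sees_request() {
  if (!global_cleanup_requested.load(std::memory_order_acquire)) {
    return false;  // nothing has been published yet
  }
  // The acquire load above pairs with the release store of the global flag,
  // so the earlier store to the local flag is visible here.
  return local_needs_cleanup.load(std::memory_order_acquire);
}
```

If the two stores in `record_needs_cleanup` were swapped, a reader on another thread could observe the global request while the local flag still reads false, which is the missed request the original comment warns about.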
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18375#discussion_r1559391874 From mdoerr at openjdk.org Wed Apr 10 13:22:11 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 10 Apr 2024 13:22:11 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v7] In-Reply-To: References: Message-ID: On Tue, 9 Apr 2024 17:25:04 GMT, Stewart X Addison wrote: >> Pinging @sxa - what build environment does temurin use for AIX? > > Currently XLC16 but looking to upgrade to XLC17 on the minimum supported level for it (So it wouldn't be SP7 at present). Thanks for the ping - we have no current plans to increase to SP7. Seems like we need to keep it. This is unfortunate. I wouldn't risk mixing malloc and vec_malloc. Who knows what kind of problems this could cause? What happens if we try to build this code on AIX 7.2 TL5 SP7? Will the compiler complain because `malloc` is no longer defined? Should we check `defined(malloc)` in addition? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1559425371 From jkern at openjdk.org Wed Apr 10 13:33:14 2024 From: jkern at openjdk.org (Joachim Kern) Date: Wed, 10 Apr 2024 13:33:14 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v7] In-Reply-To: References: Message-ID: <2KbsDhW7N8i7j6hn7zL9msQxCyw6SRCBuJgjsQf-o4Y=.88840d2a-c8c1-4d86-ada0-dfc53226b01f@github.com> On Wed, 10 Apr 2024 13:19:50 GMT, Martin Doerr wrote: >> Currently XLC16 but looking to upgrade to XLC17 on the minimum supported level for it (So it wouldn't be SP7 at present). Thanks for the ping - we have no current plans to increase to SP7. > > Seems like we need to keep it. This is unfortunate. I wouldn't risk mixing malloc and vec_malloc. Who knows what kind of problems this could cause? > What happens if we try to build this code on AIX 7.2 TL5 SP7? Will the compiler complain because `malloc` is no longer defined? 
Should we check `defined(malloc)` in addition? We have been building this code for months on AIX 7.2 TL5 SP7, because we raised the OS. This code is needed on SP5 and does not hurt on SP7. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1559441769 From lois.foltan at oracle.com Wed Apr 10 13:33:55 2024 From: lois.foltan at oracle.com (Lois Foltan) Date: Wed, 10 Apr 2024 13:33:55 +0000 Subject: CFV: New HotSpot Group Member: Fredrik Bredberg In-Reply-To: <0291F74B-D724-4B97-B9D0-5FC57FA0F302@oracle.com> References: <0291F74B-D724-4B97-B9D0-5FC57FA0F302@oracle.com> Message-ID: Vote: yes Lois On 4/10/24, 8:25 AM, "hotspot-dev" wrote: I hereby nominate Fredrik Bredberg (fbredberg) to Membership in the HotSpot Group. Fredrik is a Committer in the JDK project, and a member of the Oracle JVM Runtime team. Fredrik has mainly focused his efforts in the Loom area and is frequently helping out with platform specific (including assembler) code for other areas as well. Votes are due by April 24, 2024. Only current Members of the HotSpot Group [1] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. For Lazy Consensus voting instructions, see [2]. Thanks, /Jesper [1] https://openjdk.org/census [2] https://openjdk.org/groups/#member-vote
From lois.foltan at oracle.com Wed Apr 10 13:34:05 2024 From: lois.foltan at oracle.com (Lois Foltan) Date: Wed, 10 Apr 2024 13:34:05 +0000 Subject: CFV: New HotSpot Group Member: Afshin Zafari In-Reply-To: References: <5088FFE6-F5E5-4B57-8FB9-B5F6672C7D7F@oracle.com> Message-ID: Vote: yes Lois From: hotspot-dev on behalf of Thomas Stüfe Date: Wednesday, April 10, 2024 at 8:27 AM To: Jesper Wilhelmsson Cc: hotspot-dev at openjdk.org Subject: Re: CFV: New HotSpot Group Member: Afshin Zafari Vote: yes On Wed, Apr 10, 2024 at 2:24 PM Jesper Wilhelmsson wrote: I hereby nominate Afshin Zafari (azafari) to Membership in the HotSpot Group. Afshin is a Committer in the JDK project, and a member of the Oracle JVM Runtime team. He has fixed 42 issues including several significant changes in various parts of the JVM runtime and has lately focused on NMT improvements. Votes are due by April 24, 2024. Only current Members of the HotSpot Group [1] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. For Lazy Consensus voting instructions, see [2]. Thanks, /Jesper [1] https://openjdk.org/census [2] https://openjdk.org/groups/#member-vote From jwaters at openjdk.org Wed Apr 10 13:38:15 2024 From: jwaters at openjdk.org (Julian Waters) Date: Wed, 10 Apr 2024 13:38:15 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: References: <-XeYeJ0OEmauTYsEoSXxzRmQXSKMOLw87GSpqDnEmug=.5cb7e71f-fea6-4a84-8260-5f515d3d3810@github.com> Message-ID: On Wed, 10 Apr 2024 10:13:37 GMT, Joachim Kern wrote: >> Can `-Dalloca=__builtin_alloca` replace `#include <alloca.h>`? > > Yes I believe. I will remove the `#pragma alloca` everywhere, I will remove the `#include <alloca.h>` everywhere and I will add `-Dalloca=__builtin_alloca` to the compile commands. If it works I will update the PR.
In my humble opinion the inclusion of alloca.h was slightly cleaner, but I guess it doesn't matter. Out of curiosity, why do you guys prefer not including it? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1559450230 From mdoerr at openjdk.org Wed Apr 10 13:49:12 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 10 Apr 2024 13:49:12 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: References: <-XeYeJ0OEmauTYsEoSXxzRmQXSKMOLw87GSpqDnEmug=.5cb7e71f-fea6-4a84-8260-5f515d3d3810@github.com> Message-ID: On Wed, 10 Apr 2024 13:35:39 GMT, Julian Waters wrote: >> Yes I believe. I will remove the `#pragma alloca` everywhere, I will remove the `#include <alloca.h>` everywhere and I will add `-Dalloca=__builtin_alloca` to the compile commands. If it works I will update the PR. > > In my humble opinion the inclusion of alloca.h was slightly cleaner, but I guess it doesn't matter. Out of curiosity, why do you guys prefer not including it? When only looking at AIX code, I think the inclusion of alloca.h was cleaner. Agreed. The new code makes AIX behave like other platforms and avoids the AIX specific part in shared code. I could live with either version.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1559470659 From roberto.castaneda.lozano at oracle.com Wed Apr 10 13:51:13 2024 From: roberto.castaneda.lozano at oracle.com (Roberto Castaneda Lozano) Date: Wed, 10 Apr 2024 13:51:13 +0000 Subject: CFV: New HotSpot Group Member: Fredrik Bredberg In-Reply-To: <0291F74B-D724-4B97-B9D0-5FC57FA0F302@oracle.com> References: <0291F74B-D724-4B97-B9D0-5FC57FA0F302@oracle.com> Message-ID: Vote: yes ________________________________________ From: hotspot-dev on behalf of Jesper Wilhelmsson Sent: Wednesday, April 10, 2024 2:24 PM To: hotspot-dev at openjdk.org Subject: CFV: New HotSpot Group Member: Fredrik Bredberg I hereby nominate Fredrik Bredberg (fbredberg) to Membership in the HotSpot Group. Fredrik is a Committer in the JDK project, and a member of the Oracle JVM Runtime team. Fredrik has mainly focused his efforts in the Loom area and is frequently helping out with platform specific (including assembler) code for other areas as well. Votes are due by April 24, 2024. Only current Members of the HotSpot Group [1] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. For Lazy Consensus voting instructions, see [2].
Thanks, /Jesper [1] https://openjdk.org/census [2] https://openjdk.org/groups/#member-vote From roberto.castaneda.lozano at oracle.com Wed Apr 10 13:51:30 2024 From: roberto.castaneda.lozano at oracle.com (Roberto Castaneda Lozano) Date: Wed, 10 Apr 2024 13:51:30 +0000 Subject: CFV: New HotSpot Group Member: Afshin Zafari In-Reply-To: <5088FFE6-F5E5-4B57-8FB9-B5F6672C7D7F@oracle.com> References: <5088FFE6-F5E5-4B57-8FB9-B5F6672C7D7F@oracle.com> Message-ID: Vote: yes ________________________________________ From: hotspot-dev on behalf of Jesper Wilhelmsson Sent: Wednesday, April 10, 2024 2:24 PM To: hotspot-dev at openjdk.org Subject: CFV: New HotSpot Group Member: Afshin Zafari I hereby nominate Afshin Zafari (azafari) to Membership in the HotSpot Group. Afshin is a Committer in the JDK project, and a member of the Oracle JVM Runtime team. He has fixed 42 issues including several significant changes in various parts of the JVM runtime and has lately focused on NMT improvements. Votes are due by April 24, 2024. Only current Members of the HotSpot Group [1] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. For Lazy Consensus voting instructions, see [2]. Thanks, /Jesper [1] https://openjdk.org/census [2] https://openjdk.org/groups/#member-vote From richard.reingruber at sap.com Wed Apr 10 13:56:32 2024 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Wed, 10 Apr 2024 13:56:32 +0000 Subject: CFV: New HotSpot Group Member: Fredrik Bredberg In-Reply-To: <0291F74B-D724-4B97-B9D0-5FC57FA0F302@oracle.com> References: <0291F74B-D724-4B97-B9D0-5FC57FA0F302@oracle.com> Message-ID: Vote: yes Richard. On 10.04.24, 14:25, "hotspot-dev" wrote: I hereby nominate Fredrik Bredberg (fbredberg) to Membership in the HotSpot Group. Fredrik is a Committer in the JDK project, and a member of the Oracle JVM Runtime team.
Fredrik has mainly focused his efforts in the Loom area and is frequently helping out with platform specific (including assembler) code for other areas as well. Votes are due by April 24, 2024. Only current Members of the HotSpot Group [1] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. For Lazy Consensus voting instructions, see [2]. Thanks, /Jesper [1] https://openjdk.org/census [2] https://openjdk.org/groups/#member-vote From stuefe at openjdk.org Wed Apr 10 14:24:16 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 10 Apr 2024 14:24:16 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v7] In-Reply-To: References: Message-ID: On Wed, 10 Apr 2024 12:15:34 GMT, Joachim Kern wrote: >> As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. Thus the old xlc toolchain was removed by [JDK-8327701](https://bugs.openjdk.org/browse/JDK-8327701). >> Now we also switch the HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc, removing the last xlc rudiment. >> This means merging the AIX specific content of utilities/globalDefinitions_xlc.hpp and utilities/compilerWarnings_xlc.hpp into the corresponding gcc files on the one side and removing the defined(TARGET_COMPILER_xlc) blocks in the code, because the defined(TARGET_COMPILER_gcc) blocks work out of the box for the new AIX compiler. >> The rest of the changes are needed because of using utilities/compilerWarnings_gcc.hpp the compiler is much more nagging about ill formatted printf > > Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: > > saver solution This looks good to me now, provided Martin likes it too.
Thanks for incorporating my suggestions. ------------- Marked as reviewed by stuefe (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18536#pullrequestreview-1991847641 From stuefe at openjdk.org Wed Apr 10 14:24:17 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 10 Apr 2024 14:24:17 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: References: <-XeYeJ0OEmauTYsEoSXxzRmQXSKMOLw87GSpqDnEmug=.5cb7e71f-fea6-4a84-8260-5f515d3d3810@github.com> Message-ID: On Wed, 10 Apr 2024 13:46:11 GMT, Martin Doerr wrote: >> In my humble opinion the inclusion of alloca.h was slightly cleaner, but I guess it doesn't matter. Out of curiosity, why do you guys prefer not including it? > > When only looking at AIX code, I think the inclusion of alloca.h was cleaner. Agreed. The new code makes AIX behave like other platforms and avoids the AIX specific part in shared code. > I could live with either version. I can live with either, too. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1559536724 From mdoerr at openjdk.org Wed Apr 10 14:28:16 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 10 Apr 2024 14:28:16 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v7] In-Reply-To: References: Message-ID: On Wed, 10 Apr 2024 12:15:34 GMT, Joachim Kern wrote: >> As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. Thus the old xlc toolchain was removed by [JDK-8327701](https://bugs.openjdk.org/browse/JDK-8327701). >> Now we also switch the HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc, removing the last xlc rudiment. 
>> This means merging the AIX specific content of utilities/globalDefinitions_xlc.hpp and utilities/compilerWarnings_xlc.hpp into the corresponding gcc files on the one side and removing the defined(TARGET_COMPILER_xlc) blocks in the code, because the defined(TARGET_COMPILER_gcc) blocks work out of the box for the new AIX compiler. >> The rest of the changes are needed because of using utilities/compilerWarnings_gcc.hpp the compiler is much more nagging about ill formatted printf > > Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: > > saver solution Yes, I like it too. Thanks, Thomas, for your helpful feedback! ------------- Marked as reviewed by mdoerr (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18536#pullrequestreview-1991857617 From lmesnik at openjdk.org Wed Apr 10 15:11:00 2024 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Wed, 10 Apr 2024 15:11:00 GMT Subject: RFR: 8329432: PopFrame and ForceEarlyReturn functions should use JvmtiHandshake [v2] In-Reply-To: <4pX8bccxXZCq1XNpmOpjY4fRQ6G9TZiv_BYTlw6hIxU=.9c9fda52-ce07-40d3-9528-37c140986fe1@github.com> References: <5tcPHZX0nNTHbQqZfHRl2riTpJglQyGJ2hRJXyIMZPY=.4de7ac6d-dd84-4943-bab1-5dba67bf5cf0@github.com> <4pX8bccxXZCq1XNpmOpjY4fRQ6G9TZiv_BYTlw6hIxU=.9c9fda52-ce07-40d3-9528-37c140986fe1@github.com> Message-ID: On Wed, 10 Apr 2024 02:34:37 GMT, Serguei Spitsyn wrote: >> The internal JVM TI `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. This enhancement is to refactor JVM TI functions `PopFrame` and `ForceEarlyReturn` on the base of `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes.
>> >> Testing: >> >> Ran mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: remove self from args; add asserts Marked as reviewed by lmesnik (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18570#pullrequestreview-1991985994 From lmesnik at openjdk.org Wed Apr 10 15:11:10 2024 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Wed, 10 Apr 2024 15:11:10 GMT Subject: RFR: 8329491: GetThreadListStackTraces function should use JvmtiHandshake [v3] In-Reply-To: References: <56L6f8XFyrB_cUSPTLWNIVhO0PU4w3PjRnpA5U7y_aI=.906bf099-af40-4192-a205-f84120e99ec8@github.com> Message-ID: On Wed, 10 Apr 2024 03:17:32 GMT, Serguei Spitsyn wrote: >> The internal JVM TI `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. This enhancement is to refactor the JVM TI function `GetThreadListStackTraces` on the base of `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes. >> >> Testing: >> - Ran mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: add some asserts Marked as reviewed by lmesnik (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/18574#pullrequestreview-1991987365 From lmesnik at openjdk.org Wed Apr 10 15:11:59 2024 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Wed, 10 Apr 2024 15:11:59 GMT Subject: RFR: 8329674: JvmtiEnvThreadState::reset_current_location function should use JvmtiHandshake [v2] In-Reply-To: References: Message-ID: <3jYqwExBxKW33xaSmp_FutY8nOArPG4ZnOZ7Uo2UjQg=.0aab5ec5-ad39-405e-ac4b-199f2c07116a@github.com> On Wed, 10 Apr 2024 04:21:23 GMT, Serguei Spitsyn wrote: >> The internal JVM TI JvmtiHandshake and JvmtiUnitedHandshakeClosure classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. This enhancement is to refactor the JVM TI internal functions JvmtiEnvThreadState::reset_current_location on the base of JvmtiHandshake and JvmtiUnitedHandshakeClosure classes. >> >> Testing: >> - Ran mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: refactored to get rid of overloaded doit functions Marked as reviewed by lmesnik (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18630#pullrequestreview-1991988172 From daniel.daugherty at oracle.com Wed Apr 10 15:33:47 2024 From: daniel.daugherty at oracle.com (daniel.daugherty at oracle.com) Date: Wed, 10 Apr 2024 11:33:47 -0400 Subject: CFV: New HotSpot Group Member: Afshin Zafari In-Reply-To: <5088FFE6-F5E5-4B57-8FB9-B5F6672C7D7F@oracle.com> References: <5088FFE6-F5E5-4B57-8FB9-B5F6672C7D7F@oracle.com> Message-ID: <76735ab7-5bf2-4d5b-93d6-f68cf8768570@oracle.com> Vote: yes Dan On 4/10/24 8:24 AM, Jesper Wilhelmsson wrote: > I hereby nominate Afshin Zafari (azafari) to Membership in the HotSpot Group. > > Afshin is a Committer in the JDK project, and a member of the Oracle JVM Runtime team. 
He has fixed 42 issues including several significant changes in various parts of the JVM runtime and has lately focused on NMT improvements. > > Votes are due by April 24, 2024. > > Only current Members of the HotSpot Group [1] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [2]. > > Thanks, > /Jesper > > [1] https://openjdk.org/census > [2] https://openjdk.org/groups/#member-vote From daniel.daugherty at oracle.com Wed Apr 10 15:34:18 2024 From: daniel.daugherty at oracle.com (daniel.daugherty at oracle.com) Date: Wed, 10 Apr 2024 11:34:18 -0400 Subject: CFV: New HotSpot Group Member: Fredrik Bredberg In-Reply-To: <0291F74B-D724-4B97-B9D0-5FC57FA0F302@oracle.com> References: <0291F74B-D724-4B97-B9D0-5FC57FA0F302@oracle.com> Message-ID: <9f05a1e2-ab56-44c7-b878-56690ce4e7f0@oracle.com> Vote: yes Dan On 4/10/24 8:24 AM, Jesper Wilhelmsson wrote: > I hereby nominate Fredrik Bredberg (fbredberg) to Membership in the HotSpot Group. > > Fredrik is a Committer in the JDK project, and a member of the Oracle JVM Runtime team. Fredrik has mainly focused his efforts in the Loom area and is frequently helping out with platform specific (including assembler) code for other areas as well. > > Votes are due by April 24, 2024. > > Only current Members of the HotSpot Group [1] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [2]. > > Thanks, > /Jesper > > [1] https://openjdk.org/census > [2] https://openjdk.org/groups/#member-vote From aph at openjdk.org Wed Apr 10 15:41:41 2024 From: aph at openjdk.org (Andrew Haley) Date: Wed, 10 Apr 2024 15:41:41 GMT Subject: RFR: JDK-8180450: secondary_super_cache does not scale well [v9] In-Reply-To: References: Message-ID: > This PR is a redesign of subtype checking. 
> > The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. > > So what's changed, so that the old design should be replaced? > > Firstly, the computers of today aren't the computers of twenty years ago. It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. > > The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces). Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. > > Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch. > > However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions. > > The solution > ------------ > > We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. > > We add a bitmap field to each Klass object. 
This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bitmap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. If the bit is set, we have to do a little arithmetic and then consult the hash table. > > It works like this: > > > mov sub_klass, [& sub_klass-... Andrew Haley has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 78 commits: - JDK-8180450: secondary_super_cache does not scale well - JDK-8180450: secondary_super_cache does not scale well - JDK-8180450: secondary_super_cache does not scale well - Merge branch 'clean' into JDK-8180450 - InlineSecondarySupersTest is on by default. - InlineSecondarySupersTest is on by default. - JDK-8180450: secondary_super_cache does not scale well - JDK-8180450: secondary_super_cache does not scale well - JDK-8180450: secondary_super_cache does not scale well - JDK-8180450: secondary_super_cache does not scale well - ...
and 68 more: https://git.openjdk.org/jdk/compare/b80ba085...8dc2ac13 ------------- Changes: https://git.openjdk.org/jdk/pull/18309/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18309&range=08 Stats: 1914 lines in 39 files changed: 1862 ins; 18 del; 34 mod Patch: https://git.openjdk.org/jdk/pull/18309.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18309/head:pull/18309 PR: https://git.openjdk.org/jdk/pull/18309 From aph at openjdk.org Wed Apr 10 15:41:42 2024 From: aph at openjdk.org (Andrew Haley) Date: Wed, 10 Apr 2024 15:41:42 GMT Subject: RFR: JDK-8180450: secondary_super_cache does not scale well [v4] In-Reply-To: <7GV5r_PRvhZkA1zh1WePp6crNNz9vGvxBfPqAG-YnUM=.fcb99891-456b-476a-ad7b-c7c045b3b254@github.com> References: <7GV5r_PRvhZkA1zh1WePp6crNNz9vGvxBfPqAG-YnUM=.fcb99891-456b-476a-ad7b-c7c045b3b254@github.com> Message-ID: On Tue, 19 Mar 2024 02:54:36 GMT, Vladimir Ivanov wrote: >>> I'm unclear why you are presenting this as a Diagnostic feature? I would expect either Experimental if you consider it early days and want more feedback; or else a full Product option that people can opt-in to using, and which eventually becomes the default. >> >> The code for x86 and AArch64 is product ready, but other platforms aren't yet done. I think it should be enabled by default, but YMMV. >> >> The new -XX options will surely be needed by the maintainers of other platforms during porting. They may also be useful during the review phase of this patch, for reviewers to test before/after performance on their own systems. >> >> `UseSecondarySuperCache` and `HashSecondarySupers` perhaps make sense as (rather esoteric) product flags, but`VerifySecondarySupers` and `StressSecondarySuperHash` are strictly for port maintainers. > >> I'm unclear why you are presenting this as a Diagnostic feature? 
> > From a user perspective, there should be no reason to specify `HashSecondarySupers` or `UseSecondarySuperCache` unless you diagnose a performance regression or looking for a workaround for a crash. There should be no other reason to override default value and switch to the old implementation once `HashSecondarySupers` is supported on a platform. > > IMO diagnostic flags are well justified here. Experimental flag doesn't cut it since the feature is turned on by default and making the flags product puts unnecessary obligations on the VM for something so obscure (from a user perspective). > > Some other VM flags followed the same practice (e.g., `UseVtableBasedCHA`) and so far it worked well. Temporarily converted this to Draft status while @iwanowww and I are working on it. Back soon. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18309#issuecomment-2023198100 From vlivanov at openjdk.org Wed Apr 10 15:41:42 2024 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Wed, 10 Apr 2024 15:41:42 GMT Subject: RFR: JDK-8180450: secondary_super_cache does not scale well [v8] In-Reply-To: References: Message-ID: <6Ql-_Ar7m6YoItKWE-FhWhrdX9fVcje8jmDjqGZZwyE=.bb20734b-4fb4-4e65-af9d-4a06f87b2a4b@github.com> On Wed, 27 Mar 2024 14:15:39 GMT, Andrew Haley wrote: >> This PR is a redesign of subtype checking. >> >> The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. >> >> So what's changed, so that the old design should be replaced? >> >> Firstly, the computers of today aren't the computers of twenty years ago. It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. 
>> >> The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces). Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. >> >> Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch. >> >> However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions. >> >> The solution >> ------------ >> >> We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. >> >> We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bitmap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. If the bit is set, we have to do a little arithmetic and then consult the hash table. >> >> It works like th...
> > Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: > > JDK-8180450: secondary_super_cache does not scale well One clarification on this particular point: > I would have liked to save the hashed interfaces list in the CDS archive file, ... As the patch works now, archiving of hashed secondary supers array works just fine. The only limitation is there's no sharing happening between `transitive_interfaces` and `secondary_supers` arrays when their content is equal. But sharing is disabled irrespective of whether CDS is in play or not. (See `InstanceKlass::compute_secondary_supers()`.) ------------- PR Comment: https://git.openjdk.org/jdk/pull/18309#issuecomment-2023369826 From aph at openjdk.org Wed Apr 10 15:41:42 2024 From: aph at openjdk.org (Andrew Haley) Date: Wed, 10 Apr 2024 15:41:42 GMT Subject: RFR: JDK-8180450: secondary_super_cache does not scale well [v8] In-Reply-To: <6Ql-_Ar7m6YoItKWE-FhWhrdX9fVcje8jmDjqGZZwyE=.bb20734b-4fb4-4e65-af9d-4a06f87b2a4b@github.com> References: <6Ql-_Ar7m6YoItKWE-FhWhrdX9fVcje8jmDjqGZZwyE=.bb20734b-4fb4-4e65-af9d-4a06f87b2a4b@github.com> Message-ID: <5D9F_EmUjdcdB-CWrTIyQdYieslpHxR8W1Ibdzkwmbs=.ccbce88d-0a27-4391-9c71-a65ea8f9e2f0@github.com> On Wed, 27 Mar 2024 17:27:11 GMT, Vladimir Ivanov wrote: > One clarification on this particular point: > > > I would have liked to save the hashed interfaces list in the CDS archive file, ... > > As the patch works now, archiving of hashed secondary supers array works just fine. The only limitation is there's no sharing happening between `transitive_interfaces` and `secondary_supers` arrays when their content is equal. But sharing is disabled irrespective of whether CDS is in play or not. (See `InstanceKlass::compute_secondary_supers()`.) Ah yes, point taken. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18309#issuecomment-2024754589 From aph at openjdk.org Wed Apr 10 15:41:42 2024 From: aph at openjdk.org (Andrew Haley) Date: Wed, 10 Apr 2024 15:41:42 GMT Subject: RFR: JDK-8180450: secondary_super_cache does not scale well [v8] In-Reply-To: References: Message-ID: <8IJL4wOlGxtJzRYVDnZ-eLacjx0syrHAJd5C9ABzBFA=.16566954-8936-4e85-8404-549dfc040200@github.com> On Wed, 27 Mar 2024 14:15:39 GMT, Andrew Haley wrote: >> This PR is a redesign of subtype checking. >> >> The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. >> >> So what's changed, so that the old design should be replaced? >> >> Firstly, the computers of today aren't the computers of twenty years ago. It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. >> >> The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces). Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. >> >> Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch. >> >> However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions. 
>> >> The solution >> ------------ >> >> We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. >> >> We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bitmap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. If the bit is set, we have to do a little arithmetic and then consult the hash table. >> >> It works like th... > > Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: > > JDK-8180450: secondary_super_cache does not scale well Reopening this one. Vladimir @iwanowww has reorganized much of the code and added comments, mostly to aid readability. He also made about a hundred minor changes, many of which were improvements. Thanks very much to him. There is one really substantial change. Vladimir pointed out that the new lookup code is substantially larger than before, particularly for x86, and this might in some cases change inlining behaviour and cause regressions. To ameliorate that I've added another option, `-XX:-InlineSecondarySupersTest`, which generates stubs for the search code. While that does solve the code expansion problem, the additional call&return overhead doubles the time for each lookup, so I'm reluctant to recommend it for general use. Please review this patch. 
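A minimal sketch of the bitmap check described in the quoted PR text, for readers who want to see the arithmetic. It is not the HotSpot code: `SupersTable`, `home_slot`, `build` and `contains` are hypothetical names, type ids stand in for `Klass*`, the hash function is an arbitrary choice, and the sketch assumes at most 64 distinct entries.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a class's hashed secondary supers.
struct SupersTable {
  uint64_t bitmap = 0;           // bit i set => slot i of the 64-slot table is occupied
  std::vector<uint32_t> packed;  // occupied slots only, kept in slot order
};

// Hypothetical hash (an assumption): top 6 bits of a multiplicative hash, so [0, 64).
inline unsigned home_slot(uint32_t id) {
  return static_cast<unsigned>(static_cast<uint32_t>(id * 2654435761u) >> 26);
}

// Build a 64-slot table with linear probing, then drop the empty slots.
inline SupersTable build(const std::vector<uint32_t>& ids) {
  assert(ids.size() <= 64);
  uint32_t slots[64] = {};
  SupersTable t;
  for (uint32_t id : ids) {
    unsigned s = home_slot(id);
    while (t.bitmap & (uint64_t{1} << s)) s = (s + 1) & 63;  // probe the next slot
    slots[s] = id;
    t.bitmap |= uint64_t{1} << s;
  }
  for (unsigned s = 0; s < 64; ++s) {
    if (t.bitmap & (uint64_t{1} << s)) t.packed.push_back(slots[s]);
  }
  return t;
}

// Lookup: a clear bit at the probed slot proves absence without touching the array.
inline bool contains(const SupersTable& t, uint32_t id) {
  unsigned s = home_slot(id);
  for (int probes = 0; probes < 64; ++probes) {
    uint64_t bit = uint64_t{1} << s;
    if ((t.bitmap & bit) == 0) return false;
    // The "little arithmetic": count the occupied slots below s to find
    // the index of slot s in the compressed array.
    std::size_t idx = std::bitset<64>(t.bitmap & (bit - 1)).count();
    if (t.packed[idx] == id) return true;
    s = (s + 1) & 63;
  }
  return false;
}
```

Because `packed` holds exactly the same elements as an unsorted array would, code that scans it linearly keeps working, which is the compatibility property the description relies on.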
------------- PR Comment: https://git.openjdk.org/jdk/pull/18309#issuecomment-2047724670 From iklam at openjdk.org Wed Apr 10 16:36:32 2024 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 10 Apr 2024 16:36:32 GMT Subject: RFR: 8323900: Avoid calling os::init_random() in CDS static dump Message-ID: The purpose of the PR is to avoid modifying the global JVM state while dumping the CDS archive. When updating the identity hashcode for archived Symbols, call `ArchiveBuilder::current()->entropy()` instead of `os::random()`. As a result, CDS no longer needs to call `os::init_random()` with a deterministic seed. ------------- Commit messages: - 8323900: Avoid calling os::init_random() in CDS static dump Changes: https://git.openjdk.org/jdk/pull/18728/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18728&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8323900 Stats: 24 lines in 4 files changed: 14 ins; 6 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/18728.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18728/head:pull/18728 PR: https://git.openjdk.org/jdk/pull/18728 From ascarpino at openjdk.org Wed Apr 10 17:22:09 2024 From: ascarpino at openjdk.org (Anthony Scarpino) Date: Wed, 10 Apr 2024 17:22:09 GMT Subject: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v2] In-Reply-To: References: Message-ID: On Tue, 2 Apr 2024 19:19:59 GMT, Volodymyr Paprotski wrote: >> Performance. Before: >> >> Benchmark (algorithm) (dataSize) (keyLength) (provider) Mode Cnt Score Error Units >> SignatureBench.ECDSA.sign SHA256withECDSA 1024 256 thrpt 3 6443.934 ± 6.491 ops/s >> SignatureBench.ECDSA.sign SHA256withECDSA 16384 256 thrpt 3 6152.979 ± 4.954 ops/s >> SignatureBench.ECDSA.verify SHA256withECDSA 1024 256 thrpt 3 1895.410 ± 36.979 ops/s >> SignatureBench.ECDSA.verify SHA256withECDSA 16384 256 thrpt 3 1878.955 ± 
45.487 ops/s >> Benchmark (algorithm) (keyLength) (kpgAlgorithm) (provider) Mode Cnt Score Error Units >> o.o.b.j.c.full.KeyAgreementBench.EC.generateSecret ECDH 256 EC thrpt 3 1357.810 ± 26.584 ops/s >> o.o.b.j.c.small.KeyAgreementBench.EC.generateSecret ECDH 256 EC thrpt 3 1352.119 ± 23.547 ops/s >> Benchmark (isMontBench) Mode Cnt Score Error Units >> PolynomialP256Bench.benchMultiply false thrpt 3 1746.126 ± 10.970 ops/s >> >> Performance, no intrinsic: >> >> Benchmark (algorithm) (dataSize) (keyLength) (provider) Mode Cnt Score Error Units >> SignatureBench.ECDSA.sign SHA256withECDSA 1024 256 thrpt 3 6529.839 ± 42.420 ops/s >> SignatureBench.ECDSA.sign SHA256withECDSA 16384 256 thrpt 3 6199.747 ± 133.566 ops/s >> SignatureBench.ECDSA.verify SHA256withECDSA 1024 256 thrpt 3 1973.676 ± 54.071 ops/s >> SignatureBench.ECDSA.verify SHA256withECDSA 16384 256 thrpt 3 1932.127 ± 35.920 ops/s >> Benchmark (algorithm) (keyLength) (kpgAlgorithm) (provider) Mode Cnt Score Error Units >> o.o.b.j.c.full.KeyAgreementBench.EC.generateSecret ECDH 256 EC thrpt 3 1355.788 ± 29.858 ops/s >> o.o.b.j.c.small.KeyAgreementBench.EC.generateSecret ECDH 256 EC thrpt 3 1346.523 ± 28.722 ops/s >> Benchmark (isMontBench) Mode Cnt Score Error Units >> PolynomialP256Bench.benchMultiply true thrpt 3 1919.57... > > Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision: > > remove use of jdk.crypto.ec In `ECOperations.java`, if I understand this correctly, the intent is to replace the existing `PointMultiplier` with a Montgomery-based `PointMultiplier`. But when I look at the code, I see both are still options. If I read this correctly, it checks for the old `IntegerFieldModuloP`, then looks for the new `IntegerMontgomeryFieldModuloP`. It appears to use the new one always. Why doesn't it just replace the old implementation entry in the `fields` Map? Is there a reason to keep it around? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18583#issuecomment-2048090075 From iklam at openjdk.org Wed Apr 10 17:37:21 2024 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 10 Apr 2024 17:37:21 GMT Subject: RFR: 8329728: Read long lines in ClassListParser [v4] In-Reply-To: References: Message-ID: > Today the `ClassListParser` has a hard-coded limit of 4096 chars for each line in the CDS class list file. However, it's possible for a line to be much longer than that (64KB for the class name, plus extra information that can include path names, IDs, etc). > > I wrote a utility class `LineReader` that automatically allocates a buffer before calling `fgets()`. Hopefully this can be useful for other cases where we call `fgets()` with a fixed buffer size. > > Max line width is limited to 4M to simplify testing (and avoid running into corner cases when we approach INT_MAX). Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: @dholmes-ora and @calvinccheung comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18669/files - new: https://git.openjdk.org/jdk/pull/18669/files/05afb6ed..6471fca1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18669&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18669&range=02-03 Stats: 49 lines in 6 files changed: 18 ins; 9 del; 22 mod Patch: https://git.openjdk.org/jdk/pull/18669.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18669/head:pull/18669 PR: https://git.openjdk.org/jdk/pull/18669 From iklam at openjdk.org Wed Apr 10 17:42:12 2024 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 10 Apr 2024 17:42:12 GMT Subject: RFR: 8329728: Read long lines in ClassListParser [v3] In-Reply-To: References: Message-ID: On Wed, 10 Apr 2024 04:50:15 GMT, David Holmes wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> Check class name for valid UTF8 encoding > > 
src/hotspot/share/cds/classListParser.cpp line 179: > >> 177: _token = nullptr; >> 178: _line_len = 0; >> 179: error("Input line too long"); // will exit JVM > > Can you print the line number and/or line length? The line number is printed inside `error()`. Also, I changed the error message to "Out of memory", as we'd come here only if LineReader fails to expand its internal buffer. If the line is too long, it will be broken into multiple chunks. See the gtest case. > src/hotspot/share/cds/classListParser.cpp line 465: > >> 463: err = "class name too long"; >> 464: } else { >> 465: int len = (int)strlen(class_name); > > Suggestion: save the length first so you don't have to calculate it twice. Done. > src/hotspot/share/utilities/lineReader.cpp line 41: > >> 39: _buffer_len = 0; >> 40: init(file); >> 41: } > > Style nit: Shouldn't we be using initializer lists rather than the constructor body for these simple initializations? Done > src/hotspot/share/utilities/lineReader.cpp line 90: > >> 88: // _buffer_len will stop at MAX_LEN, so we will never be able to read more than >> 89: // MAX_LEN chars for a single input line. >> 90: assert(line_len >= 0 && new_len >= 0 && (line_len + new_len) >= 0, "no int overflow"); > > The overflow test is relying on UB you need to check subtraction from INT_MAX Fixed > src/hotspot/share/utilities/lineReader.cpp line 110: > >> 108: return _buffer; >> 109: } >> 110: int new_len = _buffer_len * 2; > > Again UB on the overflow check. MAX_LEN should be set so that doubling of the current size will always hit max_len so you can't overflow. Fixed > src/hotspot/share/utilities/lineReader.hpp line 37: > >> 35: // of text-based input files for HotSpot. Don't use LineReader if it's >> 36: // possible for valid lines to be longer than this limit. >> 37: class LineReader : public StackObj { > > Suggestion: LineReader should also track the line count. Done. As a result, I removed `ClassListParser::_line_no`. 
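The two overflow comments above, together with the growing `fgets()` buffer from the PR description, can be sketched as follows. This is an illustration under stated assumptions, not the `LineReader` from the patch: `GrowingLineReader`, `next_capacity`, `add_overflows`, `INITIAL_LEN` and the exact growth policy are hypothetical, and only the 4M cap and the "long lines come back in chunks" behaviour are taken from the thread.

```cpp
#include <cassert>
#include <climits>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Cap from the PR description (4M). It is small enough that doubling any
// size below it cannot overflow an int.
static const int MAX_LEN = 4 * 1024 * 1024;
static const int INITIAL_LEN = 16;  // hypothetical starting size
static_assert(MAX_LEN <= INT_MAX / 2, "doubling below MAX_LEN cannot overflow");

// Overflow test done by subtracting from INT_MAX, so the check itself
// never performs a signed addition that could overflow (which would be UB).
inline bool add_overflows(int a, int b) {
  assert(a >= 0 && b >= 0);
  return b > INT_MAX - a;
}

// Next buffer size, clamped to MAX_LEN; 0 means "already at the cap".
inline int next_capacity(int cur) {
  assert(cur > 0 && cur <= MAX_LEN);
  if (cur == MAX_LEN) return 0;
  return (cur > MAX_LEN / 2) ? MAX_LEN : cur * 2;
}

class GrowingLineReader {
  std::FILE* _file;
  char* _buffer;
  int _buffer_len;

 public:
  explicit GrowingLineReader(std::FILE* f)
      : _file(f), _buffer(nullptr), _buffer_len(0) {}
  ~GrowingLineReader() { std::free(_buffer); }
  GrowingLineReader(const GrowingLineReader&) = delete;
  GrowingLineReader& operator=(const GrowingLineReader&) = delete;

  // Returns the next line (with its '\n' if present), or nullptr at EOF.
  // The string is valid until the next call. A line longer than
  // MAX_LEN - 1 chars is handed back in several chunks.
  char* read_line() {
    if (_buffer == nullptr) {
      _buffer = static_cast<char*>(std::malloc(INITIAL_LEN));
      if (_buffer == nullptr) return nullptr;
      _buffer_len = INITIAL_LEN;
    }
    int line_len = 0;
    for (;;) {
      if (std::fgets(_buffer + line_len, _buffer_len - line_len, _file) == nullptr) {
        return line_len > 0 ? _buffer : nullptr;  // EOF or read error
      }
      line_len += static_cast<int>(std::strlen(_buffer + line_len));
      bool complete = line_len > 0 && _buffer[line_len - 1] == '\n';
      bool buffer_full = (line_len == _buffer_len - 1);
      if (complete || !buffer_full) return _buffer;  // whole line, or last line without '\n'
      int new_len = next_capacity(_buffer_len);
      if (new_len == 0) return _buffer;              // at the cap: return a chunk
      char* grown = static_cast<char*>(std::realloc(_buffer, new_len));
      if (grown == nullptr) return _buffer;          // out of memory: return what we have
      _buffer = grown;
      _buffer_len = new_len;
    }
  }
};
```

Clamping in `next_capacity` is one way to meet the reviewer's point that the size should always land exactly on the maximum, so no arithmetic ever happens near `INT_MAX`. How the real class reports allocation failure is not shown here; the reply above says the parser reports "Out of memory" in that case.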
> src/hotspot/share/utilities/lineReader.hpp line 61: > >> 59: // When successful, a non-null value is returned. The caller is free to read or modify this >> 60: // string (up to the terminating \0 character) until the next call to read_line(), or until the >> 61: // LineReader is destructed. > > s/destructed/destroyed/ Done. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18669#discussion_r1559845162 PR Review Comment: https://git.openjdk.org/jdk/pull/18669#discussion_r1559845081 PR Review Comment: https://git.openjdk.org/jdk/pull/18669#discussion_r1559845012 PR Review Comment: https://git.openjdk.org/jdk/pull/18669#discussion_r1559844933 PR Review Comment: https://git.openjdk.org/jdk/pull/18669#discussion_r1559844873 PR Review Comment: https://git.openjdk.org/jdk/pull/18669#discussion_r1559844698 PR Review Comment: https://git.openjdk.org/jdk/pull/18669#discussion_r1559844670 From never at openjdk.org Wed Apr 10 17:50:14 2024 From: never at openjdk.org (Tom Rodriguez) Date: Wed, 10 Apr 2024 17:50:14 GMT Subject: RFR: JDK-8316991: Reduce nullable allocation merges [v10] In-Reply-To: References: Message-ID: On Thu, 28 Mar 2024 20:20:01 GMT, Cesar Soares Lucas wrote: >> ### Description >> >> Many, if not most, allocation merges (Phis) are nullable because they join object allocations with "NULL", or objects returned from method calls, etc. Please review this Pull Request that improves Reduce Allocation Merge implementation so that it can reduce at least some of these allocation merges. >> >> Overall, the improvements are related to 1) making rematerialization of merges able to represent "NULL" objects, and 2) being able to reduce merges used by CmpP/N and CastPP. >> >> The approach to reducing CmpP/N and CastPP is pretty similar to that used in the `MemNode::split_through_phi` method: a clone of the node being split is added on each input of the Phi. 
I make use of `optimize_ptr_compare` and some type information to remove redundant CmpP and CastPP nodes. I added a bunch of ASCII diagrams illustrating what some of the more important methods are doing. >> >> ### Benchmarking >> >> **Note:** In some of these tests no reduction happens. I left them in to validate that no perf. regression happens in that case. >> **Note 2:** Margin of error was negligible. >> >> | Benchmark | No RAM (ms/op) | Yes RAM (ms/op) | >> |--------------------------------------|------------------|-------------------| >> | TestTrapAfterMerge | 19.515 | 13.386 | >> | TestArgEscape | 33.165 | 33.254 | >> | TestCallTwoSide | 70.547 | 69.427 | >> | TestCmpAfterMerge | 16.400 | 2.984 | >> | TestCmpMergeWithNull_Second | 27.204 | 27.293 | >> | TestCmpMergeWithNull | 8.248 | 4.920 | >> | TestCondAfterMergeWithAllocate | 12.890 | 5.252 | >> | TestCondAfterMergeWithNull | 6.265 | 5.078 | >> | TestCondLoadAfterMerge | 12.713 | 5.163 | >> | TestConsecutiveSimpleMerge | 30.863 | 4.068 | >> | TestDoubleIfElseMerge | 16.069 | 2.444 | >> | TestEscapeInCallAfterMerge | 23.111 | 22.924 | >> | TestGlobalEscape | 14.459 | 14.425 | >> | TestIfElseInLoop | 246.061 | 42.786 | >> | TestLoadAfterLoopAlias | 45.808 | 45.812 | >> ... > > Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: > > Addressing Ivanov's PR feedback. 
src/hotspot/share/opto/escape.cpp line 560: > 558: const Type* cast_t = _igvn->type(use); > 559: if (cast_t == nullptr || cast_t->make_ptr()->isa_instptr() == nullptr) { > 560: NOT_PRODUCT(use->dump();) This dump should be guarded by TraceReduceAllocationMerges as should the one at line 574 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15825#discussion_r1559853541 From iklam at openjdk.org Wed Apr 10 17:54:25 2024 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 10 Apr 2024 17:54:25 GMT Subject: RFR: 8329728: Read long lines in ClassListParser [v5] In-Reply-To: References: Message-ID: > Today the `ClassListParser` has a hard-coded limit of 4096 chars for each line in the CDS class list file. However, it's possible for a line to be much longer than that (64KB for the class name, plus extra information that can include path names, IDs, etc). > > I wrote a utility class `LineReader` that automatically allocates a buffer before calling `fgets()`. Hopefully this can be useful for other cases where we call `fgets()` with a fixed buffer size. > > Max line width is limited to 4M to simplify testing (and avoid running into corner cases when we approach INT_MAX). Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: - Merge branch 'master' into 8329728-read-arbitrary-long-lines-in-class-list-parser - @dholmes-ora and @calvinccheung comments - Check class name for valid UTF8 encoding - @matias9927 and @calvinccheung comments - limit line to 4M. Added gtest cases. 
Test for class names > 64K - 8329728: Read arbitrarily long lines in ClassListParser ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18669/files - new: https://git.openjdk.org/jdk/pull/18669/files/6471fca1..f6ef76f0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18669&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18669&range=03-04 Stats: 10071 lines in 337 files changed: 5970 ins; 2426 del; 1675 mod Patch: https://git.openjdk.org/jdk/pull/18669.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18669/head:pull/18669 PR: https://git.openjdk.org/jdk/pull/18669 From duke at openjdk.org Wed Apr 10 18:05:10 2024 From: duke at openjdk.org (Volodymyr Paprotski) Date: Wed, 10 Apr 2024 18:05:10 GMT Subject: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v2] In-Reply-To: References: Message-ID: On Wed, 10 Apr 2024 17:18:55 GMT, Anthony Scarpino wrote: > In `ECOperations.java`, if I understand this correctly, the intent is to replace the existing `PointMultiplier` with a Montgomery-based `PointMultiplier`. But when I look at the code, I see both are still options. If I read this correctly, it checks for the old `IntegerFieldModuloP`, then looks for the new `IntegerMontgomeryFieldModuloP`. It appears to use the new one always. Why doesn't it just replace the old implementation entry in the `fields` Map? Is there a reason to keep it around? Hmm, that's a good point I haven't fully considered; i.e. (if I read correctly) "for `CurveDB.P_256` remove the fallback path to non-Montgomery entirely".. that might also help in cleaning a few things up in the construction. Maybe even get rid of this nested ECOperations inside ECOperations.. Perhaps nesting isn't a big deal, but any attempt to make the ECC stack clearer is a positive! 
One functional reason that might justify keeping it as-is is fuzz-testing; with the fallback available, I am able to write the included Fuzz tests and have them check the values against the existing implementation. While I also included a few KAT tests using openssl-generated values, the fuzz tests check millions of values and that adds a lot more certainty about the correctness of this code. Can it be removed? For the operations that do not involve multiplication (i.e. `setSum(*)`), Montgomery is expensive. I think I did go through the uses of this code some time back (i.e. ECDHE, ECDSA and KeyGeneration) and the existing IntegerPolynomialP256 is no longer used (I should verify that again) and only P256OrderField remains non-Montgomery. So removing references to IntegerPolynomialP256 in ECOperations should be possible and cleaner. Removing IntegerPolynomialP256 from MontgomeryIntegerPolynomialP256 is harder (fromMontgomery() uses IntegerPolynomialP256) but perhaps also worth some thought.. I tend to like `ECOperationsFuzzTest.java` and would prefer to keep it, but it could also be chalked up as part of 'scaffolding' and removed in the name of code quality? Thanks @ascarpino PS: Perhaps there is some middle ground: remove the `ECOperations montgomeryOps` nesting, and construct (somehow?? singleton makes most things inaccessible..) the reference ECOperations in the fuzz test instead.. not sure how yet, but perhaps worth a further thought.. 
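The fuzz-testing argument above is a form of differential testing: run the new Montgomery path and a trusted reference on the same random inputs and count disagreements. A toy sketch of the idea follows. It is not the JDK test and not P-256 arithmetic: `Mont32` and `count_mismatches` are hypothetical names, and the modulus is assumed to be odd and below 2^31 so that everything fits in 64-bit integers.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Montgomery arithmetic modulo an odd n < 2^31, with R = 2^32 (toy parameters).
struct Mont32 {
  uint32_t n;          // odd modulus
  uint32_t n_neg_inv;  // -n^{-1} mod 2^32

  explicit Mont32(uint32_t modulus) : n(modulus), n_neg_inv(0) {
    assert((n & 1u) == 1u && n < (1u << 31));
    uint32_t inv = n;  // n * n == 1 (mod 8), so inv is correct to 3 bits
    for (int i = 0; i < 5; ++i) inv *= 2u - n * inv;  // Newton step doubles the correct bits
    n_neg_inv = 0u - inv;
  }

  // Montgomery reduction: returns t * R^{-1} mod n, for t < n * R.
  uint32_t redc(uint64_t t) const {
    uint32_t m = static_cast<uint32_t>(t) * n_neg_inv;
    uint64_t u = (t + static_cast<uint64_t>(m) * n) >> 32;  // sum is divisible by 2^32
    return static_cast<uint32_t>(u >= n ? u - n : u);
  }
  uint32_t to_mont(uint32_t a) const {
    return static_cast<uint32_t>((static_cast<uint64_t>(a) << 32) % n);
  }
  uint32_t from_mont(uint32_t a) const { return redc(a); }
  uint32_t mul(uint32_t a, uint32_t b) const {  // both operands in Montgomery form
    return redc(static_cast<uint64_t>(a) * b);
  }
};

// Differential check: returns how many random inputs make the Montgomery
// product disagree with the plain (a * b) % n reference.
inline int count_mismatches(uint32_t modulus, int iterations, uint64_t seed) {
  Mont32 mont(modulus);
  std::mt19937_64 rng(seed);
  std::uniform_int_distribution<uint32_t> pick(0, modulus - 1);
  int mismatches = 0;
  for (int i = 0; i < iterations; ++i) {
    uint32_t a = pick(rng);
    uint32_t b = pick(rng);
    uint32_t expected = static_cast<uint32_t>(static_cast<uint64_t>(a) * b % modulus);
    uint32_t actual = mont.from_mont(mont.mul(mont.to_mont(a), mont.to_mont(b)));
    if (expected != actual) ++mismatches;
  }
  return mismatches;
}
```

The sketch also shows the cost being weighed in the message: the check only works while the slower reference implementation stays reachable from the test.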
------------- PR Comment: https://git.openjdk.org/jdk/pull/18583#issuecomment-2048159645 From sviswanathan at openjdk.org Wed Apr 10 18:08:11 2024 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Wed, 10 Apr 2024 18:08:11 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v7] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: <3zfjFlFN7zLHvF7jU3jtpVqLDbqiLMsofAzkYOAoqsk=.67a431df-5a4d-4a87-b333-3e6e3da7fad6@github.com> On Mon, 8 Apr 2024 19:11:19 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. >> >> Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). >> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Add movq to locate_operand To me doing the compiler intrinsic for Unsafe setMemory looked to be the right first step as in this PR. There is precedence with Unsafe copyMemory intrinsic so we are in sync with what is done before. We could request Fei Yang/Hamlin Li for RISC-V intrinsic and Bhavana Kilambi/Nick Gasson for AARCH64 intrinsic as follow up PRs. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18555#issuecomment-2048164233 From rpressler at openjdk.org Wed Apr 10 18:09:01 2024 From: rpressler at openjdk.org (Ron Pressler) Date: Wed, 10 Apr 2024 18:09:01 GMT Subject: RFR: 8325469: Freeze/Thaw code can crash in the presence of OSR frames [v3] In-Reply-To: References: Message-ID: On Tue, 9 Apr 2024 20:16:28 GMT, Patricio Chilano Mateo wrote: >> Freeze/thaw code assumes that a compiled frame for a method where num_stack_arg_slots() > 0 will always have the arguments setup above the metadata at the bottom of the frame. But when converting an interpreter frame to a compiled frame during OSR we don't explicitly leave room for the stack arguments after popping the interpreter frame. All parameters needed will be read from the "buf" array and stored inside the frame before calling OSR_migration_end(). >> >> This mismatch in how the stack looks and what we assume can lead to different crashes. In particular the issue happens when the OSR conversion happens for the bottom-most frame in the stack. If the OSR frame has a caller in the stack then there is no issue on freezing/thawing. I added more details about this in the bug comments. >> >> When the OSR conversion happens for the bottom-most frame then a future freeze/thaw can lead to crashes for all cases: freeze_fast/thaw_fast, freeze_fast/thaw_slow, freeze_slow/thaw_slow. When freezing fast, either thawing fast or slow can lead to trying to read past the bottom of the stackChunk or writing below the allocated space in the stack. The freeze slow case is almost okay, except that it uncovered an invalid assert that is triggered if the size of the OSR frame plus all the other frames we freeze takes less space than the size of locals minus parameters of the interpreter frame that was OSR. I also added more details about these in the bug comments. 
>> >> I tested different fixes, but I think the most straightforward one is to add _num_stack_arg_slots in the nmethod class and initialize it accordingly depending on whether the nmethod is an OSR one or not. >> >> The patch includes a new test that exercises all these possible combinations of OSR frame at bottom of stack or not, and then freezing fast/slow and thawing fast/slow. The bottom case where we freeze fast and thaw slow reproduces the originally reported crash. There are actually two different failure modes depending on whether this is a thaw top or return barrier case. The other bottom cases lead to the other crashes described in the bug comments. >> The new test uncovers another bug besides the OSR issues, but since it's a different one I filed a separate JBS issue (JDK-8329665) and I made this a dependent PR. >> >> I tested the current patch with the new test and also ran it through mach5 tiers1-6. >> >> Thanks, >> Patricio > > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > use WhiteBox to verify OSR compilation It may be hard to do a proper measurement because the number of methods in our microbenchmarks is small. We're also talking about an extra branch, I think. This is code that can be called a million times per second per core. It's very performance sensitive. So I would prefer to first see if there's an impact on nmethod size, and only if there is consider whether the speed implications are acceptable. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18637#issuecomment-2048165720 From matsaave at openjdk.org Wed Apr 10 18:19:02 2024 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Wed, 10 Apr 2024 18:19:02 GMT Subject: RFR: 8329728: Read long lines in ClassListParser [v5] In-Reply-To: References: Message-ID: On Wed, 10 Apr 2024 17:54:25 GMT, Ioi Lam wrote: >> Today the `ClassListParser` has a hard-coded limit of 4096 chars for each line in the CDS class list file. However, it's possible for a line to be much longer than that (64KB for the class name, plus extra information that can include path names, IDs, etc). >> >> I wrote a utility class `LineReader` that automatically allocates a buffer before calling `fgets()`. Hopefully this can be useful for other cases where we call `fgets()` with a fixed buffer size. >> >> Max line width is limited to 4M to simplify testing (and avoid running into corner cases when we approach INT_MAX). > > Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: > > - Merge branch 'master' into 8329728-read-arbitrary-long-lines-in-class-list-parser > - @dholmes-ora and @calvinccheung comments > - Check class name for valid UTF8 encoding > - @matias9927 and @calvinccheung comments - limit line to 4M. Added gtest cases. Test for class names > 64K > - 8329728: Read arbitrarily long lines in ClassListParser New changes look good, thanks! ------------- Marked as reviewed by matsaave (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18669#pullrequestreview-1992390893 From ccheung at openjdk.org Wed Apr 10 18:35:10 2024 From: ccheung at openjdk.org (Calvin Cheung) Date: Wed, 10 Apr 2024 18:35:10 GMT Subject: RFR: 8329728: Read long lines in ClassListParser [v5] In-Reply-To: References: Message-ID: On Wed, 10 Apr 2024 17:54:25 GMT, Ioi Lam wrote: >> Today the `ClassListParser` has a hard-coded limit of 4096 chars for each line in the CDS class list file. However, it's possible for a line to be much longer than that (64KB for the class name, plus extra information that can include path names, IDs, etc). >> >> I wrote a utility class `LineReader` that automatically allocates a buffer before calling `fgets()`. Hopefully this can be useful for other cases where we call `fgets()` with a fixed buffer size. >> >> Max line width is limited to 4M to simplify testing (and avoid running into corner cases when we approach INT_MAX). > > Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: > > - Merge branch 'master' into 8329728-read-arbitrary-long-lines-in-class-list-parser > - @dholmes-ora and @calvinccheung comments > - Check class name for valid UTF8 encoding > - @matias9927 and @calvinccheung comments - limit line to 4M. Added gtest cases. Test for class names > 64K > - 8329728: Read arbitrarily long lines in ClassListParser Marked as reviewed by ccheung (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/18669#pullrequestreview-1992416138 From mli at openjdk.org Wed Apr 10 18:49:10 2024 From: mli at openjdk.org (Hamlin Li) Date: Wed, 10 Apr 2024 18:49:10 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v2] In-Reply-To: References: <0cUurmXlMJ_B66Wy1umd2n4r9ve7_Q4WOU0ffMd8s5Y=.bbc93b65-382c-4139-aaec-cb835d94a06e@github.com> Message-ID: On Wed, 10 Apr 2024 09:24:09 GMT, Hamlin Li wrote: > > Thank you for the update and for working on this in general. > > I've started working on JDK-8329816, preparing the change for the SLEEF specific part of the change. Specifically, I'm currently planning on including the three SLEEF header files, the README and a legal/sleef.md file in that change. Let me know if you have any thoughts/concerns. > > Thanks a lot, that's great news. Please go ahead and integrate the files via JDK-8329816. :) Besides the performance issue found so far, I have no other concerns. I found the root cause of the performance regression, and have a draft solution for it; I'm running a thorough benchmark to see if it works for all sleef functions we use in jdk. So, basically this solution is good. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18605#issuecomment-2048217671 From kbarrett at openjdk.org Wed Apr 10 18:55:09 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 10 Apr 2024 18:55:09 GMT Subject: RFR: 8329488: Move OopStorage code from safepoint cleanup and remove safepoint cleanup code [v4] In-Reply-To: References: Message-ID: On Tue, 9 Apr 2024 17:40:14 GMT, Coleen Phillimore wrote: >> This patch gives the ServiceThread a periodic wakeup (same as GuaranteedSafepointInterval) to check if it needs to clean out OopStorage blocks, and move the triggering of this cleaning out of the safepoint cleanup tasks. 
Since ICBuffer, StringTable and SymbolTable rehashing have moved, there's nothing that actually triggers the nop safepoint to do cleaning (except SafepointALot), so the OopStorage cleanup won't be triggered. >> >> With moving all of these out of the safepoint cleanup tasks, we can remove the code that sets up multiple threads to do safepoint cleanup. We can also remove the JFR events and logging that times safepoint cleanup, and a logging test. >> >> Tested with tier1-4. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > More comment updates. Looks good. ------------- Marked as reviewed by kbarrett (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18375#pullrequestreview-1992446052 From vlivanov at openjdk.org Wed Apr 10 19:07:15 2024 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Wed, 10 Apr 2024 19:07:15 GMT Subject: RFR: JDK-8180450: secondary_super_cache does not scale well [v9] In-Reply-To: References: Message-ID: On Wed, 10 Apr 2024 15:41:41 GMT, Andrew Haley wrote: >> This PR is a redesign of subtype checking. >> >> The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. >> >> So what's changed, so that the old design should be replaced? >> >> Firstly, the computers of today aren't the computers of twenty years ago. It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. >> >> The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces). 
Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. >> >> Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch. >> >> However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions. >> >> The solution >> ------------ >> >> We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. >> >> We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bitmap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. If the bit is set, we have to do a little arithmetic and then consult the hash table. >> >> It works like th... > > Andrew Haley has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 78 commits: > > - JDK-8180450: secondary_super_cache does not scale well > - JDK-8180450: secondary_super_cache does not scale well > - JDK-8180450: secondary_super_cache does not scale well > - Merge branch 'clean' into JDK-8180450 > - InlineSecondarySupersTest is on by default. > - InlineSecondarySupersTest is on by default. > - JDK-8180450: secondary_super_cache does not scale well > - JDK-8180450: secondary_super_cache does not scale well > - JDK-8180450: secondary_super_cache does not scale well > - JDK-8180450: secondary_super_cache does not scale well > - ... and 68 more: https://git.openjdk.org/jdk/compare/b80ba085...8dc2ac13 I'm happy with the current state of the patch. Thanks a lot for incorporating the changes I proposed. I find it easier to reason about the implementation now and hope it'll help others navigating in the code. > ... new lookup code is substantially larger than before, particularly for x86, and this might in some cases change inlining behaviour and cause regressions. To ameliorate that I've added another option, -XX:-InlineSecondarySupersTest, which generates stubs for the search code. While that does solve the code expansion problem, the additional call&return overhead doubles the time for each lookup, so I'm reluctant to recommend it for general use. Alternatively, C2 inlining heuristics can be taught to discount inlined part of secondary supers lookup in generated code. It was the remediation chosen [1] for regressions introduced by post-call nops (part of Loom support). > I would have liked to save the hashed interfaces list in the CDS archive file, but there are test cases for the Serviceability Agent that assume the interfaces are in the order in which they are declared. I think the tests are probably just wrong about this. For the same reason, I have to make copies of the interfaces array and sort the copies, rather than just using the same array at runtime. 
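The bitmap-plus-compressed-table lookup described in the quoted PR text can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the HotSpot code: the names `PackedSupers`, `home_slot`, `build` and `contains` are hypothetical, the hash function is an arbitrary multiplicative hash, and plain linear probing is used instead of the Robin Hood variant mentioned for `Klass::hash_insert`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative only: every name here is hypothetical, not a HotSpot identifier.
struct PackedSupers {
    std::uint64_t bitmap = 0;            // bit s set <=> slot s of the 64-slot table is occupied
    std::vector<std::uint32_t> entries;  // occupied slots only, stored in slot order
};

// Assumed hash: top 6 bits of a 32-bit multiplicative hash, giving a slot in [0, 63].
inline unsigned home_slot(std::uint32_t id) {
    return static_cast<std::uint32_t>(id * UINT32_C(2654435761)) >> 26;
}

// Portable population count (number of set bits).
inline unsigned popcount64(std::uint64_t x) {
    unsigned n = 0;
    while (x != 0) { x &= x - 1; ++n; }
    return n;
}

// Build the 64-slot table with linear probing, then drop the empty slots.
inline PackedSupers build(const std::vector<std::uint32_t>& ids) {
    assert(ids.size() <= 64);  // the sketch does not handle overflow of the table
    std::uint32_t slots[64] = {};
    bool used[64] = {};
    for (std::uint32_t id : ids) {
        unsigned s = home_slot(id);
        while (used[s] && slots[s] != id) s = (s + 1) % 64;  // probe to a free slot
        slots[s] = id;
        used[s] = true;
    }
    PackedSupers t;
    for (unsigned s = 0; s < 64; ++s) {
        if (used[s]) {
            t.bitmap |= std::uint64_t(1) << s;
            t.entries.push_back(slots[s]);
        }
    }
    return t;
}

// Lookup: a clear bit proves absence; a set bit requires consulting the packed array.
inline bool contains(const PackedSupers& t, std::uint32_t id) {
    unsigned s = home_slot(id);
    for (unsigned probes = 0; probes < 64; ++probes) {
        const std::uint64_t bit = std::uint64_t(1) << s;
        if ((t.bitmap & bit) == 0) return false;  // empty slot ends the probe sequence
        // Index in the packed array = number of occupied slots below s.
        const std::size_t idx = popcount64(t.bitmap & (bit - 1));
        if (t.entries[idx] == id) return true;
        s = (s + 1) % 64;
    }
    return false;  // table completely full and id not found
}
```

In this sketch a negative answer usually costs one bit test, which is the "simple kind of Bloom Filter" behaviour the PR text describes, and the packed array remains an ordinary array that a linear scan could still walk.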
I checked that CDS support works fine with the latest PR. I took a look at transitive interface sharing and it turned out to be a bit more complicated than a SA-specific test issue. Both `transitive_interfaces` and `local_interface` arrays can be used as `secondary_supers`. While the order of elements is non-significant for `transitive_interfaces`, `local_interfaces` determines initialization order of superinterfaces mandated by JVMS [2] (see `InstanceKlass::initialize_super_interfaces()`). There's another minor issue with reusing arrays from CDS archive (see newly introduced comment in Klass::remove_unshareable_info() [3]), but with the following patch (on top up-to-date PR state) I see only a single test failure [4] which is indeed SA-specific and looks like a test bug to me: https://github.com/iwanowww/jdk/commit/f701e8bdb5269f144aa9c70e9dce9076394f09cf [1] https://bugs.openjdk.org/browse/JDK-8300002 [2] JVMS-5.5 "7. Next, if C is a class rather than an interface, then let SC be its superclass and let SI1, ..., SIn be all superinterfaces of C (whether direct or indirect) that declare at least one non-abstract, non-static method. The order of superinterfaces is given by a recursive enumeration over the superinterface hierarchy of each interface directly implemented by C. For each interface I directly implemented by C (in the order of the interfaces array of C), the enumeration recurs on I's superinterfaces (in the order of the interfaces array of I) before returning I." [3] Klass::remove_unshareable_info() // FIXME: validation in Klass::hash_secondary_supers() may fail for shared klasses. // Even though the bitmaps always match, the canonical order of elements in the table // is not guaranteed to stay the same (see tie breaker during Robin Hood hashing in Klass::hash_insert). 
//assert(compute_secondary_supers_bitmap(secondary_supers()) == _bitmap, "broken table"); [4] serviceability/dcmd/vm/ClassHierarchyTest.java test ClassHierarchyTest.jmx(): failure java.lang.AssertionError: Failed to match line #6: | | implements Intf2/0x00006000032b67d0 (inherited intf) Running DCMD 'VM.class_hierarchy DcmdBaseClass -i -s' through 'JMXExecutor' ---------------- stdout ---------------- java.lang.Object/null |--DcmdBaseClass/0x00006000032b67d0 | implements Intf2/0x00006000032b67d0 (declared intf) | implements Intf1/0x00006000032b67d0 (inherited intf) | |--DcmdTestClass/0x00006000032b67d0 | | implements Intf2/0x00006000032b67d0 (inherited intf) | | implements Intf1/0x00006000032b67d0 (inherited intf) ------------- PR Comment: https://git.openjdk.org/jdk/pull/18309#issuecomment-2048252083 From vlivanov at openjdk.org Wed Apr 10 19:26:14 2024 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Wed, 10 Apr 2024 19:26:14 GMT Subject: RFR: JDK-8180450: secondary_super_cache does not scale well [v9] In-Reply-To: References: Message-ID: <39WHBecCuPC4Y3c2kAP09unMLgxqy9uFYGdWJjKREd0=.6a89bf7c-7fd5-4c16-9e45-cac9810d216b@github.com> On Wed, 10 Apr 2024 15:41:41 GMT, Andrew Haley wrote: >> This PR is a redesign of subtype checking. >> >> The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. >> >> So what's changed, so that the old design should be replaced? >> >> Firstly, the computers of today aren't the computers of twenty years ago. It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. >> >> The most severe reported problem is to do with the "secondary supers cache". 
This is a 1-element per-class cache for interfaces (and arrays of interfaces). Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. >> >> Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch. >> >> However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions. >> >> The solution >> ------------ >> >> We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. >> >> We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bitmap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. If the bit is set, we have to do a little arithmetic and then consult the hash table. >> >> It works like th... > > Andrew Haley has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains 78 commits: > > - JDK-8180450: secondary_super_cache does not scale well > - JDK-8180450: secondary_super_cache does not scale well > - JDK-8180450: secondary_super_cache does not scale well > - Merge branch 'clean' into JDK-8180450 > - InlineSecondarySupersTest is on by default. > - InlineSecondarySupersTest is on by default. > - JDK-8180450: secondary_super_cache does not scale well > - JDK-8180450: secondary_super_cache does not scale well > - JDK-8180450: secondary_super_cache does not scale well > - JDK-8180450: secondary_super_cache does not scale well > - ... and 68 more: https://git.openjdk.org/jdk/compare/b80ba085...8dc2ac13 src/hotspot/cpu/aarch64/vm_version_aarch64.cpp line 3: > 1: /* > 2: * Copyright (c) 1997, 2023, Oracle and/or its affiliates. All rights reserved. > 3: * Copyright (c) 2015, 2024, Red Hat Inc. All rights reserved. Redundant update. (No other changes in the file.) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18309#discussion_r1559950710 From dlong at openjdk.org Wed Apr 10 19:50:09 2024 From: dlong at openjdk.org (Dean Long) Date: Wed, 10 Apr 2024 19:50:09 GMT Subject: RFR: 8325469: Freeze/Thaw code can crash in the presence of OSR frames [v3] In-Reply-To: References: Message-ID: On Tue, 9 Apr 2024 20:16:28 GMT, Patricio Chilano Mateo wrote: >> Freeze/thaw code assumes that a compiled frame for a method where num_stack_arg_slots() > 0 will always have the arguments setup above the metadata at the bottom of the frame. But when converting an interpreter frame to a compiled frame during OSR we don't explicitly leave room for the stack arguments after popping the interpreter frame. All parameters needed will be read from the "buf" array and stored inside the frame before calling OSR_migration_end(). >> >> This mismatch in how the stack looks and what we assume can lead to different crashes.
In particular the issue happens when the OSR conversion happens for the bottom-most frame in the stack. If the OSR frame has a caller in the stack then there is no issue on freezing/thawing. I added more details about this in the bug comments. >> >> When the OSR conversion happens for the bottom-most frame then a future freeze/thaw can lead to crashes for all cases: freeze_fast/thaw_fast, freeze_fast/thaw_slow, freeze_slow/thaw_slow. When freezing fast, either thawing fast or slow can lead to trying to read past the bottom of the stackChunk or writing below the allocated space in the stack. The freeze slow case is almost okay, except that it uncovered an invalid assert that is triggered if the size of the OSR frame plus all the other frames we freeze takes less space than the size of locals minus parameters of the interpreter frame that was OSR. I also added more details about these in the bug comments. >> >> I tested different fixes, but I think the most straightforward one is to add _num_stack_arg_slots in the nmethod class and initialize it accordingly depending on whether the nmethod is an OSR one or not. >> >> The patch includes a new test that exercises all these possible combinations of OSR frame at bottom of stack or not, and then freezing fast/slow and thawing fast/slow. The bottom case where we freeze fast and thaw slow reproduces the originally reported crash. There are actually two different failure modes depending on whether this is a thaw top or return barrier case. The other bottom cases lead to the other crashes described in the bug comments. >> The new test uncovers another bug besides the OSR issues, but since it's a different one I filed a separate JBS issue (JDK-8329665) and I made this a dependent PR. >> >> I tested the current patch with the new test and also ran it through mach5 tiers1-6.
>> >> Thanks, >> Patricio > > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > use WhiteBox to verify OSR compilation Marked as reviewed by dlong (Reviewer). OK, let's go with the new nmethod field. ------------- PR Review: https://git.openjdk.org/jdk/pull/18637#pullrequestreview-1992551307 PR Comment: https://git.openjdk.org/jdk/pull/18637#issuecomment-2048319631 From vlivanov at openjdk.org Wed Apr 10 19:56:22 2024 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Wed, 10 Apr 2024 19:56:22 GMT Subject: RFR: JDK-8180450: secondary_super_cache does not scale well [v9] In-Reply-To: References: Message-ID: <28M-uMy6qLCTTS4tR8GkC7SHCbOAvDw0WP3y6NeO-8I=.1aa0fbe0-81a0-4b13-a921-12e32e11bd8b@github.com> On Wed, 10 Apr 2024 15:41:41 GMT, Andrew Haley wrote: >> This PR is a redesign of subtype checking. >> >> The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. >> >> So what's changed, so that the old design should be replaced? >> >> Firstly, the computers of today aren't the computers of twenty years ago. It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. >> >> The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces). Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. >> >> Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. 
This plays badly with newer language features such as JEP 406, pattern matching for switch. >> >> However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions. >> >> The solution >> ------------ >> >> We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. >> >> We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bitmap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. If the bit is set, we have to do a little arithmetic and then consult the hash table. >> >> It works like th... > > Andrew Haley has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 78 commits: > > - JDK-8180450: secondary_super_cache does not scale well > - JDK-8180450: secondary_super_cache does not scale well > - JDK-8180450: secondary_super_cache does not scale well > - Merge branch 'clean' into JDK-8180450 > - InlineSecondarySupersTest is on by default. > - InlineSecondarySupersTest is on by default.
> - JDK-8180450: secondary_super_cache does not scale well > - JDK-8180450: secondary_super_cache does not scale well > - JDK-8180450: secondary_super_cache does not scale well > - JDK-8180450: secondary_super_cache does not scale well > - ... and 68 more: https://git.openjdk.org/jdk/compare/b80ba085...8dc2ac13 Also, I ended up writing stress test to exercise new code: https://github.com/iwanowww/jdk/commit/7d0524d5f26ecdfdddd1d10cc761cf870d78b52b The test needs some polishing, but it turned out to be quite useful when modifying code related to subtype checking. So, worth considering incorporating it into the repo at some later point. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18309#issuecomment-2048327852 From sspitsyn at openjdk.org Wed Apr 10 20:27:18 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 10 Apr 2024 20:27:18 GMT Subject: RFR: 8329432: PopFrame and ForceEarlyReturn functions should use JvmtiHandshake [v2] In-Reply-To: <4pX8bccxXZCq1XNpmOpjY4fRQ6G9TZiv_BYTlw6hIxU=.9c9fda52-ce07-40d3-9528-37c140986fe1@github.com> References: <5tcPHZX0nNTHbQqZfHRl2riTpJglQyGJ2hRJXyIMZPY=.4de7ac6d-dd84-4943-bab1-5dba67bf5cf0@github.com> <4pX8bccxXZCq1XNpmOpjY4fRQ6G9TZiv_BYTlw6hIxU=.9c9fda52-ce07-40d3-9528-37c140986fe1@github.com> Message-ID: On Wed, 10 Apr 2024 02:34:37 GMT, Serguei Spitsyn wrote: >> The internal JVM TI `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. This enhancement is to refactor JVM TI functions `PopFrame` and `ForceEarlyReturn` on the base of `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes. >> >> Testing: >> >> Ran mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: remove self from args; add asserts Leonid, thank you for review! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18570#issuecomment-2048376644 From sspitsyn at openjdk.org Wed Apr 10 20:28:17 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 10 Apr 2024 20:28:17 GMT Subject: RFR: 8329491: GetThreadListStackTraces function should use JvmtiHandshake [v3] In-Reply-To: References: <56L6f8XFyrB_cUSPTLWNIVhO0PU4w3PjRnpA5U7y_aI=.906bf099-af40-4192-a205-f84120e99ec8@github.com> Message-ID: On Wed, 10 Apr 2024 03:17:32 GMT, Serguei Spitsyn wrote: >> The internal JVM TI `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. This enhancement is to refactor the JVM TI function `GetThreadListStackTraces` on the base of `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes. >> >> Testing: >> - Ran mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: add some asserts Leonid, thank you for review! ------------- PR Comment: https://git.openjdk.org/jdk/pull/18574#issuecomment-2048377357 From bulasevich at openjdk.org Wed Apr 10 21:24:46 2024 From: bulasevich at openjdk.org (Boris Ulasevich) Date: Wed, 10 Apr 2024 21:24:46 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission [v11] In-Reply-To: References: Message-ID: On Mon, 8 Apr 2024 06:09:14 GMT, Boris Ulasevich wrote: >> Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 12 commits: >> >> - Merge remote-tracking branch 'origin/master' into reuse-macroasm >> - Fix AArch64 build & improve comment about InstructionMark >> - Catching up with changes in master >> - Catching up with origin/master >> - Catch up with origin/master >> - Merge with origin/master >> - Fix build, copyright dates, m4 files. 
>> - Fix merge >> - Catch up with master branch. >> >> Merge remote-tracking branch 'origin/master' into reuse-macroasm >> - Some inst_mark fixes; Catch up with master. >> - ... and 2 more: https://git.openjdk.org/jdk/compare/89e0889a...b4d73c98 > > Do you need help understanding the problem? The crash occurred because you removed the line `fprintf(fp, " cbuf.set_insts_mark();\n");` from the generator of AD nodes ::emit() methods. That is why emit_call_reloc finds cbuf.insts->_mark uninitialized. > > > // Call Runtime Instruction > instruct CallRuntimeDirect(method meth) %{ > match(CallRuntime); > effect(USE meth); > ins_cost(CALL_COST); > format %{ "CALL,runtime" %} > ins_encode( Java_To_Runtime( meth ), > call_epilog ); > ins_pipe(simple_call); > %} > > --> > > void CallRuntimeDirectNode::emit(CodeBuffer& cbuf, PhaseRegAlloc* ra_) const { > cbuf.set_insts_mark(); > // Start at oper_input_base() and count operands > unsigned idx0 = 1; > unsigned idx1 = 1; // > { > #line 1217 "/home/boris/jdk-bulasevich/src/hotspot/cpu/arm/arm.ad" > // CALL directly to the runtime > emit_call_reloc(cbuf, as_MachCall(), opnd_array(1), runtime_call_Relocation::spec()); > #line 999999 > } > { > #line 1213 "/home/boris/jdk-bulasevich/src/hotspot/cpu/arm/arm.ad" > // nothing > #line 999999 > } > } > @bulasevich - I just pushed a fix for ARM32. Can you please run your tests again? Thanks! Good. Tests are OK now. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16484#issuecomment-2048456503 From bulasevich at openjdk.org Wed Apr 10 21:24:47 2024 From: bulasevich at openjdk.org (Boris Ulasevich) Date: Wed, 10 Apr 2024 21:24:47 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission [v12] In-Reply-To: References: Message-ID: On Tue, 9 Apr 2024 19:10:30 GMT, Cesar Soares Lucas wrote: >> # Description >> >> Please review this PR with a patch to re-use the same C2_MacroAssembler object to emit all instructions in the same compilation unit.
>> >> Overall, the change is pretty simple. However, due to the renaming of the variable to access C2_MacroAssembler, from `_masm.` to `masm->`, and also some method prototype changes, the patch became quite large. >> >> # Help Needed for Testing >> >> I don't have access to all platforms necessary to test this. I hope some other folks can help with testing on `S390`, `RISC-V` and `PPC`. > > Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: > > Fix ARM32 AD file src/hotspot/cpu/arm/arm.ad line 1877: > 1875: %} > 1876: > 1877: // Pointer Immediate Why do you introduce immN operands for arm32? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16484#discussion_r1560068373 From dean.long at oracle.com Wed Apr 10 21:58:10 2024 From: dean.long at oracle.com (dean.long at oracle.com) Date: Wed, 10 Apr 2024 14:58:10 -0700 Subject: CFV: New HotSpot Group Member: Fredrik Bredberg In-Reply-To: <0291F74B-D724-4B97-B9D0-5FC57FA0F302@oracle.com> References: <0291F74B-D724-4B97-B9D0-5FC57FA0F302@oracle.com> Message-ID: Vote: yes From kbarrett at openjdk.org Wed Apr 10 22:16:44 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 10 Apr 2024 22:16:44 GMT Subject: RFR: 8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: References: <-XeYeJ0OEmauTYsEoSXxzRmQXSKMOLw87GSpqDnEmug=.5cb7e71f-fea6-4a84-8260-5f515d3d3810@github.com> Message-ID: On Wed, 10 Apr 2024 14:19:59 GMT, Thomas Stuefe wrote: >> When only looking at AIX code, I think the inclusion of alloca.h was cleaner. Agreed. The new code makes AIX behave like other platforms and avoids the AIX specific part in shared code. >> I could live with either version. > > I can live with either, too. That build failure in shared code does not happen with Xcode clang, gcc, or Visual Studio, even though none of them appear to have a relevant define or include. 
So the clang variant being used for AIX is different from the Xcode clang variant (and maybe others) in its treatment of alloca. Weird! I can also live with either the macro or the includes where needed. I dislike conditionally adding the include in globalDefinitions_gcc.hpp. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1560113548 From kbarrett at openjdk.org Wed Apr 10 22:16:45 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 10 Apr 2024 22:16:45 GMT Subject: RFR: 8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: References: <-XeYeJ0OEmauTYsEoSXxzRmQXSKMOLw87GSpqDnEmug=.5cb7e71f-fea6-4a84-8260-5f515d3d3810@github.com> Message-ID: On Wed, 10 Apr 2024 22:12:42 GMT, Kim Barrett wrote: >> I can live with either, too. > > That build failure in shared code does not happen with Xcode clang, gcc, or > Visual Studio, even though none of them appear to have a relevant define or > include. So the clang variant being used for AIX is different from the Xcode > clang variant (and maybe others) in its treatment of alloca. Weird! > > I can also live with either the macro or the includes where needed. I dislike > conditionally adding the include in globalDefinitions_gcc.hpp. Should also remove the `#pragma alloca` in os_aix.cpp. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1560114722 From dholmes at openjdk.org Wed Apr 10 22:19:50 2024 From: dholmes at openjdk.org (David Holmes) Date: Wed, 10 Apr 2024 22:19:50 GMT Subject: RFR: 8329728: Read long lines in ClassListParser [v5] In-Reply-To: References: Message-ID: On Wed, 10 Apr 2024 17:54:25 GMT, Ioi Lam wrote: >> Today the `ClassListParser` has a hard-coded limit of 4096 chars for each line in the CDS class list file. However, it's possible for a line to be much longer than that (64KB for the class name, plus extra information that can include path names, IDs, etc).
>> >> I wrote a utility class `LineReader` that automatically allocates a buffer before calling `fgets()`. Hopefully this can be useful for other cases where we call `fgets()` with a fixed buffer size. >> >> Max line width is limited to 4M to simplify testing (and avoid running into corner cases when we approach INT_MAX). > > Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: > > - Merge branch 'master' into 8329728-read-arbitrary-long-lines-in-class-list-parser > - @dholmes-ora and @calvinccheung comments > - Check class name for valid UTF8 encoding > - @matias9927 and @calvinccheung comments - limit line to 4M. Added gtest cases. Test for class names > 64K > - 8329728: Read arbitrarily long lines in ClassListParser Updates look good. A couple of nits, but changes approved. Thanks. src/hotspot/share/cds/classListParser.cpp line 463: > 461: err = "class name too long"; > 462: } else { > 463: assert(Symbol::max_length() < INT_MAX && len < INT_MAX, "must be"); The first half of the assert is redundant as `Symbol::max_length()` is fixed at 64K and will never change. src/hotspot/share/utilities/lineReader.cpp line 55: > 53: > 54: char* LineReader::read_line() { > 55: STATIC_ASSERT(0 < MAX_LEN && MAX_LEN <= INT_MAX); Given the doubling rule this should check `MAX_LEN <= INT_MAX/2` src/hotspot/share/utilities/lineReader.cpp line 74: > 72: // We have read something in previous loop iteration(s). Return that. > 73: // The next call to read_line() will return nullptr to indicate EOF. > 74: ++ _line_num; Style nit: no space between unary operator and its operand ------------- Marked as reviewed by dholmes (Reviewer).
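To illustrate why the review asks for `MAX_LEN <= INT_MAX/2` under a doubling rule, here is a minimal sketch of a line reader that grows its buffer around `fgets()`. It is not the `LineReader` from the PR: the class name `GrowingLineReader`, the initial capacity, and the `too_long()` accessor are assumptions made for this example, and lines are assumed not to contain embedded NUL bytes.

```cpp
#include <cassert>
#include <climits>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for the LineReader discussed above; not the HotSpot class.
class GrowingLineReader {
  static constexpr int INITIAL_LEN = 16;           // assumed starting capacity
  static constexpr int MAX_LEN = 4 * 1024 * 1024;  // 4M cap, as in the PR description
  // The capacity is doubled while it is below MAX_LEN, so 2 * capacity must fit in an int.
  static_assert(0 < MAX_LEN && MAX_LEN <= INT_MAX / 2, "doubling must not overflow int");

  std::FILE* _file;
  char* _buf;
  int _cap;
  int _line_num;
  bool _too_long;

 public:
  explicit GrowingLineReader(std::FILE* f)
      : _file(f), _buf(static_cast<char*>(std::malloc(INITIAL_LEN))),
        _cap(INITIAL_LEN), _line_num(0), _too_long(false) {
    assert(_file != nullptr && _buf != nullptr);
  }
  ~GrowingLineReader() { std::free(_buf); }
  GrowingLineReader(const GrowingLineReader&) = delete;
  GrowingLineReader& operator=(const GrowingLineReader&) = delete;

  int line_num() const { return _line_num; }
  bool too_long() const { return _too_long; }

  // Returns the next line without its trailing newline, or nullptr at EOF,
  // on allocation failure, or when a line exceeds the cap (then too_long() is true
  // and the rest of that line is left unread).
  char* read_line() {
    int len = 0;
    for (;;) {
      if (std::fgets(_buf + len, _cap - len, _file) == nullptr) {
        if (len == 0) return nullptr;  // EOF and nothing pending
        break;                         // EOF after a partial line: return what we have
      }
      len += static_cast<int>(std::strlen(_buf + len));
      if (len > 0 && _buf[len - 1] == '\n') {
        _buf[--len] = '\0';            // strip the newline
        break;
      }
      if (len < _cap - 1) break;       // short read without newline: last line of the file
      if (_cap >= MAX_LEN) {           // buffer is full and may not grow further
        _too_long = true;
        return nullptr;
      }
      int new_cap = _cap * 2;          // cannot overflow: _cap < MAX_LEN <= INT_MAX / 2
      if (new_cap > MAX_LEN) new_cap = MAX_LEN;
      char* grown = static_cast<char*>(std::realloc(_buf, new_cap));
      if (grown == nullptr) return nullptr;
      _buf = grown;
      _cap = new_cap;
    }
    ++_line_num;
    return _buf;
  }
};
```

With the weaker bound `MAX_LEN <= INT_MAX`, the expression `_cap * 2` in a reader like this could overflow before the clamp to `MAX_LEN` is applied, which appears to be the corner case the review comment is guarding against.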
PR Review: https://git.openjdk.org/jdk/pull/18669#pullrequestreview-1992759649 PR Review Comment: https://git.openjdk.org/jdk/pull/18669#discussion_r1560107907 PR Review Comment: https://git.openjdk.org/jdk/pull/18669#discussion_r1560114629 PR Review Comment: https://git.openjdk.org/jdk/pull/18669#discussion_r1560115794 From vlivanov at openjdk.org Wed Apr 10 23:34:42 2024 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Wed, 10 Apr 2024 23:34:42 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v2] In-Reply-To: References: <0cUurmXlMJ_B66Wy1umd2n4r9ve7_Q4WOU0ffMd8s5Y=.bbc93b65-382c-4139-aaec-cb835d94a06e@github.com> Message-ID: <70tJacyO4JrzIFiCc9rxO3P46sC5ieEJEdWbSRuKsaU=.73481033-6a81-4ced-a328-f82cd2794ac7@github.com> On Fri, 5 Apr 2024 12:17:17 GMT, Hamlin Li wrote: >> Hi, >> Can you help to review the patch? >> This pr is based on previous work and discussion in [pr 16234](https://github.com/openjdk/jdk/pull/16234), [pr 18294](https://github.com/openjdk/jdk/pull/18294). >> >> Compared with previous prs, the major change in this pr is to integrate the source of sleef (for the steps, please check `src/jdk.incubator.vector/linux/native/libvectormath/README`), rather than depend on external sleef things (header or lib) at build or run time. >> Besides this change, also modify the previous changes accordingly, e.g. remove some unnecessary files or changes, especially in the make dir of the jdk. >> >> Besides the code changes, one important task is to handle the legal process. >> >> Thanks! > > Hamlin Li has updated the pull request incrementally with two additional commits since the last revision: > > - disable unused-function warnings; add log msg > - minor Nice work, Hamlin and Xiaohong. I'm glad to see progress on incorporating SLEEF library into the JDK. (Somehow I missed all previous PRs you posted before.) I'm not a lawyer, so won't comment on 3rd party library sources under Boost Software License in OpenJDK.
From an engineering perspective, I believe that bundling vector math library with the JDK is the right thing to do, but it doesn't imply the sources should be part of JDK. There are already examples of optional dependencies on external native libraries in HotSpot (e.g., hsdis tool w/ binutils, capstone, and llvm backends). Speaking of HotSpot-specific changes, IMO it desperately needs a cross-platform interface between vector math libraries and JVM. Most of the changes in `StubGenerator` are library-specific and are irrelevant in the context of the JVM. I do see that you try to replicate SVML logic, but SVML support didn't set a precedent to follow here. For background, SVML stubs were initially contributed to Panama as assembly stubs statically linked into libjvm.so. It was acceptable for experimentation purposes, but not for mainline JDK (even for functionality in an incubating module). The compromise was to bundle the stubs as a dynamic library and link against them. And that's how it stayed until today. IMO in order to get SLEEF in, the interaction between JVM and backend native library should be unified. And it should affect both SLEEF and SVML stubs. In particular, I'd like to see all those named lookups go away from the JVM code. A single call into the library during compiler/VM initialization can produce a fully populated table of function pointers (`StubRoutines::_vector_[fd]_math` now) for C2 to use later. FTR there were other alternatives discussed (use Panama FFI or rewrite the stubs in Vector API itself). The latter (complete rewrite) is still something for a distant future, but Foreign Function API is public API now, so once it supports vector calling conventions, it should become fully capable of satisfying Vector API implementation needs to interact with vector math library. IMO that is what we should keep in mind when designing the new interface. There's no inherent need to keep vector stub support in the JVM.
Once Foreign Function API gains vector support, it should be replaced with a pure Java FFI-based implementation. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18605#issuecomment-2048599089 From cslucas at openjdk.org Wed Apr 10 23:49:48 2024 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Wed, 10 Apr 2024 23:49:48 GMT Subject: RFR: 8241503: C2: Share MacroAssembler between mach nodes during code emission [v12] In-Reply-To: References: Message-ID: On Wed, 10 Apr 2024 21:20:11 GMT, Boris Ulasevich wrote: >> Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix ARM32 AD file > > src/hotspot/cpu/arm/arm.ad line 1877: > >> 1875: %} >> 1876: >> 1877: // Pointer Immediate > > Why do you introduce immN operands for arm32? This was accidental. I'll remove it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16484#discussion_r1560174793 From cslucas at openjdk.org Wed Apr 10 23:53:00 2024 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Wed, 10 Apr 2024 23:53:00 GMT Subject: RFR: 8241503: C2: Share MacroAssembler between mach nodes during code emission [v13] In-Reply-To: References: Message-ID: > # Description > > Please review this PR with a patch to re-use the same C2_MacroAssembler object to emit all instructions in the same compilation unit. > > Overall, the change is pretty simple. However, due to the renaming of the variable to access C2_MacroAssembler, from `_masm.` to `masm->`, and also some method prototype changes, the patch became quite large. > > # Help Needed for Testing > > I don't have access to all platforms necessary to test this. I hope some other folks can help with testing on `S390`, `RISC-V` and `PPC`. 
Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: Remove unused operands in arm.ad ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16484/files - new: https://git.openjdk.org/jdk/pull/16484/files/693c7ef8..44e63ee0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16484&range=12 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16484&range=11-12 Stats: 30 lines in 1 file changed: 0 ins; 30 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16484.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16484/head:pull/16484 PR: https://git.openjdk.org/jdk/pull/16484 From duke at openjdk.org Wed Apr 10 23:59:41 2024 From: duke at openjdk.org (Volodymyr Paprotski) Date: Wed, 10 Apr 2024 23:59:41 GMT Subject: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v2] In-Reply-To: References: Message-ID: <48md2WEAhqPyuVf4AYOxBQDykUiOaEL0PQb-ki0_TYM=.6c25bf41-b0ae-49ec-b606-236deb4561e3@github.com> On Fri, 5 Apr 2024 09:17:18 GMT, Jatin Bhateja wrote: > Few early comments. > > Please update the copyright year of all the modified files. > > You can even consider splitting this into two patches, Java side changes in one and x86 optimized intrinsic in next one. Thanks Jatin, will fix! ------------- PR Comment: https://git.openjdk.org/jdk/pull/18583#issuecomment-2048618452 From sviswanathan at openjdk.org Thu Apr 11 00:33:48 2024 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Thu, 11 Apr 2024 00:33:48 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v7] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: On Mon, 8 Apr 2024 19:11:19 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. 
>> >> Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). >> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Add movq to locate_operand src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2497: > 2495: // > 2496: address StubGenerator::generate_unsafe_setmemory(const char *name, > 2497: address byte_fill_entry) { Need to add UnsafeSetMemoryMark on similar lines as UnsafeCopyMemoryMark to handle page error. src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2522: > 2520: #define rScratch3 r8 > 2521: #undef rScratch4 > 2522: #define rScratch4 r11 We could do this setup using const Register declaration instead of using #undef/#define pair. src/hotspot/share/opto/library_call.cpp line 4950: > 4948: > 4949: bool LibraryCallKit::inline_unsafe_setMemory() { > 4950: if (callee()->is_static()) return false; // caller must have the capability! Also need to return false if StubRoutines::unsafe_setmemory() == nullptr. 
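For illustration only, two of the suggestions above (typed `const Register` declarations in place of `#undef`/`#define` pairs, and bailing out of the intrinsic when no stub was generated) can be sketched as below. This is a minimal standalone model, not HotSpot code: `Register`, `g_unsafe_setmemory`, `unsafe_setmemory_stub()`, `scratch_encoding_sum()` and `can_inline_set_memory()` are hypothetical stand-ins invented for the sketch.

```cpp
#include <cassert>

// Hypothetical stand-in for a register type; HotSpot's real Register differs.
struct Register { int encoding; };
constexpr Register r8{8};
constexpr Register r11{11};

typedef const unsigned char* address;

// Hypothetical stand-in for StubRoutines::unsafe_setmemory().
// nullptr models a platform on which the stub was never generated.
static address g_unsafe_setmemory = nullptr;
static address unsafe_setmemory_stub() { return g_unsafe_setmemory; }

// Scoped, typed aliases instead of #undef/#define pairs: the names are
// visible only inside this function and are checked by the compiler.
static int scratch_encoding_sum() {
  const Register rScratch3 = r8;
  const Register rScratch4 = r11;
  return rScratch3.encoding + rScratch4.encoding;
}

// Models the early exits discussed for the intrinsic entry point.
static bool can_inline_set_memory(bool callee_is_static) {
  if (callee_is_static) return false;                    // check already in the patch
  if (unsafe_setmemory_stub() == nullptr) return false;  // suggested extra check
  return true;
}
```

The point of the second check is that the compiler should fall back to the non-intrinsic path on any platform where the stub address is null, rather than emitting a call to it.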
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1560201153 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1560202218 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1560194011 From sviswanathan at openjdk.org Thu Apr 11 00:47:56 2024 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Thu, 11 Apr 2024 00:47:56 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v7] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: On Mon, 8 Apr 2024 19:11:19 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. >> >> Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). >> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Add movq to locate_operand src/hotspot/cpu/x86/macroAssembler_x86.cpp line 5988: > 5986: movw(Address(to, 0), value); > 5987: addptr(to, 2); > 5988: subptr(count, 1<<(shift-1)); At line 5968 also we need the change from cmpl to cmpptr. 
cmpl(count, 2< 6048: vpbroadcastd(xtmp, xtmp, Assembler::AVX_512bit); > 6049: > 6050: subptr(count, 16 << shift); At line 6045 also the cmpl should change to cmpptr: cmpl(count, VM_Version::avx3_threshold()); ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1560205702 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1560213265 From kbarrett at openjdk.org Thu Apr 11 00:52:42 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 11 Apr 2024 00:52:42 GMT Subject: RFR: 8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v7] In-Reply-To: References: Message-ID: On Wed, 10 Apr 2024 12:15:34 GMT, Joachim Kern wrote: >> As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. Thus the old xlc toolchain was removed by [JDK-8327701](https://bugs.openjdk.org/browse/JDK-8327701). >> Now we also switch the HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc, removing the last xlc rudiment. >> This means merging the AIX specific content of utilities/globalDefinitions_xlc.hpp and utilities/compilerWarnings_xlc.hpp into the corresponding gcc files on the one side and removing the defined(TARGET_COMPILER_xlc) blocks in the code, because the defined(TARGET_COMPILER_gcc) blocks work out of the box for the new AIX compiler. >> The rest of the changes are needed because, with utilities/compilerWarnings_gcc.hpp in use, the compiler nags much more about ill-formatted printf > > Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: > > saver solution Looks good. ------------- Marked as reviewed by kbarrett (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/18536#pullrequestreview-1992964955 From dholmes at openjdk.org Thu Apr 11 01:12:44 2024 From: dholmes at openjdk.org (David Holmes) Date: Thu, 11 Apr 2024 01:12:44 GMT Subject: RFR: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 [v5] In-Reply-To: References: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> Message-ID: On Wed, 10 Apr 2024 07:46:00 GMT, Richard Reingruber wrote: >> David Holmes has updated the pull request incrementally with one additional commit since the last revision: >> >> Cleanup test leftovers > > src/hotspot/cpu/ppc/sharedRuntime_ppc.cpp line 1675: > >> 1673: __ mr(ex_oop, R3_RET); >> 1674: __ call_VM_leaf(CAST_FROM_FN_PTR(address, SharedRuntime::log_jni_monitor_still_held)); >> 1675: // Restore potentional return value > > Nit > Suggestion: > > // Restore potential return value Fixed ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18445#discussion_r1560283922 From sspitsyn at openjdk.org Thu Apr 11 01:18:57 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 11 Apr 2024 01:18:57 GMT Subject: Integrated: 8329432: PopFrame and ForceEarlyReturn functions should use JvmtiHandshake In-Reply-To: <5tcPHZX0nNTHbQqZfHRl2riTpJglQyGJ2hRJXyIMZPY=.4de7ac6d-dd84-4943-bab1-5dba67bf5cf0@github.com> References: <5tcPHZX0nNTHbQqZfHRl2riTpJglQyGJ2hRJXyIMZPY=.4de7ac6d-dd84-4943-bab1-5dba67bf5cf0@github.com> Message-ID: On Tue, 2 Apr 2024 00:22:28 GMT, Serguei Spitsyn wrote: > The internal JVM TI `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. This enhancement is to refactor JVM TI functions `PopFrame` and `ForceEarlyReturn` on the base of `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes. 
> > Testing: > > Ran mach5 tiers 1-6 This pull request has now been integrated. Changeset: 643dd48a Author: Serguei Spitsyn URL: https://git.openjdk.org/jdk/commit/643dd48a2aa05388c55fa728a22885540b967a05 Stats: 45 lines in 3 files changed: 15 ins; 20 del; 10 mod 8329432: PopFrame and ForceEarlyReturn functions should use JvmtiHandshake Reviewed-by: pchilanomate, lmesnik ------------- PR: https://git.openjdk.org/jdk/pull/18570 From dholmes at openjdk.org Thu Apr 11 01:27:15 2024 From: dholmes at openjdk.org (David Holmes) Date: Thu, 11 Apr 2024 01:27:15 GMT Subject: RFR: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 [v6] In-Reply-To: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> References: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> Message-ID: > The crux of the problem here is that the virtual thread code was not keeping the held-monitor-count and jni-monitor-count in sync under all conditions. So if a vthread acquired a monitor via JNI but failed to unlock it before terminating, the underlying platform thread's counts were out of sync and if it terminated we would trigger the assertion that checks for such things. > > The actual fix is very simple: we zero the platform thread's jni-monitor-count in `continuation_enter_cleanup` the same way we zero the held-monitor-count. In addition we apply the same `CheckJNICalls` check for this unbalanced locking and issue a warning in the virtual thread case. The fact that this happens in asm code complicates matters. > > The existing `JNIMonitor.java` test is greatly expanded to test these scenarios and check the unified logging output. > > Other minor changes involve expanding some of the other assertions relating to the two counts so we can detect a mismatch earlier without a need for the thread to terminate.
And the test that originally uncovered the problem (`GetOwnedMonitorInfoTest.java`) has some minor adjustments to enhance diagnostics. > > I've provided the fix for all architectures that support continuations: x64, aarch64, riscv and ppc. The latter both build okay in GHA but I can't actually test them with the updated test. So some assistance from RISCV folk (@robehn ?) and PPC folk (??) would be appreciated (otherwise any issues will have to be handled as follow-up fixes). > > The changes are structured so that there is no extra code executed in product builds unless `CheckJNICalls` is set. This means that product builds will not keep the JNI count in sync with the held count, unless `CheckJNICalls` is set. This could trip up a future logging entry or explicit check of the JNI count, but it is expected that these counts will be removed once ObjectMonitor usage will not force virtual thread pinning. > > Testing: > - regression test 10x on all x64 and aarch64 platforms > - tiers 1-4 > - GHA > > > Thanks to @pchilano for help working out the best form of the fix and the initial asm for x64. > > Thanks to @fbredber for the Aarch64 and RISCV asm code.
> > Thanks David Holmes has updated the pull request incrementally with one additional commit since the last revision: Fix typos, copyrights and add comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18445/files - new: https://git.openjdk.org/jdk/pull/18445/files/70f43301..4a9fc71f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18445&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18445&range=04-05 Stats: 6 lines in 3 files changed: 0 ins; 0 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/18445.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18445/head:pull/18445 PR: https://git.openjdk.org/jdk/pull/18445 From dholmes at openjdk.org Thu Apr 11 01:27:15 2024 From: dholmes at openjdk.org (David Holmes) Date: Thu, 11 Apr 2024 01:27:15 GMT Subject: RFR: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 [v4] In-Reply-To: References: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> Message-ID: On Tue, 9 Apr 2024 11:31:24 GMT, Richard Reingruber wrote: >> I've done basic smoke testing on PowerPC using QEMU. >> `JAVA_OPTIONS=-XX:+CheckJNICalls TEST=test/hotspot/jtreg/runtime/vthread/JNIMonitor/JNIMonitor.java >> ` passes ok. But it would be nice if @TheRealMDoerr or @reinrich could take it for a spin on real hardware. > >> I've done basic smoke testing on PowerPC using QEMU. `JAVA_OPTIONS=-XX:+CheckJNICalls TEST=test/hotspot/jtreg/runtime/vthread/JNIMonitor/JNIMonitor.java ` passes ok. But it would be nice if @TheRealMDoerr or @reinrich could take it for a spin on real hardware. > > Thanks for the pin. We will do that. Thanks for the review comments and testing of PPC @reinrich ! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18445#issuecomment-2048748647 From dholmes at openjdk.org Thu Apr 11 01:27:16 2024 From: dholmes at openjdk.org (David Holmes) Date: Thu, 11 Apr 2024 01:27:16 GMT Subject: RFR: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 [v5] In-Reply-To: <9jGVA_1nsHA24Yq-WEVWpF4jie0gFLQQCqcUf2xXW3M=.68e82b07-2c4e-4d48-9806-fac3c6f19cf7@github.com> References: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> <9jGVA_1nsHA24Yq-WEVWpF4jie0gFLQQCqcUf2xXW3M=.68e82b07-2c4e-4d48-9806-fac3c6f19cf7@github.com> Message-ID: On Wed, 10 Apr 2024 12:07:42 GMT, Fredrik Bredberg wrote: >> David Holmes has updated the pull request incrementally with one additional commit since the last revision: >> >> Cleanup test leftovers > > src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp line 1056: > >> 1054: __ cbz(rscratch1, L_skip_vthread_code); >> 1055: >> 1056: // Save return value potentially containing the exception oop in callee-saved R19 . > > Suggestion: > > // Save return value potentially containing the exception oop in callee-saved R19. Well spotted! 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18445#discussion_r1560303028 From dholmes at openjdk.org Thu Apr 11 01:27:16 2024 From: dholmes at openjdk.org (David Holmes) Date: Thu, 11 Apr 2024 01:27:16 GMT Subject: RFR: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 [v5] In-Reply-To: References: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> Message-ID: On Wed, 10 Apr 2024 08:46:20 GMT, Richard Reingruber wrote: >> David Holmes has updated the pull request incrementally with one additional commit since the last revision: >> >> Cleanup test leftovers > > src/hotspot/cpu/ppc/sharedRuntime_ppc.cpp line 1: > >> 1: /* > > Copyright header needs update I updated Oracle and SAP copyrights. > src/hotspot/cpu/ppc/sharedRuntime_ppc.cpp line 1673: > >> 1671: // Save return value potentially containing the exception oop >> 1672: Register ex_oop = R15_esp; // nonvolatile register >> 1673: __ mr(ex_oop, R3_RET); > > Please add `R15_esp` to the Kills section above in the header comment. Done > src/hotspot/cpu/ppc/sharedRuntime_ppc.cpp line 1680: > >> 1678: // For vthreads we have to explicitly zero the JNI monitor count of the carrier >> 1679: // on termination. The held count is implicitly zeroed below when we restore from >> 1680: // the parent held count (which has to be zero). > > This comment is not quite correct or a little imprecise, I found. > >> the parent held count (which has to be zero) > > I think technically the held count (_held_monitor_count) could be non-zero. Of course it would likely be bad if it was (holding a monitor while suspended is usually not good). > > I thought it was like this: > > > The JNI monitor count of the carrier thread is required to be zero when > switching to the virtual thread. Here we are switching back to the carrier. > We have to restore its JNI monitor count of zero. 
In the general case the parent held count might be non-zero, but with vthreads and pinning it has to be zero. What we need to do with this code is ensure the held-count and jni-count are in sync: held-count >= jni-count. If the held-count could be non-zero here then we would not know what value to set in the jni-count to make sure the relationship is correct. It is only because we know the held-count has to be zero that we know we have to set the jni-count to zero. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18445#discussion_r1560298440 PR Review Comment: https://git.openjdk.org/jdk/pull/18445#discussion_r1560295882 PR Review Comment: https://git.openjdk.org/jdk/pull/18445#discussion_r1560293758 From ddong at openjdk.org Thu Apr 11 02:32:44 2024 From: ddong at openjdk.org (Denghui Dong) Date: Thu, 11 Apr 2024 02:32:44 GMT Subject: RFR: 8326012: JFR: Event for time to safepoint [v9] In-Reply-To: References: <68hS0kQgtDIk4ioAJj_r0_GLT6h0lcif6Daj6WRwxlI=.40c2a6e7-70a8-4954-bcde-9318ee311028@github.com> Message-ID: On Tue, 20 Feb 2024 05:27:22 GMT, Denghui Dong wrote: >> There are now some JFR events related to safepoint. When time-to-safepoint (aka ttsp) is too long, these events could not be very helpful since based on them we cannot know which threads cause it and what those threads are doing. >> >> Users can use `-XX:+SafepointTimeout -XX:SafepointTimeoutDelay=100` to see the threads that don't reach safepoint in time but without stack traces. Using `-XX:+AbortVMOnSafepointTimeout` can capture the stack traces but it crashes the process, hence it's not sensible to enable the flag in production.
>> >> ~~This patch adds a new JFR event `EventSafepointTimeout` to record the threads that cause ttsp too long.~~ >> >> ~~This event includes two fields:~~ >> >> ~~- safepointId: the relevant safepoint id~~ >> ~~- timeExceeded: the amount of time exceeding `SafepointTimeoutDelay` used by the thread to reach safepoint~~ >> >> ~~In the current version, this event records the stack of those problematic threads when they finally reach safepoint. Hence, there is a bias, but it's still helpful to deduce the root place.~~ >> >> A better implementation is to record a more accurate stack, but this will increase complexity. At the same time, the native stack may also be important for this problem, but it is not currently supported by JFR. >> >> Any input would be greatly appreciated. >> >> Testing: jdk/jdk/jfr > > Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: > > delete _entries when disabled Gentle ping. Could anyone review this patch? ------------- PR Comment: https://git.openjdk.org/jdk/pull/17888#issuecomment-2048828012 From vlivanov at openjdk.org Thu Apr 11 02:32:49 2024 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Thu, 11 Apr 2024 02:32:49 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v9] In-Reply-To: References: Message-ID: <6l8nzOnx_pxin6AxZ8b7SN8smfnhfYcw1e74ausTP4c=.ae6152d1-2d5b-4bb3-9b66-e12afbe4ef1d@github.com> On Wed, 10 Apr 2024 15:41:41 GMT, Andrew Haley wrote: >> This PR is a redesign of subtype checking. >> >> The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. >> >> So what's changed, so that the old design should be replaced? >> >> Firstly, the computers of today aren't the computers of twenty years ago. 
It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. >> >> The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces). Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. >> >> Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch. >> >> However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions. >> >> The solution >> ------------ >> >> We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. >> >> We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bitmap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present.
If the bit is set, we have to do a little arithmetic and then consult the hash table. >> >> It works like th... > > Andrew Haley has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 78 commits: > > - JDK-8180450: secondary_super_cache does not scale well > - JDK-8180450: secondary_super_cache does not scale well > - JDK-8180450: secondary_super_cache does not scale well > - Merge branch 'clean' into JDK-8180450 > - InlineSecondarySupersTest is on by default. > - InlineSecondarySupersTest is on by default. > - JDK-8180450: secondary_super_cache does not scale well > - JDK-8180450: secondary_super_cache does not scale well > - JDK-8180450: secondary_super_cache does not scale well > - JDK-8180450: secondary_super_cache does not scale well > - ... and 68 more: https://git.openjdk.org/jdk/compare/b80ba085...8dc2ac13 I see new assertion failures on windows-x64 w/ generational ZGC (`-XX:+UseZGC -XX:+ZGenerational`): # Internal Error (...\src\hotspot\share\runtime\stubRoutines.cpp:250), pid=1140, tid=13168 # assert(code_size == 0 || buffer.insts_remaining() > 200) failed: increase _final_stubs_code_size V [jvm.dll+0xe2b0c1] initialize_stubs+0x211 (stubRoutines.cpp:250) V [jvm.dll+0xe2ae01] final_stubs_init+0x41 (stubRoutines.cpp:304) V [jvm.dll+0x78b830] init_globals2+0x70 (init.cpp:184) V [jvm.dll+0xeb62aa] Threads::create_vm+0x43a (threads.cpp:572) V [jvm.dll+0x8b4372] JNI_CreateJavaVM_inner+0x82 (jni.cpp:3581) V [jvm.dll+0x8b87bf] JNI_CreateJavaVM+0x1f (jni.cpp:3672) Looks like we are running out of space for stubs. 
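As a reading aid for the quoted description, the bitmap-plus-compressed-table lookup can be modelled in a few lines. This is a simplified sketch under stated assumptions, not the code in the PR: the hash function (`home_slot`), the use of integer ids in place of `Klass*`, and the names `SecondarySupersSketch`, `build` and `contains` are all invented here, and the real probing and layout details may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified model: ids stand in for Klass pointers.
struct SecondarySupersSketch {
  uint64_t bitmap = 0;           // bit s set => hash slot s is occupied
  std::vector<uint32_t> packed;  // occupied slots only, in slot order
};

// Assumed hash: top 6 bits of a multiplicative hash, giving a slot in 0..63.
static unsigned home_slot(uint32_t id) { return (id * 2654435761u) >> 26; }

// Portable population count (number of set bits).
static unsigned popcount64(uint64_t x) {
  unsigned n = 0;
  while (x != 0) { x &= x - 1; ++n; }
  return n;
}

// Build a 64-slot table with linear probing, then drop the empty slots.
static SecondarySupersSketch build(const std::vector<uint32_t>& ids) {
  assert(ids.size() <= 64);
  uint32_t slots[64] = {};
  uint64_t bitmap = 0;
  for (uint32_t id : ids) {
    unsigned s = home_slot(id);
    while (bitmap & (uint64_t{1} << s)) s = (s + 1) & 63;  // probe to next free slot
    slots[s] = id;
    bitmap |= uint64_t{1} << s;
  }
  SecondarySupersSketch t;
  t.bitmap = bitmap;
  for (unsigned s = 0; s < 64; ++s) {
    if (bitmap & (uint64_t{1} << s)) t.packed.push_back(slots[s]);
  }
  return t;
}

static bool contains(const SecondarySupersSketch& t, uint32_t id) {
  unsigned s = home_slot(id);
  for (unsigned probes = 0; probes < 64; ++probes) {
    const uint64_t bit = uint64_t{1} << s;
    if ((t.bitmap & bit) == 0) return false;  // clear bit: definitely absent
    // Index in the compressed array = number of occupied slots below s.
    const unsigned index = popcount64(t.bitmap & (bit - 1));
    if (t.packed[index] == id) return true;
    s = (s + 1) & 63;                         // collision: try the next slot
  }
  return false;
}
```

Because `packed` holds every entry with no gaps, a caller that ignores the bitmap and scans the array linearly still finds the same elements, which matches the compatibility property claimed in the description.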
------------- PR Comment: https://git.openjdk.org/jdk/pull/18309#issuecomment-2048828497 From pchilanomate at openjdk.org Thu Apr 11 03:33:41 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Thu, 11 Apr 2024 03:33:41 GMT Subject: RFR: 8329757: Crash with fatal error: DEBUG MESSAGE: Fast Unlock lock on stack In-Reply-To: References: Message-ID: On Wed, 10 Apr 2024 12:11:17 GMT, Axel Boldt-Christmas wrote: > `Deoptimization::relock_objects` may reorder locks within the `LockStack` which are added inside the same vframe. This can be handled by the interpreter but if OSR has occurred C2 may observe this invalid order in the `LockStack`, which breaks its assumption leading to incorrect behaviour. > > This patch functionally makes sure that the LockStack is always consistent by always inflating eliminated locks when `Deoptimization::relock_objects` is called. > > It also adds verification code which checks that the LockStack is consistent with the lock order observed inside the deoptimized vframes. > > Note: for leaf deoptimizations we have enough information to recreate a correct top of the LockStack with minimal inflations, however that should be a separate RFE. This only inflates eliminated locks so the worth of solving that may be minimal or even detrimental. > > Tests still running. Tier 1-5 done, Tier 6-7 running. Fix looks good to me. Looking at 8318895, it seems we missed this subtle OSR case when the fix was restricted to Unpack_none only. src/hotspot/share/runtime/deoptimization.cpp line 72: > 70: #include "runtime/fieldDescriptor.inline.hpp" > 71: #include "runtime/frame.inline.hpp" > 72: #include "runtime/globals.hpp" Is this extra include needed? src/hotspot/share/runtime/deoptimization.cpp line 96: > 94: #include "utilities/checkedCast.hpp" > 95: #include "utilities/events.hpp" > 96: #include "utilities/globalDefinitions.hpp" Same.
src/hotspot/share/runtime/lockStack.cpp line 119: > 117: if (_base[index] == obj) { > 118: // Found top index > 119: top_index = index + 1; Maybe add a break to make it more readable that we are exiting the loop once we find a match? src/hotspot/share/runtime/lockStack.cpp line 149: > 147: assert(!mark.is_fast_locked(), "must be inflated"); > 148: assert(mark.monitor()->owner_raw() == get_thread() || > 149: get_thread()->current_waiting_monitor() == mark.monitor(), We can add the !leaf_frame condition for the waiting monitor case. ------------- PR Review: https://git.openjdk.org/jdk/pull/18715#pullrequestreview-1993145581 PR Review Comment: https://git.openjdk.org/jdk/pull/18715#discussion_r1560389809 PR Review Comment: https://git.openjdk.org/jdk/pull/18715#discussion_r1560392161 PR Review Comment: https://git.openjdk.org/jdk/pull/18715#discussion_r1560337025 PR Review Comment: https://git.openjdk.org/jdk/pull/18715#discussion_r1560338726 From sspitsyn at openjdk.org Thu Apr 11 04:21:45 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 11 Apr 2024 04:21:45 GMT Subject: Integrated: 8329491: GetThreadListStackTraces function should use JvmtiHandshake In-Reply-To: <56L6f8XFyrB_cUSPTLWNIVhO0PU4w3PjRnpA5U7y_aI=.906bf099-af40-4192-a205-f84120e99ec8@github.com> References: <56L6f8XFyrB_cUSPTLWNIVhO0PU4w3PjRnpA5U7y_aI=.906bf099-af40-4192-a205-f84120e99ec8@github.com> Message-ID: On Tue, 2 Apr 2024 08:13:20 GMT, Serguei Spitsyn wrote: > The internal JVM TI `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. This enhancement is to refactor the JVM TI function `GetThreadListStackTraces` on the base of `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes. > > Testing: > - Ran mach5 tiers 1-6 This pull request has now been integrated. 
Changeset: 5e544f15 Author: Serguei Spitsyn URL: https://git.openjdk.org/jdk/commit/5e544f15100366f4e2db58cb0e28cdfd292fe35f Stats: 43 lines in 3 files changed: 17 ins; 19 del; 7 mod 8329491: GetThreadListStackTraces function should use JvmtiHandshake Reviewed-by: pchilanomate, lmesnik ------------- PR: https://git.openjdk.org/jdk/pull/18574 From amitkumar at openjdk.org Thu Apr 11 05:01:04 2024 From: amitkumar at openjdk.org (Amit Kumar) Date: Thu, 11 Apr 2024 05:01:04 GMT Subject: RFR: 8330008: [s390x] Test bit "in-memory" in case of DiagnoseSyncOnValueBasedClasses Message-ID: It's trivial update to use `testbit` method to test the bit "in-memory" ------------- Commit messages: - use testbit Changes: https://git.openjdk.org/jdk/pull/18709/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18709&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8330008 Stats: 4 lines in 1 file changed: 0 ins; 2 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/18709.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18709/head:pull/18709 PR: https://git.openjdk.org/jdk/pull/18709 From bulasevich at openjdk.org Thu Apr 11 05:08:48 2024 From: bulasevich at openjdk.org (Boris Ulasevich) Date: Thu, 11 Apr 2024 05:08:48 GMT Subject: RFR: 8241503: C2: Share MacroAssembler between mach nodes during code emission [v12] In-Reply-To: References: Message-ID: On Wed, 10 Apr 2024 23:47:32 GMT, Cesar Soares Lucas wrote: >> src/hotspot/cpu/arm/arm.ad line 1877: >> >>> 1875: %} >>> 1876: >>> 1877: // Pointer Immediate >> >> Why do you introduce immN operands for arm32? > > This was accidental. I'll remove it. Ok. The ARM32 changes look good to me now! 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16484#discussion_r1560439593 From aboldtch at openjdk.org Thu Apr 11 05:37:05 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 11 Apr 2024 05:37:05 GMT Subject: RFR: 8329757: Crash with fatal error: DEBUG MESSAGE: Fast Unlock lock on stack [v2] In-Reply-To: References: Message-ID: > `Deoptimization::relock_objects` may reorder locks within the `LockStack` which are added inside the same vframe. This can be handled by the interpreter but if OSR has occurred C2 may observe this invalid order in the `LockStack`, which breaks its assumption leading to incorrect behaviour. > > This patch functionally makes sure that the LockStack is always consistent by always inflating eliminated locks when `Deoptimization::relock_objects` is called. > > It also adds verification code which checks that the LockStack is consistent with the lock order observed inside the deoptimized vframes. > > Note: for leaf deoptimizations we have enough information to recreate a correct top of the LockStack with minimal inflations, however that should be a separate RFE. This only inflates eliminated locks so the worth of solving that may be minimal or even detrimental. > > Tests still running. Tier 1-5 done, Tier 6-7 running.
Axel Boldt-Christmas has updated the pull request incrementally with three additional commits since the last revision: - Drop includes - Strengthen waiting_monitor assert - Add explicit break ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18715/files - new: https://git.openjdk.org/jdk/pull/18715/files/e11d9b04..d2e8216d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18715&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18715&range=00-01 Stats: 4 lines in 2 files changed: 1 ins; 2 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18715.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18715/head:pull/18715 PR: https://git.openjdk.org/jdk/pull/18715 From aboldtch at openjdk.org Thu Apr 11 05:37:05 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 11 Apr 2024 05:37:05 GMT Subject: RFR: 8329757: Crash with fatal error: DEBUG MESSAGE: Fast Unlock lock on stack [v2] In-Reply-To: References: Message-ID: <-U08pv-xNCW8WtiIUHEapovxTatlCE8JPWCYSjf7g_4=.0cac4742-7c17-4123-84d4-718b44fa74e8@github.com> On Thu, 11 Apr 2024 03:25:36 GMT, Patricio Chilano Mateo wrote: >> Axel Boldt-Christmas has updated the pull request incrementally with three additional commits since the last revision: >> >> - Drop includes >> - Strengthen waiting_monitor assert >> - Add explicit break > > src/hotspot/share/runtime/deoptimization.cpp line 72: > >> 70: #include "runtime/fieldDescriptor.inline.hpp" >> 71: #include "runtime/frame.inline.hpp" >> 72: #include "runtime/globals.hpp" > > Is this extra include needed? Not with the current state of the codebase. 
Currently we get `"utilities/globalDefinitions.hpp"` and `"runtime/globals.hpp"` via the following include chains: #include "classfile/javaClasses.inline.hpp" #include "classfile/javaClasses.hpp" #include "classfile/vmClasses.hpp" #include "classfile/vmClassID.hpp" #include "utilities/enumIterator.hpp" #include "metaprogramming/primitiveConversions.hpp" #include "utilities/globalDefinitions.hpp" #include "runtime/handles.hpp" #include "memory/arena.hpp" #include "runtime/globals.hpp" But in general, especially for `.cpp` files, I like to include directly what is used (except for system headers like `cstdint` and `cstddef` for which I include `utilities/globalDefinitions.hpp`). All this is to have removals elsewhere not break the includes (it can obviously still break things due to updated include order if we do nasty things with the preprocessor etc.). However, I know that there is sometimes pushback on both `globals.hpp` and `globalDefinitions.hpp` because for almost any non-trivial compilation unit (that is not utility code) they will get included from some of its leaf dependencies. I'll drop these includes in this PR. But that is my rationale for including them. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18715#discussion_r1560454775 From fbredberg at openjdk.org Thu Apr 11 05:50:43 2024 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Thu, 11 Apr 2024 05:50:43 GMT Subject: RFR: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 [v6] In-Reply-To: References: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> Message-ID: On Thu, 11 Apr 2024 01:27:15 GMT, David Holmes wrote: >> The crux of the problem here is that the virtual thread code was not keeping the held-monitor-count and jni-monitor-count in sync under all conditions.
So if a vthread acquired a monitor via JNI but failed to unlock it before terminating, the underlying platform thread's counts were out of sync and if it terminated we would trigger the assertion that checks for such things. >> >> The actual fix is very simple: we zero the platform thread's jni-monitor-count in `continuation_enter_cleanup` the same way we zero the held-monitor-count. In addition we apply the same `CheckJNICalls` check for this unbalanced locking and issue a warning in the virtual thread case. The fact that this happens in asm code complicates matters. >> >> The existing `JNIMonitor.java` test is greatly expanded to test these scenarios and check the unified logging output. >> >> Other minor changes involve expanding some of the other assertions relating to the two counts so we can detect a mismatch earlier without a need for the thread to terminate. And the test that originally uncovered the problem (`GetOwnedMonitorInfoTest.java`) has some minor adjustments to enhance diagnostics. >> >> I've provided the fix for all architectures that support continuations: x64, aarch64, riscv and ppc. The latter both build okay in GHA but I can't actually test them with the updated test. So some assistance from RISCV folk (@robehn ?) and PPC folk (??) would be appreciated (otherwise any issues will have to be handled as follow up fixes). >> >> The changes are structured so that there is no extra code executed in product builds unless `CheckJNICalls` is set. This means that product builds will not keep the JNI count in sync with the held count, unless `CheckJNICalls` is set. This could trip up a future logging entry or explicit check of the JNI count, but it is expected that these counts will be removed once ObjectMonitor usage will not force virtual thread pinning. >> >> Testing: >> - regression test 10x on all x64 and aarch64 platforms >> - tiers 1-4 >> - GHA >> >> >> Thanks to @pchilano for help working out the best form of the fix and the initial asm for x64.
>> >> Thanks to @fbredber for the Aarch64 and RISCV asm code. >> >> Thanks > > David Holmes has updated the pull request incrementally with one additional commit since the last revision: > > Fix typos, copyrights and add comment Looks good to me. ------------- Marked as reviewed by fbredberg (Committer). PR Review: https://git.openjdk.org/jdk/pull/18445#pullrequestreview-1993325386 From stuefe at openjdk.org Thu Apr 11 06:07:44 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 11 Apr 2024 06:07:44 GMT Subject: RFR: 8323900: Avoid calling os::init_random() in CDS static dump In-Reply-To: References: Message-ID: On Wed, 10 Apr 2024 16:31:08 GMT, Ioi Lam wrote: > The purpose of the PR is to avoid modifying the global JVM state while dumping the CDS archive. > > When updating the identity hashcode for archived Symbols, call `ArchiveBuilder::current()->entropy()` instead of `os::random()`. As a result, CDS no longer needs to call `os::init_random()` with a deterministic seed. Looks good! ------------- Marked as reviewed by stuefe (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18728#pullrequestreview-1993342509 From syan at openjdk.org Thu Apr 11 06:26:41 2024 From: syan at openjdk.org (SendaoYan) Date: Thu, 11 Apr 2024 06:26:41 GMT Subject: RFR: 8327946: containers/docker/TestJFREvents.java fails when host kernel config vm.swappiness=0 after JDK-8325139 In-Reply-To: <5Q0X-rxAg9WKCnK-Qluu5hvyffsGwVgGJGRoA8XlBGs=.923c1bf8-e008-4af9-9929-6e5c1f2d5271@github.com> References: <5Q0X-rxAg9WKCnK-Qluu5hvyffsGwVgGJGRoA8XlBGs=.923c1bf8-e008-4af9-9929-6e5c1f2d5271@github.com> Message-ID: On Tue, 12 Mar 2024 09:06:45 GMT, SendaoYan wrote: > Hi, > > According to the [docker document](https://docs.docker.com/config/containers/resource_constraints/#--memory-swappiness-details), the default value of --memory-swappiness is inherited from the host machine. 
So, when the kernel config vm.swappiness=0 on the host machine, this testcase will fail, because the docker container cannot use swap memory: the default value of --memory-swappiness is 0. > > When the host kernel config is "vm.swappiness = 0", there are three ways to make this testcase pass: > > 1. change `.shouldContain("totalSize = " + expectedTotalValue)` to `.shouldContain("totalSize = "`, which ignores the `expectedTotalValue`, because the `expectedTotalValue` could be 0 (swap memory is disabled when --memory-swappiness=0) or 104857600 (300MB-200MB=100MB), depending on the host machine config `vm.swappiness` > 2. Change the default `--memory-swappiness` 0 to non-zero, such as 60. > 3. Change the host kernel config `vm.swappiness=0` to `vm.swappiness=60`. I think it's not a good idea. > > Maybe the 2nd method seems more reasonable. > > > Thanks, > -sendao Fix the testcase bug, the risk is low. Can anyone review this PR, thanks. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18225#issuecomment-2049000821 From stuefe at openjdk.org Thu Apr 11 06:42:46 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 11 Apr 2024 06:42:46 GMT Subject: RFR: 8321266: Add diagnostic RSS threshold [v3] In-Reply-To: References: Message-ID: <8jzpxthAd2ttVYNLj0u5Yx9IG3rLgGSkxXe7byC9X4o=.7991d130-aff2-4045-88b5-ef4a22a20bd7@github.com> On Wed, 6 Dec 2023 08:13:55 GMT, Thomas Stuefe wrote: >> We have `MallocLimit`, a way to trigger errors when reaching a given malloc load threshold. This PR proposes >> a complementary switch, `RSSLimit`, that does the same based on the Resident Set Size of the process. >> >> --- >> >> Motivation: >> >> The main usage for this option is to analyze OOM kills. OOM kills can happen at various layers: the process may be either killed by the kernel OOM killer, or the whole container may get scrapped if it uses too much memory.
>> >> One rarely has any information on the nature of the OOM, or if there even was one, and if yes, if the JVM was the culprit or just an innocent bystander. In these situations, getting a voluntary abort *before* the process gets killed from outside can give us valuable information. >> >> Another use of this feature can be testing: specifying an envelope of "reasonable" RSS for testing to check the expected footprint of the JVM. Also useful for a global test-wide setting to catch obvious footprint degradations early. >> >> Letting the JVM handle this Limit has many advantages: >> >> - since the limit is artificial, error reporting is not affected. Other mechanisms (e.g. ulimit) are likely to prevent effective error reporting. I usually get torn hs-err files when a limit restriction hits since error reporting needs dynamic memory (regrettably) and space on the stack to do its work. >> >> - Re-using the normal error reporting mechanism is powerful since: >> - hs-err files contain lots of information already: machine memory status, NMT summary, heap information etc. >> - Using `OnError`, that mechanism is expandable: we can run many further diagnostics like Metaspace or Compiler memory reports, detailed NMT reports, System memory maps, and even heap dumps. >> - Using `ErrorLogToStd(out|err)` will redirect the hs-err file and let us see what's happening in cloud situations where file systems are often ephemeral. >> >> ---- >> >> Usage: >> >> Limit is given either as an absolute number or as a relative percentage of the total memory of the machine or the container, e.g. >> `-XX:RssLimit=2G` or `-XX:RssLimit=80%`. >> >> If given as percent, JVM will also react to container limit updates. >> >> Example: we run the JVM inside a container as the sole payload process. 
Limit its RSS to 90% of the container limit, and in case we run into the limit, fire a heap dump: >> >> `java -XX:+UnlockDiagnosticVMOptions -XX:RssLimit=80% '-XX:OnError=jcmd %p GC.heap_dump my-dump' -Xlog:os+rss ` >> >> ---- >> >> Patch: >> >> Im... > > Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: > > Add specific percentage switch not yet bot ------------- PR Comment: https://git.openjdk.org/jdk/pull/16938#issuecomment-2049019890 From stefank at openjdk.org Thu Apr 11 07:32:43 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 11 Apr 2024 07:32:43 GMT Subject: RFR: 8326012: JFR: Event for time to safepoint [v9] In-Reply-To: References: <68hS0kQgtDIk4ioAJj_r0_GLT6h0lcif6Daj6WRwxlI=.40c2a6e7-70a8-4954-bcde-9318ee311028@github.com> Message-ID: On Tue, 20 Feb 2024 05:27:22 GMT, Denghui Dong wrote: >> There are now some JFR events related to safepoint. When time-to-safepoint (aka ttsp) is too long, these events could not be very helpful since based on them we cannot know which threads cause it and what those threads are doing. >> >> Users can use `-XX:+SafepointTimeout -XX:SafepointTimeoutDelay=100` to see the threads that don't reach safepoint in time but without stack traces. Using `-XX:+ AbortVMOnSafepointTimeout` can capture the stack traces but it crashes the process, hence it's not sensible to enable the flag in production. >> >> ~~This patch adds a new JFR event `EventSafepointTimeout` to record the threads that cause ttsp too long.~~ >> >> ~~This event includes two fields:~~ >> >> ~~- safepointId: the relevant safepoint id~~ >> ~~- timeExceeded: the amount of time exceeding `SafepointTimeoutDelay` used by the thread to reach safepoint~~ >> >> ~~In the current version, this event records the stack of those problematic threads when they finally reach safepoint. 
Hence, there is a bias, but it's still helpful to deduce the root place.~~ >> >> A better implementation is to record a more accurate stack, but this will increase complexity. At the same time, the native stack may also be important for this problem, but it is not currently supported by JFR. >> >> Any input would be greatly appreciated. >> >> Testing: jdk/jdk/jfr > > Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: > > delete _entries when disabled src/hotspot/share/jfr/support/jfrTimeToSafepoint.inline.hpp line 32: > 30: > 31: #include "runtime/safepoint.hpp" > 32: #include "runtime/vmThread.hpp" Suggestion: #include "jfr/support/jfrTimeToSafepoint.hpp" #include "jfr/jfrEvents.hpp" #include "jfr/recorder/jfrEventSetting.inline.hpp" #include "runtime/safepoint.hpp" #include "runtime/vmThread.hpp" src/hotspot/share/runtime/safepoint.cpp line 74: > 72: #include "utilities/systemMemoryBarrier.hpp" > 73: > 74: #if INCLUDE_JFR Blankline at 73 should be removed ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17888#discussion_r1560557994 PR Review Comment: https://git.openjdk.org/jdk/pull/17888#discussion_r1560559404 From ddong at openjdk.org Thu Apr 11 07:51:33 2024 From: ddong at openjdk.org (Denghui Dong) Date: Thu, 11 Apr 2024 07:51:33 GMT Subject: RFR: 8326012: JFR: Event for time to safepoint [v10] In-Reply-To: <68hS0kQgtDIk4ioAJj_r0_GLT6h0lcif6Daj6WRwxlI=.40c2a6e7-70a8-4954-bcde-9318ee311028@github.com> References: <68hS0kQgtDIk4ioAJj_r0_GLT6h0lcif6Daj6WRwxlI=.40c2a6e7-70a8-4954-bcde-9318ee311028@github.com> Message-ID: <0cAfLO-5gQvsN4voxFbx9eR0_ugZLo4A7gtAOTbQSTU=.28b1c7aa-72c6-4365-af9e-c49a11c3d4b2@github.com> > There are now some JFR events related to safepoint. When time-to-safepoint (aka ttsp) is too long, these events could not be very helpful since based on them we cannot know which threads cause it and what those threads are doing. 
> > Users can use `-XX:+SafepointTimeout -XX:SafepointTimeoutDelay=100` to see the threads that don't reach safepoint in time but without stack traces. Using `-XX:+ AbortVMOnSafepointTimeout` can capture the stack traces but it crashes the process, hence it's not sensible to enable the flag in production. > > ~~This patch adds a new JFR event `EventSafepointTimeout` to record the threads that cause ttsp too long.~~ > > ~~This event includes two fields:~~ > > ~~- safepointId: the relevant safepoint id~~ > ~~- timeExceeded: the amount of time exceeding `SafepointTimeoutDelay` used by the thread to reach safepoint~~ > > ~~In the current version, this event records the stack of those problematic threads when they finally reach safepoint. Hence, there is a bias, but it's still helpful to deduce the root place.~~ > > A better implementation is to record a more accurate stack, but this will increase complexity. At the same time, the native stack may also be important for this problem, but it is not currently supported by JFR. > > Any input would be greatly appreciated. 
> > Testing: jdk/jdk/jfr Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: update ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17888/files - new: https://git.openjdk.org/jdk/pull/17888/files/75ca854a..f5e25e38 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17888&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17888&range=08-09 Stats: 5 lines in 2 files changed: 2 ins; 3 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/17888.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17888/head:pull/17888 PR: https://git.openjdk.org/jdk/pull/17888 From ddong at openjdk.org Thu Apr 11 07:51:34 2024 From: ddong at openjdk.org (Denghui Dong) Date: Thu, 11 Apr 2024 07:51:34 GMT Subject: RFR: 8326012: JFR: Event for time to safepoint [v9] In-Reply-To: References: <68hS0kQgtDIk4ioAJj_r0_GLT6h0lcif6Daj6WRwxlI=.40c2a6e7-70a8-4954-bcde-9318ee311028@github.com> Message-ID: <1y6a-3k8aNA9LYSof4oPqEC0jRlyngeEQfBQQeGG1ss=.24e5a18a-3dbf-4db8-9260-10ac29ddc5e8@github.com> On Thu, 11 Apr 2024 07:29:18 GMT, Stefan Karlsson wrote: >> Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: >> >> delete _entries when disabled > > src/hotspot/share/jfr/support/jfrTimeToSafepoint.inline.hpp line 32: > >> 30: >> 31: #include "runtime/safepoint.hpp" >> 32: #include "runtime/vmThread.hpp" > > Suggestion: > > #include "jfr/support/jfrTimeToSafepoint.hpp" > > #include "jfr/jfrEvents.hpp" > #include "jfr/recorder/jfrEventSetting.inline.hpp" > #include "runtime/safepoint.hpp" > #include "runtime/vmThread.hpp" Updated. > src/hotspot/share/runtime/safepoint.cpp line 74: > >> 72: #include "utilities/systemMemoryBarrier.hpp" >> 73: >> 74: #if INCLUDE_JFR > > Blankline at 73 should be removed Removed. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17888#discussion_r1560587873 PR Review Comment: https://git.openjdk.org/jdk/pull/17888#discussion_r1560587955 From rrich at openjdk.org Thu Apr 11 08:33:43 2024 From: rrich at openjdk.org (Richard Reingruber) Date: Thu, 11 Apr 2024 08:33:43 GMT Subject: RFR: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 [v5] In-Reply-To: References: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> Message-ID: On Thu, 11 Apr 2024 01:15:26 GMT, David Holmes wrote: > In the general case the parent held count might be non-zero, but with vthreads and pinning it has to be zero. With Patricios work in the loom repo we can reach here also when the vthread owns a Java monitor. I see that if the monitor was entered using JNI that this still prevents context switching (https://github.com/openjdk/loom/blob/09e4329fea5e3908855fa0881f156b7fa300a533/src/hotspot/share/runtime/continuationFreezeThaw.cpp#L1839-L1840). I somehow didn't expect this but after thinking about it twice I think it makes sense. So reaching here with a non-zero JNI monitor count this means that the vthread is terminating. The following line could be improved to reflect this better if you want. 
If the held monitor count is > 0 and this vthread is terminating then ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18445#discussion_r1560641317 From rrich at openjdk.org Thu Apr 11 08:37:48 2024 From: rrich at openjdk.org (Richard Reingruber) Date: Thu, 11 Apr 2024 08:37:48 GMT Subject: RFR: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 [v6] In-Reply-To: References: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> Message-ID: On Thu, 11 Apr 2024 01:27:15 GMT, David Holmes wrote: >> The crux of the problem here is that the virtual thread code was not keeping the held-monitor-count and jni-monitor-count in sync under all conditions. So if a vthread acquired a monitor via JNI but failed to unlock it before terminating, the underlying platform thread's counts were out of sync and if it terminated we would trigger the assertion that checks for such things. >> >> The actual fix is very simple: we zero the platform thread's jni-monitor-count in `continuation_enter_cleanup` the same way we zero the held-monitor-count. In addition we apply the same `CheckJNICalls` check for this unbalanced locking and issue a warning in the virtual thread case. The fact that this happens in asm code complicates matters. >> >> The existing `JNIMonitor.java` test is greatly expanded to test these scenarios and check the unified logging output. >> >> Other minor changes involve expanding some of the other assertions relating to the two counts so we can detect a mismatch earlier without a need for the thread to terminate. And the test that originally uncovered the problem (`GetOwnedMonitorInfoTest.java`) has some minor adjustments to enhance diagnostics. >> >> I've provided the fix for all architectures that support continuations: x64, aarch64, riscv and ppc. The latter both build okay in GHA but I can't actually test them with the updated test.
So some assistance from RISCV folk (@robehn ?) and PPC folk (??) would be appreciated (otherwise any issues will have to be handled as follow up fixes >> >> The changes are structured so that there is no extra code executed in product builds unless `CheckJNICalls` is set. This means that product builds will not keep the JNI count in sync with the held count, unless `CheckJNICalls` is set. This could trip up a future logging entry or explicit check of the JNI count, but it is expected that these counts will be removed once ObjectMonitor usage will not force virtual thread pinning. >> >> Testing: >> - regression test 10x on all x64 and aarch64 platforms >> - tiers 1-4 >> - GHA >> >> >> Thanks to @pchilano for help working out the best form of the fix and the initial asm for x64. >> >> Thanks to @fbredber for the Aarch64 and RISCV asm code. >> >> Thanks > > David Holmes has updated the pull request incrementally with one additional commit since the last revision: > > Fix typos, copyrights and add comment Thanks for fixing this and taking care of ppc. ------------- Marked as reviewed by rrich (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18445#pullrequestreview-1993600918 From duke at openjdk.org Thu Apr 11 08:44:20 2024 From: duke at openjdk.org (kuaiwei) Date: Thu, 11 Apr 2024 08:44:20 GMT Subject: RFR: 8325821: [REDO] use "dmb.ishst+dmb.ishld" for release barrier [v5] In-Reply-To: References: Message-ID: > The origin patch for https://bugs.openjdk.org/browse/JDK-8324186 has 2 issues: > 1 It show regression in some platform, like Apple silicon in mac os > 2 Can not handle instruction sequence like "dmb.ishld; dmb.ishst; dmb.ishld; dmb.ishld" > > It can be fixed by: > 1 Enable AlwaysMergeDMB by default, only disable it in architecture we can see performance improvement (N1 or N2) > 2 Check the special pattern and merge the subsequent dmb. > > It also fix a bug when code buffer is expanding, st/ld/dmb can not be merged. I added unit tests for these. 
> > This patch still has a unhandled case. Insts like "dmb.ishld; dmb.ishst; dmb.ish", it will merge the last 2 instructions and can not merge all three. Because when emitting dmb.ish, if merge all previous dmbs, the code buffer will shrink the size. I think it may break some resumption and think it's not a common pattern. > > - Update: > After discussion, I made a new implementation based on finite state machine for merging instruction. The mergeable instruction will be pending in fsm until next unmergeable instruction. kuaiwei has updated the pull request incrementally with one additional commit since the last revision: Cleanup unused _last_label_code ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18467/files - new: https://git.openjdk.org/jdk/pull/18467/files/1a49c60c..57311189 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18467&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18467&range=03-04 Stats: 12 lines in 3 files changed: 0 ins; 9 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/18467.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18467/head:pull/18467 PR: https://git.openjdk.org/jdk/pull/18467 From aph at openjdk.org Thu Apr 11 09:11:42 2024 From: aph at openjdk.org (Andrew Haley) Date: Thu, 11 Apr 2024 09:11:42 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v2] In-Reply-To: References: <0cUurmXlMJ_B66Wy1umd2n4r9ve7_Q4WOU0ffMd8s5Y=.bbc93b65-382c-4139-aaec-cb835d94a06e@github.com> Message-ID: On Fri, 5 Apr 2024 12:17:17 GMT, Hamlin Li wrote: >> Hi, >> Can you help to review the patch? >> This pr is based on previous work and discussion in [pr 16234](https://github.com/openjdk/jdk/pull/16234), [pr 18294](https://github.com/openjdk/jdk/pull/18294). 
>> >> Compared with previous prs, the major change in this pr is to integrate the source of sleef (for the steps, please check `src/jdk.incubator.vector/linux/native/libvectormath/README`), rather than depending on external sleef things (header or lib) at build or run time. >> Besides this change, it also modifies the previous changes accordingly, e.g. removes some unnecessary files or changes, especially in the make dir of the jdk. >> >> Besides the code changes, one important task is to handle the legal process. >> >> Thanks! > > Hamlin Li has updated the pull request incrementally with two additional commits since the last revision: > > - disable unused-function warnings; add log msg > - minor > Nice work, Hamlin and Xiaohong. I'm glad to see progress on incorporating SLEEF library into the JDK. (Somehow I > From engineering perspective, I believe that bundling vector math library with the JDK is the right thing to do, but it doesn't imply the sources should be part of JDK. There are already examples of optional dependencies on external native libraries in HotSpot (e.g., hsdis tool w/ binutils, capstone, and llvm backends). No, it doesn't imply that the sources should be part of JDK, but practical reasons to do with the way that OpenJDK is built and shipped by various parties strongly suggest that we should integrate the SLEEF library into the JDK source tree. If we don't, there will be skew between OpenJDK versions shipped by different vendors. Also, I believe that there is less work for all of us if we integrate rather than having to communicate with everyone building the JDK. And finally, Mark Reinhold has stated that the JDK is not downstream of any other project.
------------- PR Comment: https://git.openjdk.org/jdk/pull/18605#issuecomment-2049256172 From sgehwolf at openjdk.org Thu Apr 11 09:35:44 2024 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Thu, 11 Apr 2024 09:35:44 GMT Subject: RFR: 8261242: [Linux] OSContainer::is_containerized() returns true when run outside a container In-Reply-To: References: Message-ID: On Mon, 11 Mar 2024 16:55:36 GMT, Severin Gehwolf wrote: > Please review this enhancement to the container detection code which allows it to figure out whether the JVM is actually running inside a container (`podman`, `docker`, `crio`), or with some other means that enforces memory/cpu limits by means of the cgroup filesystem. If neither of those conditions hold, the JVM runs in not containerized mode, addressing the issue described in the JBS tracker. For example, on my Linux system `is_containerized() == false" is being indicated with the following trace log line: > > > [0.001s][debug][os,container] OSContainer::init: is_containerized() = false because no cpu or memory limit is present > > > This state is being exposed by the Java `Metrics` API class using the new (still JDK internal) `isContainerized()` method. Example: > > > java -XshowSettings:system --version > Operating System Metrics: > Provider: cgroupv1 > System not containerized. > openjdk 23-internal 2024-09-17 > OpenJDK Runtime Environment (fastdebug build 23-internal-adhoc.sgehwolf.jdk-jdk) > OpenJDK 64-Bit Server VM (fastdebug build 23-internal-adhoc.sgehwolf.jdk-jdk, mixed mode, sharing) > > > The basic property this is being built on is the observation that the cgroup controllers typically get mounted read only into containers. Note that the current container tests assert that `OSContainer::is_containerized() == true` in various tests. Therefore, using the heuristic of "is any memory or cpu limit present" isn't sufficient. I had considered that in an earlier iteration, but many container tests failed. 
> > Overall, I think, with this patch we improve the current situation of claiming a containerized system being present when it's actually just a regular Linux system. > > Testing: > > - [x] GHA (risc-v failure seems infra related) > - [x] Container tests on Linux x86_64 of cgroups v1 and cgroups v2 (including gtests) > - [x] Some manual testing using cri-o > > Thoughts? Gentle ping. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18201#issuecomment-2049297883 From jkern at openjdk.org Thu Apr 11 09:48:56 2024 From: jkern at openjdk.org (Joachim Kern) Date: Thu, 11 Apr 2024 09:48:56 GMT Subject: RFR: 8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v8] In-Reply-To: References: Message-ID: <6-WH9_Hbwi2jmndNw-AfJUDEdEFd5wqtE1dDiB6lasQ=.d5f99193-75c8-4d50-8a77-7a7126bd5b2d@github.com> > As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. Thus the old xlc toolchain was removed by [JDK-8327701](https://bugs.openjdk.org/browse/JDK-8327701). > Now we also switch the HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc, removing the last xlc rudiment. > This means merging the AIX specific content of utilities/globalDefinitions_xlc.hpp and utilities/compilerWarnings_xlc.hpp into the corresponding gcc files on the on side and removing the defined(TARGET_COMPILER_xlc) blocks in the code, because the defined(TARGET_COMPILER_gcc) blocks work out of the box for the new AIX compiler. 
> The rest of the changes are needed because of using utilities/compilerWarnings_gcc.hpp the compiler is much more nagging about ill formatted printf Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: my_disclaim64 already removed by other PR ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18536/files - new: https://git.openjdk.org/jdk/pull/18536/files/a8d85924..030de164 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18536&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18536&range=06-07 Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/18536.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18536/head:pull/18536 PR: https://git.openjdk.org/jdk/pull/18536 From sgehwolf at openjdk.org Thu Apr 11 09:53:42 2024 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Thu, 11 Apr 2024 09:53:42 GMT Subject: RFR: 8327946: containers/docker/TestJFREvents.java fails when host kernel config vm.swappiness=0 after JDK-8325139 In-Reply-To: <5Q0X-rxAg9WKCnK-Qluu5hvyffsGwVgGJGRoA8XlBGs=.923c1bf8-e008-4af9-9929-6e5c1f2d5271@github.com> References: <5Q0X-rxAg9WKCnK-Qluu5hvyffsGwVgGJGRoA8XlBGs=.923c1bf8-e008-4af9-9929-6e5c1f2d5271@github.com> Message-ID: On Tue, 12 Mar 2024 09:06:45 GMT, SendaoYan wrote: > Hi, > > According to the [docker document](https://docs.docker.com/config/containers/resource_constraints/#--memory-swappiness-details), the default value of --memory-swappiness is inherited from the host machine. So, when the the kernel config vm.swappiness=0 on the host machine, this testcase will fail, because of docker container can not use swap memory, the deafult value of --memory-swappiness is 0. > > When the host kernel config "vm.swappiness = 0", In order to run this testcase passed , there are three methods: > > 1. 
change `.shouldContain("totalSize = " + expectedTotalValue)` to `.shouldContain("totalSize = "`, which ignores the `expectedTotalValue`, because the `expectedTotalValue` could be 0 (swap memory is disabled when --memory-swappiness=0) or 104857600 (300MB-200MB=100MB), depending on the host machine config `vm.swappiness` > 2. Change the default `--memory-swappiness` 0 to non-zero, such as 60. > 3. Change the host kernel config `vm.swappiness=0` to `vm.swappiness=60`. I think it's not a good idea. > > Maybe the 2nd method seems more reasonable. > > > Thanks, > -sendao `--memory-swappiness` is cgv1 specific and needs to be handled for the cgv2 case. test/hotspot/jtreg/containers/docker/TestJFREvents.java line 225: > 223: .addDockerOpts("--memory-swap=" + swapValueToSet) > 224: //The default memory-swappiness vaule is inherited from the host machine, which maybe 0 > 225: .addDockerOpts("--memory-swappiness=60") Nit: Space after `//`. `--memory-swappiness` is cgroup v1 (legacy) specific: $ podman run --rm -ti --memory-swappiness=60 fedora:39 Error: OCI runtime error: crun: cannot set memory swappiness with cgroupv2 Therefore, we need to ensure we are running on cgroups v1 when we add that option. ------------- Changes requested by sgehwolf (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18225#pullrequestreview-1993768230 PR Review Comment: https://git.openjdk.org/jdk/pull/18225#discussion_r1560753959 From mli at openjdk.org Thu Apr 11 10:36:03 2024 From: mli at openjdk.org (Hamlin Li) Date: Thu, 11 Apr 2024 10:36:03 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v3] In-Reply-To: <0cUurmXlMJ_B66Wy1umd2n4r9ve7_Q4WOU0ffMd8s5Y=.bbc93b65-382c-4139-aaec-cb835d94a06e@github.com> References: <0cUurmXlMJ_B66Wy1umd2n4r9ve7_Q4WOU0ffMd8s5Y=.bbc93b65-382c-4139-aaec-cb835d94a06e@github.com> Message-ID: > Hi, > Can you help to review the patch?
> This pr is based on previous work and discussion in [pr 16234](https://github.com/openjdk/jdk/pull/16234), [pr 18294](https://github.com/openjdk/jdk/pull/18294). > > Compared with previous prs, the major change in this pr is to integrate the source of sleef (for the steps, please check `src/jdk.incubator.vector/linux/native/libvectormath/README`), rather than depending on external sleef things (header or lib) at build or run time. > Besides this change, the previous changes are also modified accordingly, e.g. by removing some unnecessary files or changes, especially in the make dir of jdk. > > Besides the code changes, one important task is to handle the legal process. > > Thanks! Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: fix performance issue ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18605/files - new: https://git.openjdk.org/jdk/pull/18605/files/34529ff1..cd70f5a9 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18605&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18605&range=01-02 Stats: 4 lines in 4 files changed: 1 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/18605.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18605/head:pull/18605 PR: https://git.openjdk.org/jdk/pull/18605 From mli at openjdk.org Thu Apr 11 10:38:42 2024 From: mli at openjdk.org (Hamlin Li) Date: Thu, 11 Apr 2024 10:38:42 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v2] In-Reply-To: References: <0cUurmXlMJ_B66Wy1umd2n4r9ve7_Q4WOU0ffMd8s5Y=.bbc93b65-382c-4139-aaec-cb835d94a06e@github.com> Message-ID: On Tue, 9 Apr 2024 20:10:36 GMT, Mikael Vidstedt wrote: >> Hamlin Li has updated the pull request incrementally with two additional commits since the last revision: >> >> - disable unused-function warnings; add log msg >> - minor > > Thank you for the update and for working on this in general.
> > I've started working on JDK-8329816, preparing the change for the SLEEF specific part of the change. Specifically, I'm currently planning on including the three SLEEF header files, the README and a legal/sleef.md file in that change. Let me know if you have any thoughts/concerns. > > Also, just for my understanding, would love to understand your thoughts on the future here (I apologize if this was already discussed elsewhere): > > It seem like SLEEF is (sort of) limited to linux at this point (the SLEEF README mentions that "Due to limited test capacities, SLEEF is currently only officially supported on Linux with gcc or llvm/clang." ). That same README does, however, indicate good test coverage on several architectures in addition to aarch64 (including x86_64, PPC, RISC-V). With that in mind, it looks like we could potentially use SLEEF for other architectures on linux in the future? And potentially additional operating systems as well? Hey, @vidmik I've fixed the performance issue, and update the sleef inline headers and README. It's good for you to integrate these files via JDK-8329816. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18605#issuecomment-2049400793 From mli at openjdk.org Thu Apr 11 10:45:46 2024 From: mli at openjdk.org (Hamlin Li) Date: Thu, 11 Apr 2024 10:45:46 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v3] In-Reply-To: References: <0cUurmXlMJ_B66Wy1umd2n4r9ve7_Q4WOU0ffMd8s5Y=.bbc93b65-382c-4139-aaec-cb835d94a06e@github.com> Message-ID: On Thu, 11 Apr 2024 10:36:03 GMT, Hamlin Li wrote: >> Hi, >> Can you help to review the patch? >> This pr is based on previous work and discussion in [pr 16234](https://github.com/openjdk/jdk/pull/16234), [pr 18294](https://github.com/openjdk/jdk/pull/18294). 
>> >> Compared with previous prs, the major change in this pr is to integrate the source of sleef (for the steps, please check `src/jdk.incubator.vector/linux/native/libvectormath/README`), rather than depending on external sleef things (header or lib) at build or run time. >> Besides this change, the previous changes are also modified accordingly, e.g. by removing some unnecessary files or changes, especially in the make dir of jdk. >> >> Besides the code changes, one important task is to handle the legal process. >> >> Thanks! > > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > fix performance issue Thanks everyone for the discussion about the direction (integrate source or lib). We did have an implementation for integrating the sleef lib into jdk, but it seems the strongest opinion previously was to integrate the sleef source into jdk. I know there are pros and cons for every solution, but I will stick to the current solution unless everyone can reach another agreement. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18605#issuecomment-2049410731 From aboldtch at openjdk.org Thu Apr 11 11:01:10 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 11 Apr 2024 11:01:10 GMT Subject: RFR: 8329757: Crash with fatal error: DEBUG MESSAGE: Fast Unlock lock on stack [v3] In-Reply-To: References: Message-ID: <14wFn2FPMeokML2Uo-xi1YVB0wHJcw4KXDvGojouU9Q=.8b108c98-4967-4445-8012-c1eaa742a75b@github.com> > `Deoptimization::relock_objects` may reorder locks within the `LockStack` which are added inside the same vframe. This can be handled by the interpreter but if OSR has occurred C2 may observe this invalid order in the `LockStack`, which breaks its assumption leading to incorrect behaviour. > > This patch functionally makes sure that the LockStack is always consistent by always inflating eliminated locks when `Deoptimization::relock_objects` is called.
> > It also adds verification code which checks that the LockStack is consistent with the lock order observed inside the deoptimized vframes. > > Note: for leaf deoptimizations we have enough information to recreate a correct top of the LockStack with minimal inflations, however that should be a separate RFE. This only inflates eliminated locks so the worth of solving that may be minimal or even detrimental. > > Tests still running. Tier 1-5 done, Tier 6-7 running. Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: Include sort order ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18715/files - new: https://git.openjdk.org/jdk/pull/18715/files/d2e8216d..12d112bc Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18715&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18715&range=01-02 Stats: 6 lines in 1 file changed: 3 ins; 3 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18715.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18715/head:pull/18715 PR: https://git.openjdk.org/jdk/pull/18715 From mli at openjdk.org Thu Apr 11 11:31:44 2024 From: mli at openjdk.org (Hamlin Li) Date: Thu, 11 Apr 2024 11:31:44 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v3] In-Reply-To: References: <0cUurmXlMJ_B66Wy1umd2n4r9ve7_Q4WOU0ffMd8s5Y=.bbc93b65-382c-4139-aaec-cb835d94a06e@github.com> Message-ID: On Thu, 11 Apr 2024 10:36:03 GMT, Hamlin Li wrote: >> Hi, >> Can you help to review the patch? >> This pr is based on previous work and discussion in [pr 16234](https://github.com/openjdk/jdk/pull/16234), [pr 18294](https://github.com/openjdk/jdk/pull/18294). >> >> Compared with previous prs, the major change in this pr is to integrate the source of sleef (for the steps, please check `src/jdk.incubator.vector/linux/native/libvectormath/README`), rather than depends on external sleef things (header or lib) at build or run time. 
>> Besides this change, the previous changes are also modified accordingly, e.g. by removing some unnecessary files or changes, especially in the make dir of jdk.
>>
>> Besides the code changes, one important task is to handle the legal process.
>>
>> Thanks!
>>
>> ## Performance
>> NOTE:
>> * `Src` means the implementation in this pr, i.e. without dependency on external sleef.
>> * `Disabled` means intrinsics are disabled by `-XX:-UseVectorStubs`
>> * `system_sleef` means the implementation in [previous pr 18294](https://github.com/openjdk/jdk/pull/18294), i.e. build and run jdk with dependency on external sleef.
>>
>> Basically, the perf data below shows that
>> * this implementation has better performance than the previous version in [pr 18294](https://github.com/openjdk/jdk/pull/18294),
>> * and both sleef versions have much better performance compared with the non-sleef version.
>>
>> |Benchmark                     |(size)|Src      |Units|system_sleef|(system_sleef-Src)/Src|Disabled |(Disabled-Src)/Src|
>> |------------------------------|------|---------|-----|------------|----------------------|---------|------------------|
>> |3472:Double128Vector.ACOS     |1024  |8546.842 |ns/op|8516.007    |-0.004                |16799.273|0.966             |
>> |3473:Double128Vector.ASIN     |1024  |6864.656 |ns/op|6987.328    |0.018                 |16602.442|1.419             |
>> |3474:Double128Vector.ATAN     |1024  |11489.255|ns/op|12261.800   |0.067                 |26329.320|1.292             |
>> |3475:Double128Vector.ATAN2    |1024  |16661.170|ns/op|17234.472   |0.034                 |42084.100|1.526             |
>> |3476:Double128Vector.CBRT     |1024  |18999.387|ns/op|20298.458   |0.068                 |35998.688|0.895             |
>> |3477:Double128Vector.COS      |1024  |14081.857|ns/op|14846.117   |0.054                 |24420.692|0.734             |
>> |3478:Double128Vector.COSH     |1024  |12202.306|ns/op|12237.772   |0.003                 |21343.863|0.749             |
>> |3479:Double128Vector.EXP      |1024  |4553.108 |ns/op|4777.638 ...
> > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > fix performance issue I've also updated the pr description with performance data, it shows that * this implementation has better performance than previous version in https://github.com/openjdk/jdk/pull/18294, * and both sleef versions has much better performance compared with non-sleef version. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18605#issuecomment-2049482861 From sgehwolf at openjdk.org Thu Apr 11 12:08:02 2024 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Thu, 11 Apr 2024 12:08:02 GMT Subject: RFR: 8261242: [Linux] OSContainer::is_containerized() returns true when run outside a container [v2] In-Reply-To: References: Message-ID: > Please review this enhancement to the container detection code which allows it to figure out whether the JVM is actually running inside a container (`podman`, `docker`, `crio`), or with some other means that enforces memory/cpu limits by means of the cgroup filesystem. If neither of those conditions hold, the JVM runs in not containerized mode, addressing the issue described in the JBS tracker. For example, on my Linux system `is_containerized() == false" is being indicated with the following trace log line: > > > [0.001s][debug][os,container] OSContainer::init: is_containerized() = false because no cpu or memory limit is present > > > This state is being exposed by the Java `Metrics` API class using the new (still JDK internal) `isContainerized()` method. Example: > > > java -XshowSettings:system --version > Operating System Metrics: > Provider: cgroupv1 > System not containerized. 
> openjdk 23-internal 2024-09-17 > OpenJDK Runtime Environment (fastdebug build 23-internal-adhoc.sgehwolf.jdk-jdk) > OpenJDK 64-Bit Server VM (fastdebug build 23-internal-adhoc.sgehwolf.jdk-jdk, mixed mode, sharing) > > > The basic property this is being built on is the observation that the cgroup controllers typically get mounted read only into containers. Note that the current container tests assert that `OSContainer::is_containerized() == true` in various tests. Therefore, using the heuristic of "is any memory or cpu limit present" isn't sufficient. I had considered that in an earlier iteration, but many container tests failed. > > Overall, I think, with this patch we improve the current situation of claiming a containerized system being present when it's actually just a regular Linux system. > > Testing: > > - [x] GHA (risc-v failure seems infra related) > - [x] Container tests on Linux x86_64 of cgroups v1 and cgroups v2 (including gtests) > - [x] Some manual testing using cri-o > > Thoughts? Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
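A side note on the read-only observation in the PR description quoted above: the sketch below shows one way such a check could look when applied to lines in the `/proc/self/mountinfo` format. It is a simplified illustration in Java, not the HotSpot code from the patch; the class and method names are hypothetical, and it deliberately leaves out the fall-back to "is any cpu or memory limit present" that the commit list mentions for non-read-only controller mounts.

```java
import java.util.List;

// Hypothetical helper, not part of the JDK: classifies mountinfo lines only.
final class CgroupMountSketch {

    // True if the line describes a cgroup (v1) or cgroup2 (v2) filesystem.
    // In mountinfo, the filesystem type is the first field after the " - " separator.
    static boolean isCgroupMount(String line) {
        int sep = line.indexOf(" - ");
        if (sep < 0) {
            return false;
        }
        String fsType = line.substring(sep + 3).split(" ")[0];
        return fsType.equals("cgroup") || fsType.equals("cgroup2");
    }

    // True if the per-mount options (sixth field) contain "ro".
    static boolean isReadOnly(String line) {
        String[] fields = line.split(" ");
        if (fields.length < 6) {
            return false;
        }
        for (String opt : fields[5].split(",")) {
            if (opt.equals("ro")) {
                return true;
            }
        }
        return false;
    }

    // Assumption of this sketch: "containerized" means at least one cgroup
    // mount exists and every cgroup mount is read-only.
    static boolean looksContainerized(List<String> mountinfo) {
        boolean sawCgroup = false;
        for (String line : mountinfo) {
            if (!isCgroupMount(line)) {
                continue;
            }
            sawCgroup = true;
            if (!isReadOnly(line)) {
                return false;
            }
        }
        return sawCgroup;
    }

    public static void main(String[] args) {
        // Made-up sample lines in mountinfo format.
        String host = "35 24 0:30 / /sys/fs/cgroup rw,nosuid,nodev,noexec,relatime shared:4 - cgroup2 cgroup2 rw,nsdelegate";
        String container = "1234 1200 0:30 / /sys/fs/cgroup ro,nosuid,nodev,noexec,relatime - cgroup2 cgroup rw";
        System.out.println("host sample: " + looksContainerized(List.of(host)));
        System.out.println("container sample: " + looksContainerized(List.of(container)));
    }
}
```

Keeping the classification a pure function over lines makes it testable without a container runtime; a real check would read the lines from `/proc/self/mountinfo`.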
The pull request contains ten additional commits since the last revision: - Merge branch 'master' into jdk-8261242-is-containerized-fix - jcheck fixes - Fix tests - Implement Metrics.isContainerized() - Some clean-up - Drop cgroups testing on plain Linux - Implement fall-back logic for non-ro controller mounts - Make find_ro static and local to compilation unit - 8261242: [Linux] OSContainer::is_containerized() returns true ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18201/files - new: https://git.openjdk.org/jdk/pull/18201/files/98325f18..0df26ebd Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18201&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18201&range=00-01 Stats: 407791 lines in 3887 files changed: 43423 ins; 33650 del; 330718 mod Patch: https://git.openjdk.org/jdk/pull/18201.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18201/head:pull/18201 PR: https://git.openjdk.org/jdk/pull/18201 From eosterlund at openjdk.org Thu Apr 11 13:23:45 2024 From: eosterlund at openjdk.org (Erik Österlund) Date: Thu, 11 Apr 2024 13:23:45 GMT Subject: RFR: 8329488: Move OopStorage code from safepoint cleanup and remove safepoint cleanup code [v4] In-Reply-To: References: Message-ID: On Tue, 9 Apr 2024 17:40:14 GMT, Coleen Phillimore wrote: >> This patch gives the ServiceThread a periodic wakeup (same as GuaranteedSafepointInterval) to check if it needs to clean out OopStorage blocks, and move the triggering of this cleaning out of the safepoint cleanup tasks. Since ICBuffer, StringTable and SymbolTable rehashing have moved, there's nothing that actually triggers the nop safepoint to do cleaning (except SafepointALot), so the OopStorage cleanup won't be triggered. >> >> With moving all of these out of the safepoint cleanup tasks, we can remove the code that sets up multiple threads to do safepoint cleanup. We can also remove the JFR events and logging that times safepoint cleanup, and a logging test.
>> >> Tested with tier1-4. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > More comment updates. Marked as reviewed by eosterlund (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18375#pullrequestreview-1994173350 From tstuefe at redhat.com Thu Apr 11 13:24:18 2024 From: tstuefe at redhat.com (Thomas Stuefe) Date: Thu, 11 Apr 2024 15:24:18 +0200 Subject: CFV: New HotSpot Group Member: Andrew Dinn Message-ID: Hi, I hereby nominate Andrew Dinn (adinn) to Membership in the HotSpot Group. Andrew is a well-known and respected member of the OpenJDK community. He has been a contributor since the early days of OpenJDK. The history of his contributions has been mangled by various SCM moves and repo consolidations over the years [1], but he was one of the original authors of the arm64 port ([2] shows 359 changes in the mercurial hotspot sub repository alone), contributed JEP 352 (support for NVM devices under byte buffers), and more recently has been active in the Graal and the Leyden projects. Votes are due by April 25, 2024. Only current Members of the HotSpot Group [3] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. For Lazy Consensus voting instructions, see [4]. Cheers, Thomas [1] https://github.com/openjdk/jdk/commits/master/?author=adinn [2] https://hg.openjdk.org/aarch64-port/jdk7u/hotspot [3] https://openjdk.org/census#members [4] https://openjdk.org/groups/#member-vote
From eosterlund at openjdk.org Thu Apr 11 13:27:42 2024 From: eosterlund at openjdk.org (Erik Österlund) Date: Thu, 11 Apr 2024 13:27:42 GMT Subject: RFR: 8329088: Stack chunk thawing races with concurrent GC stack iteration [v2] In-Reply-To: <-GWGR7FPUMDBs4qvaDfnDR6Jfq9QKsQxNorjln_n-Ns=.16653cde-4cc5-4978-a385-41ebcc8e49c2@github.com> References: <-GWGR7FPUMDBs4qvaDfnDR6Jfq9QKsQxNorjln_n-Ns=.16653cde-4cc5-4978-a385-41ebcc8e49c2@github.com> Message-ID: On Mon, 8 Apr 2024 14:59:47 GMT, Patricio Chilano Mateo wrote: > > In the new model, is_empty() is true iff sp and bottom are exactly the same. Bottom is only set during freezing, never during thawing. The bottom is initialized whenever the bottom frame is frozen, and left untouched during thawing. Unlike thawing, the freeze operation does not race with the GC by design. Hence we have moved one of the racy mutations to the operation that doesn't race with the GC. The GC is now only exposed to changing sp(). It doesn't matter if it observes the old or new sp(), now that we have removed the only source of inconsistency describing said frame (racing argsize). > > So if the race happens only when resetting the stackChunk values when thawing the last frame, wouldn't it be enough to avoid clearing the argsize there? Because if we read the new sp when creating the stack frame iterator, regardless of the argsize value read, is_done() will be true so we won't iterate any frame. I'm trying to understand if the new model is needed to fix the race or that is part of a cleanup/refactoring. I thought about going in that direction as well. But in the end, I found it to be more direct and easy to understand if the property being tracked directly in the stack chunk is bottom, and argsize is the computed property, as opposed to the other way around.
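To make the "bottom is tracked, argsize is computed" idea concrete, here is a deliberately tiny Java model. It is only an illustration of the reasoning above, not the HotSpot `stackChunkOop` code: the class, the field layout, and the formula `argsize = stackSize - bottom` are assumptions chosen to keep the example small.

```java
// Hypothetical model of a stack chunk; sizes are in words, layout is made up.
final class StackChunkModel {
    private final int stackSize; // total words in the chunk
    private volatile int sp;     // start of the frozen frames; the only field thawing changes
    private int bottom;          // end of the frozen frames; written only while freezing

    StackChunkModel(int stackSize) {
        this.stackSize = stackSize;
        this.bottom = stackSize;
        this.sp = stackSize;     // sp == bottom, so the chunk starts out empty
    }

    // Freezing is assumed not to race with readers, so it may update both fields.
    void freeze(int frameWords, int argWords) {
        bottom = stackSize - argWords;
        sp = bottom - frameWords;
    }

    // Thawing everything touches sp only; bottom (and thus argsize) stays as frozen.
    void thawAll() {
        sp = bottom;
    }

    boolean isEmpty() {
        return sp == bottom;
    }

    // Derived on demand instead of stored, so there is no separate field to reset.
    int argsize() {
        return stackSize - bottom;
    }

    public static void main(String[] args) {
        StackChunkModel chunk = new StackChunkModel(100);
        chunk.freeze(40, 4);
        System.out.println("empty after freeze: " + chunk.isEmpty());
        chunk.thawAll();
        System.out.println("empty after thaw: " + chunk.isEmpty() + ", argsize: " + chunk.argsize());
    }
}
```

In this model a concurrent reader that samples `sp` either before or after `thawAll()` sees a consistent chunk, because nothing else that describes the frames changes during the thaw.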
------------- PR Comment: https://git.openjdk.org/jdk/pull/18643#issuecomment-2049689256 From eosterlund at openjdk.org Thu Apr 11 13:27:42 2024 From: eosterlund at openjdk.org (Erik Österlund) Date: Thu, 11 Apr 2024 13:27:42 GMT Subject: RFR: 8329088: Stack chunk thawing races with concurrent GC stack iteration [v2] In-Reply-To: References: <-GWGR7FPUMDBs4qvaDfnDR6Jfq9QKsQxNorjln_n-Ns=.16653cde-4cc5-4978-a385-41ebcc8e49c2@github.com> Message-ID: <987CItnCGZYUIogG0P6L0aY5pqw-SqBx-VEpKzy5qGk=.244d30bf-a6d6-4c6e-b806-5deae22cb3ae@github.com> On Mon, 8 Apr 2024 15:01:01 GMT, Patricio Chilano Mateo wrote: > > Unlike thawing, the freeze operation does not race with the GC by design. > > Is this with the changes in the allocation code in this patch or even before those there was no race? That was the case before this change as well. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18643#issuecomment-2049689824 From ChrisPhi at LGonQn.Org Thu Apr 11 13:43:26 2024 From: ChrisPhi at LGonQn.Org (Chris Phillips) Date: Thu, 11 Apr 2024 09:43:26 -0400 Subject: CFV: New HotSpot Group Member: Andrew Dinn In-Reply-To: References: Message-ID: Hi, Vote: yes Cheers! Chris PS: I would have expected Andrew Dinn to have been a HotSpot Group member from long ago. I almost remember voting him in? On 2024-04-11 09:24, Thomas Stuefe wrote: > Hi, > > I hereby nominate Andrew Dinn (adinn) to Membership in the HotSpot Group. > > Andrew is a well-known and respected member of the OpenJDK community. He > has been a contributor since the early days of OpenJDK. > > The history of his contributions has been mangled by various SCM moves > and repo consolidations over the years [1], but he was one of the > original authors of the arm64 port ([2] shows 359 changes in the > mercurial hotspot sub repository alone), contributed JEP 352 (support > for NVM devices under byte buffers), and more recently has been active > in the Graal and the Leyden projects.
> > Votes are due by April 25, 2024. > > Only current Members of the HotSpot Group [3] are eligible to vote on > this nomination. Votes must be cast in the open by replying to this > mailing list. > > For Lazy Consensus voting instructions, see [4]. > > Cheers, Thomas > > [1]https://github.com/openjdk/jdk/commits/master/?author=adinn > > [2]https://hg.openjdk.org/aarch64-port/jdk7u/hotspot > > [3]https://openjdk.org/census#members > [4]https://openjdk.org/groups/#member-vote > > From maxim.kartashev at jetbrains.com Thu Apr 11 14:05:44 2024 From: maxim.kartashev at jetbrains.com (Maxim Kartashev) Date: Thu, 11 Apr 2024 18:05:44 +0400 Subject: RFO: a tool to analyze HotSpot fatal error logs Message-ID: Hello, I am writing to inquire about the potential interest of the people involved in inspecting HotSpot crashes in a tool aimed at facilitating that inspection. We at JetBrains have developed an internal plugin that helps both with filtering through dozens of reports quickly in order to find a pattern and for diving deep into a particular crash. In addition to the "standard" features such as syntax highlighting, folding, and structural navigation, it will * highlight potential problems such as overloaded CPU, low physical memory, the presence of OOME in the recent exceptions, LD_LIBRARY_PATH being set, etc, * generate an "executive summary" for a high-level overview, for example, by front-line support, * pop up a tooltip for any recognized address describing its origin (for example, if it belongs to some thread's stack, the Java heap, a register, or a memory-mapped region), * provide the ability to highlight all addresses "near" the selected address, including registers, threads, and memory-mapped regions. If there is sufficient interest in creating a public and/or open-source variant of this internal plugin, I will pitch the idea to my employer. It shouldn't be too much work to create a public version. Kind regards, Maxim.
References: * https://docs.oracle.com/javase/10/troubleshoot/fatal-error-log.htm -------------- next part -------------- An HTML attachment was scrubbed... URL: From aph at openjdk.org Thu Apr 11 14:21:48 2024 From: aph at openjdk.org (Andrew Haley) Date: Thu, 11 Apr 2024 14:21:48 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v9] In-Reply-To: References: Message-ID: On Wed, 10 Apr 2024 15:41:41 GMT, Andrew Haley wrote: >> This PR is a redesign of subtype checking. >> >> The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. >> >> So what's changed, so that the old design should be replaced? >> >> Firstly, the computers of today aren't the computers of twenty years ago. It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. >> >> The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces). Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. >> >> Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch. >> >> However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions. 
>> >> The solution >> ------------ >> >> We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. >> >> We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bitmap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. If the bit is set, we have to do a little arithmetic and then consult the hash table. >> >> It works like th... > > Andrew Haley has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 78 commits: > > - JDK-8180450: secondary_super_cache does not scale well > - JDK-8180450: secondary_super_cache does not scale well > - JDK-8180450: secondary_super_cache does not scale well > - Merge branch 'clean' into JDK-8180450 > - InlineSecondarySupersTest is on by default. > - InlineSecondarySupersTest is on by default. > - JDK-8180450: secondary_super_cache does not scale well > - JDK-8180450: secondary_super_cache does not scale well > - JDK-8180450: secondary_super_cache does not scale well > - JDK-8180450: secondary_super_cache does not scale well > - ... and 68 more: https://git.openjdk.org/jdk/compare/b80ba085...8dc2ac13 > I see new assertion failures on windows-x64 w/ generational ZGC (`-XX:+UseZGC -XX:+ZGenerational`): > > Looks like we are running out of space for stubs.
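For readers following the quoted PR description, the compressed table plus occupancy bitmap can be modelled in a few lines of Java. This is a minimal sketch under stated assumptions, not the HotSpot implementation (which lives in C++ and generated assembly): the class name, the `homeSlot` hash, and the use of `equals` in place of a pointer comparison are all hypothetical. The "little arithmetic" is modelled here as a population count of the bitmap bits below the slot, which gives the index into the packed array.

```java
import java.util.List;

// Hypothetical model of a 64-slot hashed secondary-supers table.
final class SecondarySupersSketch {
    static final int SLOTS = 64;

    private final long bitmap;     // bit i set: uncompressed slot i is occupied
    private final Object[] packed; // occupied slots in slot order, nulls removed

    SecondarySupersSketch(List<?> supers) {
        if (supers.size() > SLOTS) {
            throw new IllegalArgumentException("sketch handles at most 64 entries");
        }
        Object[] table = new Object[SLOTS];
        long bits = 0L;
        for (Object s : supers) {
            int slot = homeSlot(s);
            while (table[slot] != null) {        // linear probing on collision
                slot = (slot + 1) & (SLOTS - 1);
            }
            table[slot] = s;
            bits |= 1L << slot;
        }
        // Compress: drop the null entries but keep slot order.
        Object[] compressed = new Object[Long.bitCount(bits)];
        int next = 0;
        for (Object entry : table) {
            if (entry != null) {
                compressed[next++] = entry;
            }
        }
        bitmap = bits;
        packed = compressed;
    }

    // Made-up hash for the sketch: top six bits of a multiplicative hash.
    static int homeSlot(Object o) {
        return (o.hashCode() * 0x9E3779B9) >>> 26;
    }

    boolean contains(Object candidate) {
        int slot = homeSlot(candidate);
        for (int probes = 0; probes < SLOTS; probes++) {
            long bit = 1L << slot;
            if ((bitmap & bit) == 0) {
                return false;                    // clear bit: definitely absent
            }
            // Index into the packed array = number of occupied slots below this one.
            int index = Long.bitCount(bitmap & (bit - 1));
            if (packed[index].equals(candidate)) {
                return true;
            }
            slot = (slot + 1) & (SLOTS - 1);     // collision: try the next slot
        }
        return false;                            // table full and no match
    }

    public static void main(String[] args) {
        SecondarySupersSketch supers = new SecondarySupersSketch(
            List.of(Runnable.class, Comparable.class, java.io.Serializable.class));
        System.out.println("Comparable present: " + supers.contains(Comparable.class));
        System.out.println("Cloneable present: " + supers.contains(Cloneable.class));
    }
}
```

A negative lookup usually ends at the first bit test, which is the case the linear scan handled worst; the packed array still holds the same elements, so code that simply scans it linearly keeps working.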
The additional size allowance for ZGC stubs is too low, even on Linux, where it actually needs an extra 21088 bytes. I'll bump the addition to 24000. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18309#issuecomment-2049804245 From aph at openjdk.org Thu Apr 11 14:27:44 2024 From: aph at openjdk.org (Andrew Haley) Date: Thu, 11 Apr 2024 14:27:44 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v9] In-Reply-To: References: Message-ID: On Wed, 10 Apr 2024 19:04:27 GMT, Vladimir Ivanov wrote: > I'm happy with the current state of the patch. Thanks a lot for incorporating the changes I proposed. I find it easier to reason about the implementation now and hope it'll help others navigating in the code. > > > ... new lookup code is substantially larger than before, particularly for x86, and this might in some cases change inlining behaviour and cause regressions. To ameliorate that I've added another option, -XX:-InlineSecondarySupersTest, which generates stubs for the search code. While that does solve the code expansion problem, the additional call&return overhead doubles the time for each lookup, so I'm reluctant to recommend it for general use. > > Alternatively, C2 inlining heuristics can be taught to discount inlined part of secondary supers lookup in generated code. It was the remediation chosen [1] for regressions introduced by post-call nops (part of Loom support). Oh, cool. It feels a bit like cheating, although it does make sense. It's probably not right for backports, though. > I checked that CDS support works fine with the latest PR. OK, good > I took a look at transitive interface sharing and it turned out to be a bit more complicated than a SA-specific test issue. OK, thanks. I'll leave that one for now. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18309#issuecomment-2049817606 From daniel.daugherty at oracle.com Thu Apr 11 14:29:26 2024 From: daniel.daugherty at oracle.com (daniel.daugherty at oracle.com) Date: Thu, 11 Apr 2024 10:29:26 -0400 Subject: CFV: New HotSpot Group Member: Andrew Dinn In-Reply-To: References: Message-ID: <2117fb5a-661f-49af-8d31-86c3743c1425@oracle.com> Vote: yes Dan On 4/11/24 9:24 AM, Thomas Stuefe wrote: > Hi, > > I hereby nominate Andrew Dinn (adinn) to Membership in the HotSpot Group. > > Andrew is a well-known and respected member of the OpenJDK community. > He has been a contributor since the early days of OpenJDK. > > The history of his contributions has been mangled by various SCM moves > and repo consolidations over the years [1], but he was one of the > original authors of the arm64 port ([2] shows 359 changes in the > mercurial hotspot sub repository alone), contributed JEP 352 (support > for NVM devices under byte buffers), and more recently has been active > in the Graal and the Leyden projects. > > Votes are due by April 25, 2024. > > Only current Members of the HotSpot Group [3] are eligible to vote on > this nomination. Votes must be cast in the open by replying to this > mailing list. > > For Lazy Consensus voting instructions, see [4]. > > Cheers, Thomas > > [1]https://github.com/openjdk/jdk/commits/master/?author=adinn > [2]https://hg.openjdk.org/aarch64-port/jdk7u/hotspot > [3]https://openjdk.org/census#members > [4]https://openjdk.org/groups/#member-vote From martin.doerr at sap.com Thu Apr 11 14:31:17 2024 From: martin.doerr at sap.com (Doerr, Martin) Date: Thu, 11 Apr 2024 14:31:17 +0000 Subject: CFV: New HotSpot Group Member: Andrew Dinn In-Reply-To: References: Message-ID: Vote: yes Best regards, Martin From: hotspot-dev on behalf of Thomas Stuefe Date: Thursday, 11 April 2024 at 15:30 To: hotspot-dev at openjdk.org Subject: CFV: New HotSpot Group Member: Andrew Dinn
Hi, I hereby nominate Andrew Dinn (adinn) to Membership in the HotSpot Group. Andrew is a well-known and respected member of the OpenJDK community. He has been a contributor since the early days of OpenJDK. The history of his contributions has been mangled by various SCM moves and repo consolidations over the years [1], but he was one of the original authors of the arm64 port ([2] shows 359 changes in the mercurial hotspot sub repository alone), contributed JEP 352 (support for NVM devices under byte buffers), and more recently has been active in the Graal and the Leyden projects. Votes are due by April 25, 2024. Only current Members of the HotSpot Group [3] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. For Lazy Consensus voting instructions, see [4]. Cheers, Thomas [1] https://github.com/openjdk/jdk/commits/master/?author=adinn [2] https://hg.openjdk.org/aarch64-port/jdk7u/hotspot [3] https://openjdk.org/census#members [4] https://openjdk.org/groups/#member-vote
URL: From sgibbons at openjdk.org Thu Apr 11 14:38:45 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Thu, 11 Apr 2024 14:38:45 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v7] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: On Thu, 11 Apr 2024 00:38:11 GMT, Sandhya Viswanathan wrote: >> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: >> >> Add movq to locate_operand > > src/hotspot/cpu/x86/macroAssembler_x86.cpp line 5988: > >> 5986: movw(Address(to, 0), value); >> 5987: addptr(to, 2); >> 5988: subptr(count, 1<<(shift-1)); > > At line 5968 also we need the change from cmpl to cmpptr. > cmpl(count, 2< src/hotspot/cpu/x86/macroAssembler_x86.cpp line 6050: > >> 6048: vpbroadcastd(xtmp, xtmp, Assembler::AVX_512bit); >> 6049: >> 6050: subptr(count, 16 << shift); > > At line 6045 also the cmpl should change to cmpptr: > cmpl(count, VM_Version::avx3_threshold()); Will do. > src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2497: > >> 2495: // >> 2496: address StubGenerator::generate_unsafe_setmemory(const char *name, >> 2497: address byte_fill_entry) { > > Need to add UnsafeSetMemoryMark on similar lines as UnsafeCopyMemoryMark to handle page error. Will do. > src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2522: > >> 2520: #define rScratch3 r8 >> 2521: #undef rScratch4 >> 2522: #define rScratch4 r11 > > We could do this setup using const Register declaration instead of using #undef/#define pair. We discussed this and the #define option was your preferred method since the registers are being re-used. Do you want this changed back again? > src/hotspot/share/opto/library_call.cpp line 4950: > >> 4948: >> 4949: bool LibraryCallKit::inline_unsafe_setMemory() { >> 4950: if (callee()->is_static()) return false; // caller must have the capability! 
> > Also need to return false if StubRoutines::unsafe_setmemory() == nullptr. Will do. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561128613 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561128862 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561129186 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561128151 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561130102 From rkennke at amazon.de Thu Apr 11 14:44:09 2024 From: rkennke at amazon.de (Kennke, Roman) Date: Thu, 11 Apr 2024 14:44:09 +0000 Subject: CFV: New HotSpot Group Member: Andrew Dinn In-Reply-To: References: Message-ID: Vote: yes > On Apr 11, 2024, at 3:24 PM, Thomas Stuefe wrote: > > Hi, > > I hereby nominate Andrew Dinn (adinn) to Membership in the HotSpot Group. > > Andrew is a well-known and respected member of the OpenJDK community. He has been a contributor since the early days of OpenJDK. > > The history of his contributions has been mangled by various SCM moves and repo consolidations over the years [1], but he was one of the original authors of the arm64 port ([2] shows 359 changes in the mercurial hotspot sub repository alone), contributed JEP 352 (support for NVM devices under byte buffers), and more recently has been active in the Graal and the Leyden projects. > > Votes are due by April 25, 2024. > > Only current Members of the HotSpot Group [3] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [4].
> > Cheers, Thomas > > [1] https://github.com/openjdk/jdk/commits/master/?author=adinn > [2] https://hg.openjdk.org/aarch64-port/jdk7u/hotspot > [3] https://openjdk.org/census#members > [4] https://openjdk.org/groups/#member-vote From aph at openjdk.org Thu Apr 11 14:46:16 2024 From: aph at openjdk.org (Andrew Haley) Date: Thu, 11 Apr 2024 14:46:16 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v10] In-Reply-To: References: Message-ID: <-YyJSfpYkWm_A02aF5a6MyeVZC0b_1hb8qHIi91EqAM=.63f9041e-f6c7-4346-a418-3bac606811ec@github.com> > This PR is a redesign of subtype checking. > > The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. > > So what's changed, so that the old design should be replaced? > > Firstly, the computers of today aren't the computers of twenty years ago. It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. > > The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces). Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. > > Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch.
> > However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions. > > The solution > ------------ > > We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. > > We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bitmap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. If the bit is set, we have to do a little arithmetic and then consult the hash table. > > It works like this: > > > mov sub_klass, [& sub_klass-...
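The bitmap-plus-compressed-table lookup described above can be sketched in portable C++. This is only an illustration of the idea, not the code in the PR: `Type`, `SecondarySupers`, `build` and `is_secondary_super` are hypothetical names, the hash slot is supplied by hand instead of being derived from a real hash, and the sketch assumes fewer than 64 entries so that probing always reaches an empty slot.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a Klass: only identity and a hash slot matter here.
struct Type { uint8_t hash_slot; };      // home slot in [0, 63]

struct SecondarySupers {
    uint64_t bitmap = 0;                 // bit i set <=> slot i of the 64-way table is occupied
    std::vector<const Type*> packed;     // the table with null entries removed, in slot order
};

// Count set bits without relying on compiler builtins or C++20 <bit>.
static unsigned popcount64(uint64_t x) {
    unsigned n = 0;
    for (; x != 0; x &= x - 1) n++;      // clears the lowest set bit each round
    return n;
}

// Build the 64-way table with linear probing, then compress it.
SecondarySupers build(const std::vector<const Type*>& supers) {
    assert(supers.size() < 64);          // assumption: at least one slot stays empty
    const Type* table[64] = {};
    for (const Type* t : supers) {
        unsigned slot = t->hash_slot & 63;
        while (table[slot] != nullptr) slot = (slot + 1) & 63;   // linear probing
        table[slot] = t;
    }
    SecondarySupers s;
    for (unsigned i = 0; i < 64; i++) {
        if (table[i] != nullptr) {
            s.bitmap |= uint64_t{1} << i;
            s.packed.push_back(table[i]);
        }
    }
    return s;
}

bool is_secondary_super(const SecondarySupers& s, const Type* t) {
    unsigned slot = t->hash_slot & 63;
    // A clear bit answers "absent" without touching the array at all.
    while ((s.bitmap >> slot) & 1) {
        // Index into the compressed array = number of occupied slots below this one.
        uint64_t below = s.bitmap & ((uint64_t{1} << slot) - 1);
        if (s.packed[popcount64(below)] == t) return true;
        slot = (slot + 1) & 63;          // collision: probe the next slot
    }
    return false;
}
```

Because the compressed array is still a plain array of supers, a caller that ignores the bitmap and scans `packed` linearly keeps working, which is the compatibility property the description relies on.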
Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: JDK-8180450: secondary_super_cache does not scale well ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18309/files - new: https://git.openjdk.org/jdk/pull/18309/files/8dc2ac13..a8b5f441 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18309&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18309&range=08-09 Stats: 5 lines in 2 files changed: 2 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/18309.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18309/head:pull/18309 PR: https://git.openjdk.org/jdk/pull/18309 From sgibbons at openjdk.org Thu Apr 11 14:47:07 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Thu, 11 Apr 2024 14:47:07 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v8] In-Reply-To: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: > This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. > > Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. > > Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). 
> > [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Address review comments (#15) * Address review comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18555/files - new: https://git.openjdk.org/jdk/pull/18555/files/f81aaa9f..b0ac8577 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=06-07 Stats: 170 lines in 9 files changed: 147 ins; 11 del; 12 mod Patch: https://git.openjdk.org/jdk/pull/18555.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18555/head:pull/18555 PR: https://git.openjdk.org/jdk/pull/18555 From asmehra at openjdk.org Thu Apr 11 14:52:41 2024 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Thu, 11 Apr 2024 14:52:41 GMT Subject: RFR: 8323900: Avoid calling os::init_random() in CDS static dump In-Reply-To: References: Message-ID: On Wed, 10 Apr 2024 16:31:08 GMT, Ioi Lam wrote: > The purpose of the PR is to avoid modifying the global JVM state while dumping the CDS archive. > > When updating the identity hashcode for archived Symbols, call `ArchiveBuilder::current()->entropy()` instead of `os::random()`. As a result, CDS no longer needs to call `os::init_random()` with a deterministic seed. src/hotspot/share/cds/archiveBuilder.hpp line 215: > 213: GrowableArray* _klasses; > 214: GrowableArray* _symbols; > 215: unsigned int _entropy_seed; Shouldn't it be a static const? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18728#discussion_r1561160618 From eosterlund at openjdk.org Thu Apr 11 14:56:47 2024 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Thu, 11 Apr 2024 14:56:47 GMT Subject: RFR: 8326541: [AArch64] ZGC C2 load barrier stub considers the length of live registers when spilling registers [v4] In-Reply-To: References: Message-ID: On Wed, 20 Mar 2024 03:55:33 GMT, Joshua Zhu wrote: >> Currently ZGC C2 load barrier stub saves the whole live register regardless of what size of register is live on aarch64. >> Considering the size of SVE register is an implementation-defined multiple of 128 bits, up to 2048 bits, >> even the use of a floating point may cause the maximum 2048 bits stack occupied. >> Hence I would like to introduce this change on aarch64: take the length of live registers into consideration in ZGC C2 load barrier stub. >> >> In a floating point case on 2048 bits SVE machine, the following ZLoadBarrierStubC2 >> >> >> ...... >> 0x0000ffff684cfad8: stp x15, x18, [sp, #80] >> 0x0000ffff684cfadc: sub sp, sp, #0x100 >> 0x0000ffff684cfae0: str z16, [sp] >> 0x0000ffff684cfae4: add x1, x13, #0x10 >> 0x0000ffff684cfae8: mov x0, x16 >> ;; 0xFFFF803F5414 >> 0x0000ffff684cfaec: mov x8, #0x5414 // #21524 >> 0x0000ffff684cfaf0: movk x8, #0x803f, lsl #16 >> 0x0000ffff684cfaf4: movk x8, #0xffff, lsl #32 >> 0x0000ffff684cfaf8: blr x8 >> 0x0000ffff684cfafc: mov x16, x0 >> 0x0000ffff684cfb00: ldr z16, [sp] >> 0x0000ffff684cfb04: add sp, sp, #0x100 >> 0x0000ffff684cfb08: ptrue p7.b >> 0x0000ffff684cfb0c: ldp x4, x5, [sp, #16] >> ...... >> >> >> could be optimized into: >> >> >> ...... >> 0x0000ffff684cfa50: stp x15, x18, [sp, #80] >> 0x0000ffff684cfa54: str d16, [sp, #-16]! 
// extra 8 bytes to align 16 bytes in push_fp() >> 0x0000ffff684cfa58: add x1, x13, #0x10 >> 0x0000ffff684cfa5c: mov x0, x16 >> ;; 0xFFFF7FA942A8 >> 0x0000ffff684cfa60: mov x8, #0x42a8 // #17064 >> 0x0000ffff684cfa64: movk x8, #0x7fa9, lsl #16 >> 0x0000ffff684cfa68: movk x8, #0xffff, lsl #32 >> 0x0000ffff684cfa6c: blr x8 >> 0x0000ffff684cfa70: mov x16, x0 >> 0x0000ffff684cfa74: ldr d16, [sp], #16 >> 0x0000ffff684cfa78: ptrue p7.b >> 0x0000ffff684cfa7c: ldp x4, x5, [sp, #16] >> ...... >> >> >> Besides the above benefit, when we know what size of register is live, >> we could remove the unnecessary caller save in ZGC C2 load barrier stub when we meet C-ABI SOE fp registers. >> >> Passed jtreg with option "-XX:+UseZGC -XX:+ZGenerational" with no failures introduced. > > Joshua Zhu has updated the pull request incrementally with one additional commit since the last revision: > > Add more output for easy debugging once the jtreg test case fails This looks good to me and seems to follow a similar design to what I did on x86_64 vectors. Thanks for doing this! ------------- Marked as reviewed by eosterlund (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17977#pullrequestreview-1994438128 From stuefe at openjdk.org Thu Apr 11 15:03:45 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 11 Apr 2024 15:03:45 GMT Subject: RFR: 8323900: Avoid calling os::init_random() in CDS static dump In-Reply-To: References: Message-ID: On Wed, 10 Apr 2024 16:31:08 GMT, Ioi Lam wrote: > The purpose of the PR is to avoid modifying the global JVM state while dumping the CDS archive. > > When updating the identity hashcode for archived Symbols, call `ArchiveBuilder::current()->entropy()` instead of `os::random()`. As a result, CDS no longer needs to call `os::init_random()` with a deterministic seed. Thinking about this, since global entropy (archived object ihashes) sneak into archives whether we use local seeds or not, maybe we should not bother with such a patch. 
In other words, if global state affects the archive anyway, we may just as well roll with it. See https://github.com/openjdk/jdk/pull/18735 ------------- PR Comment: https://git.openjdk.org/jdk/pull/18728#issuecomment-2049897850 From stuefe at openjdk.org Thu Apr 11 15:03:46 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 11 Apr 2024 15:03:46 GMT Subject: RFR: 8323900: Avoid calling os::init_random() in CDS static dump In-Reply-To: References: Message-ID: On Thu, 11 Apr 2024 14:50:28 GMT, Ashutosh Mehra wrote: >> The purpose of the PR is to avoid modifying the global JVM state while dumping the CDS archive. >> >> When updating the identity hashcode for archived Symbols, call `ArchiveBuilder::current()->entropy()` instead of `os::random()`. As a result, CDS no longer needs to call `os::init_random()` with a deterministic seed. > > src/hotspot/share/cds/archiveBuilder.hpp line 215: > >> 213: GrowableArray* _klasses; >> 214: GrowableArray* _symbols; >> 215: unsigned int _entropy_seed; > > Shouldn't it be a static const? I think it is meant to be a member of the (one and only existing) archive builder. 
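To illustrate why the seed would be a per-builder member rather than a `static const`, here is a minimal sketch. `Builder` is a hypothetical stand-in for the archive builder, and the Park-Miller step is chosen purely for illustration; the thread does not say which generator `entropy()` uses.

```cpp
#include <cstdint>

// Hypothetical stand-in for the archive builder; only the seed handling is sketched.
class Builder {
    uint32_t _entropy_seed;   // advanced on every call, so it cannot be const
public:
    // The seed must be non-zero, otherwise this generator stays at zero.
    explicit Builder(uint32_t seed) : _entropy_seed(seed) {}

    uint32_t entropy() {
        // Park-Miller "minimal standard" step (an assumption, for illustration only).
        _entropy_seed = static_cast<uint32_t>(
            (uint64_t{16807} * _entropy_seed) % 2147483647u);
        return _entropy_seed;
    }
};
```

Two builders constructed with the same seed yield the same sequence, and neither touches any process-wide random state.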
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18728#discussion_r1561181874 From syan at openjdk.org Thu Apr 11 15:23:55 2024 From: syan at openjdk.org (SendaoYan) Date: Thu, 11 Apr 2024 15:23:55 GMT Subject: RFR: 8327946: containers/docker/TestJFREvents.java fails when host kernel config vm.swappiness=0 after JDK-8325139 [v2] In-Reply-To: <5Q0X-rxAg9WKCnK-Qluu5hvyffsGwVgGJGRoA8XlBGs=.923c1bf8-e008-4af9-9929-6e5c1f2d5271@github.com> References: <5Q0X-rxAg9WKCnK-Qluu5hvyffsGwVgGJGRoA8XlBGs=.923c1bf8-e008-4af9-9929-6e5c1f2d5271@github.com> Message-ID: <64TqRDSmdA9mwFdNdArzg89V_EHc6GWFE1aw_p_vf-Q=.946a90c1-2ce6-4c08-9936-3109d3676489@github.com> > Hi, > > According to the [docker document](https://docs.docker.com/config/containers/resource_constraints/#--memory-swappiness-details), the default value of --memory-swappiness is inherited from the host machine. So, when the kernel config vm.swappiness=0 is set on the host machine, this test case fails: the default value of --memory-swappiness is then 0, so the docker container cannot use swap memory. > > When the host kernel config is "vm.swappiness = 0", there are three ways to make this test case pass: > > 1. Change `.shouldContain("totalSize = " + expectedTotalValue)` to `.shouldContain("totalSize = "`, which ignores the `expectedTotalValue`, because the `expectedTotalValue` could be 0 (swap memory is disabled when --memory-swappiness=0) or 104857600 (300MB-200MB=100MB), depending on the host machine config `vm.swappiness`. > 2. Change the default `--memory-swappiness` from 0 to a non-zero value, such as 60. > 3. Change the host kernel config `vm.swappiness=0` to `vm.swappiness=60`. I think that is not a good idea. > > The second method seems the most reasonable.
> > > Thanks, > -sendao SendaoYan has updated the pull request incrementally with one additional commit since the last revision: add a space before // ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18225/files - new: https://git.openjdk.org/jdk/pull/18225/files/480f3364..4a9f3881 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18225&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18225&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18225.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18225/head:pull/18225 PR: https://git.openjdk.org/jdk/pull/18225 From cslucas at openjdk.org Thu Apr 11 15:36:48 2024 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Thu, 11 Apr 2024 15:36:48 GMT Subject: RFR: 8241503: C2: Share MacroAssembler between mach nodes during code emission [v13] In-Reply-To: References: Message-ID: On Wed, 10 Apr 2024 23:53:00 GMT, Cesar Soares Lucas wrote: >> # Description >> >> Please review this PR with a patch to re-use the same C2_MacroAssembler object to emit all instructions in the same compilation unit. >> >> Overall, the change is pretty simple. However, due to the renaming of the variable to access C2_MacroAssembler, from `_masm.` to `masm->`, and also some method prototype changes, the patch became quite large. >> >> # Help Needed for Testing >> >> I don't have access to all platforms necessary to test this. I hope some other folks can help with testing on `S390`, `RISC-V` and `PPC`. > > Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: > > Remove unused operands in arm.ad Thank you all for reviewing!! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16484#issuecomment-2049976529 From syan at openjdk.org Thu Apr 11 15:38:43 2024 From: syan at openjdk.org (SendaoYan) Date: Thu, 11 Apr 2024 15:38:43 GMT Subject: RFR: 8327946: containers/docker/TestJFREvents.java fails when host kernel config vm.swappiness=0 after JDK-8325139 [v2] In-Reply-To: References: <5Q0X-rxAg9WKCnK-Qluu5hvyffsGwVgGJGRoA8XlBGs=.923c1bf8-e008-4af9-9929-6e5c1f2d5271@github.com> Message-ID: On Thu, 11 Apr 2024 09:50:57 GMT, Severin Gehwolf wrote: >> SendaoYan has updated the pull request incrementally with one additional commit since the last revision: >> >> add a space before // > > test/hotspot/jtreg/containers/docker/TestJFREvents.java line 225: > >> 223: .addDockerOpts("--memory-swap=" + swapValueToSet) >> 224: //The default memory-swappiness vaule is inherited from the host machine, which maybe 0 >> 225: .addDockerOpts("--memory-swappiness=60") > > Nit: Space after `//`. > > `--memory-swappiness` is cgroup v1 (legacy) specific: > > $ podman run --rm -ti --memory-swappiness=60 fedora:39 > Error: OCI runtime error: crun: cannot set memory swappiness with cgroupv2 > > > Therefore, we need to ensure we are running on cgroups v1 when we add that option. Thanks for your review. The space after `//` has been added. I can't reproduce the "OCI runtime error" failure in my ubuntu22 environment. It seems that ubuntu22 uses cgroups v2 by default. ![image](https://github.com/openjdk/jdk/assets/24123821/d41934fe-afb4-45a7-abd4-df4070123bb2) Can you share your host machine environment information, so that I can reproduce the same failure? After that I will try to find a solution with cgroupv2.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18225#discussion_r1561236762 From aph at openjdk.org Thu Apr 11 15:44:02 2024 From: aph at openjdk.org (Andrew Haley) Date: Thu, 11 Apr 2024 15:44:02 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v11] In-Reply-To: References: Message-ID: > This PR is a redesign of subtype checking. > > The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. > > So what's changed, so that the old design should be replaced? > > Firstly, the computers of today aren't the computers of twenty years ago. It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. > > The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces). Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. > > Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch. > > However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions. > > The solution > ------------ > > We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. 
The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. > > We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bitmap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. If the bit is set, we have to do a little arithmetic and then consult the hash table. > > It works like this: > > > mov sub_klass, [& sub_klass-... Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: JDK-8180450: secondary_super_cache does not scale well ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18309/files - new: https://git.openjdk.org/jdk/pull/18309/files/a8b5f441..9b98662a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18309&range=10 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18309&range=09-10 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18309.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18309/head:pull/18309 PR: https://git.openjdk.org/jdk/pull/18309 From cslucas at openjdk.org Thu Apr 11 15:47:53 2024 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Thu, 11 Apr 2024 15:47:53 GMT Subject: Integrated: 8241503: C2: Share MacroAssembler between mach nodes during code emission In-Reply-To: References: Message-ID: On Thu, 2 Nov 2023 22:17:43 GMT, Cesar Soares Lucas wrote: > # Description > > Please review this PR with a patch to re-use the same C2_MacroAssembler
object to emit all instructions in the same compilation unit. > > Overall, the change is pretty simple. However, due to the renaming of the variable to access C2_MacroAssembler, from `_masm.` to `masm->`, and also some method prototype changes, the patch became quite large. > > # Help Needed for Testing > > I don't have access to all platforms necessary to test this. I hope some other folks can help with testing on `S390`, `RISC-V` and `PPC`. This pull request has now been integrated. Changeset: 31ee5108 Author: Cesar Soares Lucas Committer: Martin Doerr URL: https://git.openjdk.org/jdk/commit/31ee5108e059afae0a3809947adb7b91e19baec6 Stats: 2144 lines in 60 files changed: 118 ins; 431 del; 1595 mod 8241503: C2: Share MacroAssembler between mach nodes during code emission Reviewed-by: kvn, mdoerr, amitkumar, lucy ------------- PR: https://git.openjdk.org/jdk/pull/16484 From mdoerr at openjdk.org Thu Apr 11 15:48:41 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 11 Apr 2024 15:48:41 GMT Subject: RFR: 8329605: hs errfile generic events - introduce sections for Frequent/NotFrequent Events [v3] In-Reply-To: References: <5GN6AKI0ud3DgU7-RX2-12eu87Me8jhzKXA-L8BwR04=.384ddd36-1a8f-40ac-9387-5d8d97c37fe3@github.com> Message-ID: <_i49UR88qDjvVkdjvT9EBWpr3MisDWzjPDv8esNFOV8=.ca937dbb-8f47-4689-befe-24cb0a2240c8@github.com> On Tue, 9 Apr 2024 07:31:23 GMT, Matthias Baesken wrote: >> Currently the 'generic' hs_errfile Events message log (filled by Events::log) is rather flooded by messages for memory protection operations. Those seem to occur quite often and move out other less frequent events, because the number of entries in the log is limited. >> It might be better to separate the frequent and less frequent events into 2 sections. The memory protection events would go into the frequent events section. 
>> The memory protection entries in question look like this: >> Event: 0.178 Protecting memory [0x000000016ebf0000,0x000000016ebfc000] with protection modes 0 > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > adjust typo I think it's ok if @stefank also likes it. ------------- PR Review: https://git.openjdk.org/jdk/pull/18626#pullrequestreview-1994582499 From sgehwolf at openjdk.org Thu Apr 11 15:52:45 2024 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Thu, 11 Apr 2024 15:52:45 GMT Subject: RFR: 8327946: containers/docker/TestJFREvents.java fails when host kernel config vm.swappiness=0 after JDK-8325139 [v2] In-Reply-To: References: <5Q0X-rxAg9WKCnK-Qluu5hvyffsGwVgGJGRoA8XlBGs=.923c1bf8-e008-4af9-9929-6e5c1f2d5271@github.com> Message-ID: <3lEEmi-4SOpmVy_SqInD8q1BReMYdmhBqszkskNnqbk=.602870f7-d94b-4b33-9b8a-35002bfef4f3@github.com> On Thu, 11 Apr 2024 15:36:31 GMT, SendaoYan wrote: >> test/hotspot/jtreg/containers/docker/TestJFREvents.java line 225: >> >>> 223: .addDockerOpts("--memory-swap=" + swapValueToSet) >>> 224: //The default memory-swappiness vaule is inherited from the host machine, which maybe 0 >>> 225: .addDockerOpts("--memory-swappiness=60") >> >> Nit: Space after `//`. >> >> `--memory-swappiness` is cgroup v1 (legacy) specific: >> >> $ podman run --rm -ti --memory-swappiness=60 fedora:39 >> Error: OCI runtime error: crun: cannot set memory swappiness with cgroupv2 >> >> >> Therefore, we need to ensure we are running on cgroups v1 when we add that option. > > Thanks for your review. The space after `//` has been added. > > I can't reproduce the "OCI runtime error" failure in my ubuntu22 environment. > It seems that ubuntu22 uses cgroups v2 by default. > ![image](https://github.com/openjdk/jdk/assets/24123821/d41934fe-afb4-45a7-abd4-df4070123bb2) > > Can you share your host machine environment information, so that I can reproduce the same failure?
After that I will try to find a solution with cgroupv2. It seems to be podman runtime specific. `crun` fails, `runc` doesn't seem to. Either way, the corresponding interface file, `memory.swappiness`, doesn't exist for cgroup v2. Try `podman run --runtime /usr/bin/crun --rm -ti --memory-swappiness=60 fedora:39` provided the `crun` runtime is installed in `/usr/bin`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18225#discussion_r1561254854 From pchilanomate at openjdk.org Thu Apr 11 15:54:42 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Thu, 11 Apr 2024 15:54:42 GMT Subject: RFR: 8329757: Crash with fatal error: DEBUG MESSAGE: Fast Unlock lock on stack [v3] In-Reply-To: <14wFn2FPMeokML2Uo-xi1YVB0wHJcw4KXDvGojouU9Q=.8b108c98-4967-4445-8012-c1eaa742a75b@github.com> References: <14wFn2FPMeokML2Uo-xi1YVB0wHJcw4KXDvGojouU9Q=.8b108c98-4967-4445-8012-c1eaa742a75b@github.com> Message-ID: On Thu, 11 Apr 2024 11:01:10 GMT, Axel Boldt-Christmas wrote: >> `Deoptimization::relock_objects` may reorder locks within the `LockStack` which are added inside the same vframe. This can be handled by the interpreter but if OSR has occurred C2 may observe this invalid order in the `LockStack`, which breaks its assumption leading to incorrect behaviour. >> >> This patch functionally makes sure that the LockStack is always consistent by always inflating eliminated locks when `Deoptimization::relock_objects` is called. >> >> It also adds verification code which checks that the LockStack is consistent with the lock order observed inside the deoptimized vframes. >> >> Note: for leaf deoptimizations we have enough information to recreate a correct top of the LockStack with minimal inflations, however that should be a separate RFE. This only inflates eliminated locks so the worth of solving that may be minimal or even detrimental. >> >> Tests still running. Tier 1-7 done.
> > Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: > > Include sort order Thanks, looks good to me. ------------- Marked as reviewed by pchilanomate (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18715#pullrequestreview-1994599881 From jesper.wilhelmsson at oracle.com Thu Apr 11 15:57:02 2024 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Thu, 11 Apr 2024 15:57:02 +0000 Subject: CFV: New HotSpot Group Member: Andrew Dinn In-Reply-To: References: Message-ID: <903A2985-49E9-4241-BC25-D7CA1AAFD171@oracle.com> Vote: Yes /Jesper > On Apr 11, 2024, at 15:24, Thomas Stuefe wrote: > > Hi, > > I hereby nominate Andrew Dinn (adinn) to Membership in the HotSpot Group. > > Andrew is a well-known and respected member of the OpenJDK community. He has been a contributor since the early days of OpenJDK. > > The history of his contributions has been mangled by various SCM moves and repo consolidations over the years [1], but he was one of the original authors of the arm64 port ([2] shows 359 changes in the mercurial hotspot sub repository alone), contributed JEP 352 (support for NVM devices under byte buffers), and more recently has been active in the Graal and the Leyden projects. > > Votes are due by April 25, 2024. > > Only current Members of the HotSpot Group [3] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [4]. > > Cheers, Thomas > > [1] https://github.com/openjdk/jdk/commits/master/?author=adinn > [2] https://hg.openjdk.org/aarch64-port/jdk7u/hotspot > [3] https://openjdk.org/census#members > [4] https://openjdk.org/groups/#member-vote > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From azafari at openjdk.org Thu Apr 11 15:59:50 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Thu, 11 Apr 2024 15:59:50 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API Message-ID: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> `MEMFLAGS flag` is used to hold/show the type of the memory regions in NMT. Each call of NMT API requires a search through the list of memory regions. The HotSpot code reserves/commits/uncommits memory regions and later explicitly calls the NMT API with a specific memory type (e.g., `mtGC`, `mtJavaHeap`) for that region. Therefore, there are two searches in the list of regions per reserve/commit/uncommit operation: one for the operation and another for setting the type of the region. When the memory type is passed in during reserve/commit/uncommit operations, NMT can use it and avoid the extra search for setting the memory type. Tests: tiers1-5 passed on linux-x64, macosx-aarch64 and windows-x64 for debug and non-debug builds. ------------- Commit messages: - 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API - some missed changes - reserve memory functions are also updated to have mandatory MEMFLAGS. - uncommit has also mandatory MEMFLAGS arg - virtual memory commit has mandatory MEMFLAGS arg.
Changes: https://git.openjdk.org/jdk/pull/18745/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18745&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8330076 Stats: 299 lines in 59 files changed: 13 ins; 39 del; 247 mod Patch: https://git.openjdk.org/jdk/pull/18745.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18745/head:pull/18745 PR: https://git.openjdk.org/jdk/pull/18745 From stuefe at openjdk.org Thu Apr 11 16:02:41 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 11 Apr 2024 16:02:41 GMT Subject: RFR: 8329605: hs errfile generic events - introduce sections for Frequent/NotFrequent Events [v3] In-Reply-To: References: <5GN6AKI0ud3DgU7-RX2-12eu87Me8jhzKXA-L8BwR04=.384ddd36-1a8f-40ac-9387-5d8d97c37fe3@github.com> Message-ID: On Tue, 9 Apr 2024 07:31:23 GMT, Matthias Baesken wrote: >> Currently the 'generic' hs_errfile Events message log (filled by Events::log) is rather flooded by messages for memory protection operations. Those seem to occur quite often and move out other less frequent events, because the number of entries in the log is limited. >> It might be better to separate the frequent and less frequent events into 2 sections. The memory protection events would go into the frequent events section. >> The mentioned memory protection operations related entries look like this : >> Event: 0.178 Protecting memory [0x000000016ebf0000,0x000000016ebfc000] with protection modes 0 > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > adjust typo Hmm, I think the separation makes sense, but I don't like the "frequent" moniker. All other event logs are separated by thematic area. "frequent" is orthogonal to that. I can have frequent/non-frequent class loading messages, or exceptions. My proposal would be either to drop these memory protection events (do we need them? or are they remnants of some old support issues?) or to put them into a 'memprot' section or similar. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18626#issuecomment-2050026608 From ChrisPhi at LGonQn.Org Thu Apr 11 16:09:11 2024 From: ChrisPhi at LGonQn.Org (Chris Phillips) Date: Thu, 11 Apr 2024 12:09:11 -0400 Subject: CFV: New HotSpot Group Member: Fredrik Bredberg In-Reply-To: <0291F74B-D724-4B97-B9D0-5FC57FA0F302@oracle.com> References: <0291F74B-D724-4B97-B9D0-5FC57FA0F302@oracle.com> Message-ID: <65deaef6-e039-8e90-4ec9-36ef652f4fd3@LGonQn.Org> Hi Vote: Yes Cheers! Chris On 2024-04-10 08:24, Jesper Wilhelmsson wrote: > I hereby nominate Fredrik Bredberg (fbredberg) to Membership in the HotSpot Group. > > Fredrik is a Committer in the JDK project, and a member of the Oracle JVM Runtime team. Fredrik has mainly focused his efforts in the Loom area and is frequently helping out with platform specific (including assembler) code for other areas as well. > > Votes are due by April 24, 2024. > > Only current Members of the HotSpot Group [1] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [2]. > > Thanks, > /Jesper > > [1] https://openjdk.org/census > [2] https://openjdk.org/groups/#member-vote From azafari at openjdk.org Thu Apr 11 16:10:10 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Thu, 11 Apr 2024 16:10:10 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v2] In-Reply-To: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: <3id02uEHGujrMSIg0lKlstLRL0x2yTsn7lPWrmEwGBU=.747fa9c2-e782-462f-95ba-bc567944a502@github.com> > `MEMFLAGS flag` is used to hold/show the type of the memory regions in NMT. Each call of NMT API requires a search through the list of memory regions. 
> The Hotspot code reserves/commits/uncommits memory regions and later calls explicitly NMT API with a specific memory type (e.g., `mtGC`, `mtJavaHeap`) for that region. Therefore, there are two search in the list of regions per reserve/commit/uncommit operations, one for the operation and another for setting the type of the region. > When the memory type is passed in during reserve/commit/uncommit operations, NMT can use it and avoid the extra search for setting the memory type. > > Tests: tiers1-5 passed on linux-x64, macosx-aarch64 and windows-x64 for debug and non-debug builds. Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: fixed missing change. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18745/files - new: https://git.openjdk.org/jdk/pull/18745/files/c45aa21c..1f7d6fbb Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18745&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18745&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18745.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18745/head:pull/18745 PR: https://git.openjdk.org/jdk/pull/18745 From stuefe at openjdk.org Thu Apr 11 16:10:11 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 11 Apr 2024 16:10:11 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API In-Reply-To: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: On Thu, 11 Apr 2024 15:54:38 GMT, Afshin Zafari wrote: > `MEMFLAGS flag` is used to hold/show the type of the memory regions in NMT. Each call of NMT API requires a search through the list of memory regions. 
> The Hotspot code reserves/commits/uncommits memory regions and later calls explicitly NMT API with a specific memory type (e.g., `mtGC`, `mtJavaHeap`) for that region. Therefore, there are two search in the list of regions per reserve/commit/uncommit operations, one for the operation and another for setting the type of the region. > When the memory type is passed in during reserve/commit/uncommit operations, NMT can use it and avoid the extra search for setting the memory type. > > Tests: tiers1-5 passed on linux-x64, macosx-aarch64 and windows-x64 for debug and non-debug builds. The general thrust of this is okay, and I think it makes sense. It also combines nicely with the VMATree work Johan does. One question though, why do you want to abolish default values for MEMFLAGS? Allowing default values would reduce this patch by quite a bit, and make backporting less painful. Another idea: To alleviate the need to pass MEMFLAGS all the time, could we have something like a "active MEMFLAGS" state per Thread, and set that stack-based with a XXMark object? That way, one could say at the entrance of Metaspace, for instance, "whatever is allocated under the scope of this function, please mark with mtMetaspace". Just an idea. Otherwise, I think this is good, and thanks for doing this onerous work. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18745#issuecomment-2050037989 From ChrisPhi at LGonQn.Org Thu Apr 11 16:11:03 2024 From: ChrisPhi at LGonQn.Org (Chris Phillips) Date: Thu, 11 Apr 2024 12:11:03 -0400 Subject: CFV: New HotSpot Group Member: Afshin Zafari In-Reply-To: <5088FFE6-F5E5-4B57-8FB9-B5F6672C7D7F@oracle.com> References: <5088FFE6-F5E5-4B57-8FB9-B5F6672C7D7F@oracle.com> Message-ID: Hi, Vote: Yes Cheers! Chris On 2024-04-10 08:24, Jesper Wilhelmsson wrote: > I hereby nominate Afshin Zafari (azafari) to Membership in the HotSpot Group. > > Afshin is a Committer in the JDK project, and a member of the Oracle JVM Runtime team. 
He has fixed 42 issues including several significant changes in various parts of the JVM runtime and has lately focused on NMT improvements. > > Votes are due by April 24, 2024. > > Only current Members of the HotSpot Group [1] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [2]. > > Thanks, > /Jesper > > [1] https://openjdk.org/census > [2] https://openjdk.org/groups/#member-vote From stefank at openjdk.org Thu Apr 11 16:11:43 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 11 Apr 2024 16:11:43 GMT Subject: RFR: 8329605: hs errfile generic events - introduce sections for Frequent/NotFrequent Events [v3] In-Reply-To: References: <5GN6AKI0ud3DgU7-RX2-12eu87Me8jhzKXA-L8BwR04=.384ddd36-1a8f-40ac-9387-5d8d97c37fe3@github.com> Message-ID: On Thu, 11 Apr 2024 16:00:05 GMT, Thomas Stuefe wrote: >> Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: >> >> adjust typo > > Hmm, I think the separation makes sense, but I don't like the "frequent" moniker. All other event logs are separated by thematic area. "frequent" is orthogonal to that. I can have frequent/non-frequent class loading messages, or exceptions. > > My proposal would be either to drop these memory protection events (do we need them? or are they remnants of some old support issues?) or to put them into a 'memprot' section or similar. 
I agree with @tstuefe ------------- PR Comment: https://git.openjdk.org/jdk/pull/18626#issuecomment-2050042155 From azafari at openjdk.org Thu Apr 11 16:21:52 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Thu, 11 Apr 2024 16:21:52 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v3] In-Reply-To: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: > `MEMFLAGS flag` is used to hold/show the type of the memory regions in NMT. Each call of NMT API requires a search through the list of memory regions. > The Hotspot code reserves/commits/uncommits memory regions and later calls explicitly NMT API with a specific memory type (e.g., `mtGC`, `mtJavaHeap`) for that region. Therefore, there are two search in the list of regions per reserve/commit/uncommit operations, one for the operation and another for setting the type of the region. > When the memory type is passed in during reserve/commit/uncommit operations, NMT can use it and avoid the extra search for setting the memory type. > > Tests: tiers1-5 passed on linux-x64, macosx-aarch64 and windows-x64 for debug and non-debug builds. Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: fixed shenandoah missed changes. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/18745/files - new: https://git.openjdk.org/jdk/pull/18745/files/1f7d6fbb..b009556e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18745&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18745&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18745.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18745/head:pull/18745 PR: https://git.openjdk.org/jdk/pull/18745 From azafari at openjdk.org Thu Apr 11 16:24:41 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Thu, 11 Apr 2024 16:24:41 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: On Thu, 11 Apr 2024 16:06:35 GMT, Thomas Stuefe wrote: > One question though, why do you want to abolish default values for MEMFLAGS? Allowing default values would reduce this patch by quite a bit, and make backporting less painful. To be sure that all the calls without MEMFLAGS are changed in this PR, I made it mandatory so that the compiler finds all the instances. Some cases are hidden or not easily found if I let the arg be optional.
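The trade-off between a defaulted and a mandatory argument can be illustrated with a minimal sketch. This is not the HotSpot code: `MemTag`, `reserve_with_default` and `reserve_mandatory` are hypothetical names standing in for `MEMFLAGS` and the os::reserve/commit/uncommit family, and no memory is actually reserved.

```cpp
#include <cstddef>

// Hypothetical stand-in for HotSpot's MEMFLAGS; illustrative only.
enum class MemTag { None, GC, JavaHeap };

// Variant A: defaulted tag. A call site that omits the tag still compiles
// and is silently recorded as MemTag::None.
MemTag reserve_with_default(std::size_t size, MemTag tag = MemTag::None) {
  (void)size;  // a real implementation would reserve memory here
  return tag;  // the tag that would be recorded for the region
}

// Variant B: mandatory tag. Every call site has to name a tag, so the
// compiler reports each caller that has not been updated yet.
MemTag reserve_mandatory(std::size_t size, MemTag tag) {
  (void)size;
  return tag;
}

// Example call sites:
//   reserve_with_default(4096);           // compiles, but the region is untagged
//   reserve_mandatory(4096, MemTag::GC);  // tag is explicit
//   reserve_mandatory(4096);              // does not compile: too few arguments
```

With the mandatory form, finding the untagged call sites becomes a build failure rather than a manual search. The cost, as noted in the question above, is a larger patch.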
------------- PR Comment: https://git.openjdk.org/jdk/pull/18745#issuecomment-2050063555 From azafari at openjdk.org Thu Apr 11 16:30:40 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Thu, 11 Apr 2024 16:30:40 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: On Thu, 11 Apr 2024 16:06:35 GMT, Thomas Stuefe wrote: > Another idea: To alleviate the need to pass MEMFLAGS all the time, could we have something like a "active MEMFLAGS" state per Thread, and set that stack-based with a XXMark object? That way, one could say at the entrance of Metaspace, for instance, "whatever is allocated under the scope of this function, please mark with mtMetaspace". Not sure if I understood your idea. The question is whether a thread always uses only ONE type of memory and not a mix of them. For example, CDS uses both mtClass and mtClassShared. If a Thread has an active MEMFLAG, it has to switch this flag between A and B whenever it uses type A or B. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18745#issuecomment-2050072695 From pchilanomate at openjdk.org Thu Apr 11 16:34:46 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Thu, 11 Apr 2024 16:34:46 GMT Subject: RFR: 8329674: JvmtiEnvThreadState::reset_current_location function should use JvmtiHandshake [v2] In-Reply-To: References: Message-ID: <10EU-jvOhZaur5uqCtnBJVodhqV8MKLzfI7IGBfo0cg=.348e71d1-0394-41fe-b511-3f3d7a35713c@github.com> On Wed, 10 Apr 2024 04:21:23 GMT, Serguei Spitsyn wrote: >> The internal JVM TI JvmtiHandshake and JvmtiUnitedHandshakeClosure classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads.
This enhancement is to refactor the JVM TI internal functions JvmtiEnvThreadState::reset_current_location on the base of JvmtiHandshake and JvmtiUnitedHandshakeClosure classes. >> >> Testing: >> - Ran mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: refactored to get rid of overloaded doit functions Looks good to me, just a few comments. src/hotspot/share/prims/jvmtiEnvThreadState.cpp line 307: > 305: if (!JvmtiEnvBase::is_vthread_alive(target_h())) { > 306: return; // _completed remains false. > 307: } Do we need this? We already do this check in JvmtiHandshake::execute(). src/hotspot/share/prims/jvmtiEnvThreadState.cpp line 309: > 307: } > 308: ResourceMark rm; > 309: javaVFrame *jvf = JvmtiEnvBase::get_vthread_jvf(target_h()); This method already handles both mounted and unmounted case, so do we need the first conditional above? src/hotspot/share/prims/jvmtiEnvThreadState.cpp line 367: > 365: GetCurrentLocationClosure op; > 366: JvmtiHandshake::execute(&op, &tlh, thread, thread_h); > 367: Seems we are missing a JvmtiVTMSTransitionDisabler. 
------------- PR Review: https://git.openjdk.org/jdk/pull/18630#pullrequestreview-1994658919 PR Review Comment: https://git.openjdk.org/jdk/pull/18630#discussion_r1561295746 PR Review Comment: https://git.openjdk.org/jdk/pull/18630#discussion_r1561298952 PR Review Comment: https://git.openjdk.org/jdk/pull/18630#discussion_r1561300541 From stefank at openjdk.org Thu Apr 11 16:37:43 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 11 Apr 2024 16:37:43 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v2] In-Reply-To: <3id02uEHGujrMSIg0lKlstLRL0x2yTsn7lPWrmEwGBU=.747fa9c2-e782-462f-95ba-bc567944a502@github.com> References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> <3id02uEHGujrMSIg0lKlstLRL0x2yTsn7lPWrmEwGBU=.747fa9c2-e782-462f-95ba-bc567944a502@github.com> Message-ID: On Thu, 11 Apr 2024 16:10:10 GMT, Afshin Zafari wrote: >> `MEMFLAGS flag` is used to hold/show the type of the memory regions in NMT. Each call of NMT API requires a search through the list of memory regions. >> The Hotspot code reserves/commits/uncommits memory regions and later calls explicitly NMT API with a specific memory type (e.g., `mtGC`, `mtJavaHeap`) for that region. Therefore, there are two search in the list of regions per reserve/commit/uncommit operations, one for the operation and another for setting the type of the region. >> When the memory type is passed in during reserve/commit/uncommit operations, NMT can use it and avoid the extra search for setting the memory type. >> >> Tests: tiers1-5 passed on linux-x64, macosx-aarch64 and windows-x64 for debug and non-debug builds. > > Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: > > fixed missing change. I think it is good that we are fixing this. I've been on the verge of doing it myself a couple of times.
I've left an initial set of comments below. It's not an exhaustive list, but I think it is a good set to start with. src/hotspot/os/bsd/gc/x/xPhysicalMemoryBacking_bsd.cpp line 81: > 79: > 80: // Reserve address space for backing memory > 81: _base = (uintptr_t)os::reserve_memory(max_capacity, false, mtJavaHeap); I think this should be using !MemExec and not a raw 'false' argument. With that said, I really think it would be best if we actually split os::reserve_memory into two distinct functions: 1) os::reserve_memory(max_capacity, mtJavaHeap) // Non-executable memory 2) os::reserve_executable_memory(max_capacity, mtCode) // Executable Maybe we could think about that in separate RFE. src/hotspot/os/bsd/gc/z/zPhysicalMemoryBacking_bsd.cpp line 82: > 80: > 81: // Reserve address space for backing memory > 82: _base = (uintptr_t)os::reserve_memory(max_capacity, false, mtJavaHeap); Use !MemExec - this goes for all other places in the patch that fills in the `exec` parameter. src/hotspot/os/linux/os_linux.cpp line 4684: > 4682: char* hint = (char*)(os::Linux::initial_thread_stack_bottom() - > 4683: (StackOverflow::stack_guard_zone_size() + page_size)); > 4684: char* codebuf = os::attempt_reserve_memory_at(hint, page_size, false, mtInternal); Should these be `mtInternal` or is there a `mtStack` that is more suitable? src/hotspot/os/windows/os_windows.cpp line 3137: > 3135: // If reservation failed, return null > 3136: if (p_buf == nullptr) return nullptr; > 3137: MemTracker::record_virtual_memory_reserve((address)p_buf, size_of_reserve, CALLER_PC, mtNone); Why is this (and the other places in this file) using `mtNone`? Shouldn't it at least be using `mtInternal`? 
src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 1414: > 1412: assert(SafepointSynchronize::is_at_safepoint(), "safe iteration is only available during safepoints"); > 1413: > 1414: if (!_aux_bitmap_region_special && !os::commit_memory((char*)_aux_bitmap_region.start(), _aux_bitmap_region.byte_size(), false, mtJavaHeap)) { Why is this `mtJavaHeap` and not `mtGC`? src/hotspot/share/gc/z/zMarkStackAllocator.cpp line 107: > 105: > 106: const uintptr_t shrink_start = _end - shrink_size; > 107: os::uncommit_memory((char*)shrink_start, shrink_size, mtGC, false /* executable */); `uncommit_memory` orders the executable and flags arguments differently from what we have for `commit_memory_or_exit`. We might want to consider doing something about the order here. src/hotspot/share/memory/metaspace.cpp line 592: > 590: // Fallback: reserve anywhere > 591: log_debug(metaspace, map)("Trying anywhere..."); > 592: result = os::reserve_memory_aligned(size, Metaspace::reserve_alignment(), false, mtMetaspace); It's unclear to me if some of these `mtMetaspace` should be `mtClass`. This comment applies to other places where we're setting up memory for the compressed class space. src/hotspot/share/memory/virtualspace.cpp line 71: > 69: ReservedSpace::ReservedSpace(size_t size, > 70: size_t alignment, > 71: size_t page_size, MEMFLAGS flag, I think this function was written to have one argument per line. You should probably keep the style. I'm also unsure why this param is put as the next to last param instead of the last, as we do in many other places. src/hotspot/share/memory/virtualspace.cpp line 366: > 364: ReservedSpace space; > 365: space.initialize_members(base, size, alignment, page_size, special, executable); > 366: space.set_nmt_flag(flag); Why is this calling set_nmt_flag instead of making initialize_members take a flag? src/hotspot/share/memory/virtualspace.cpp line 693: > 691: _special = false; > 692: _executable = false; > 693: _nmt_flag = mtNone; Weird indentation.
src/hotspot/share/memory/virtualspace.hpp line 45: > 43: bool _special; > 44: int _fd_for_heap; > 45: MEMFLAGS _nmt_flag; Indentation is now off. src/hotspot/share/memory/virtualspace.hpp line 72: > 70: > 71: inline MEMFLAGS nmt_flag() { return _nmt_flag; } > 72: inline void set_nmt_flag(MEMFLAGS flag) { _nmt_flag = flag; } No need for the inline specifier here. ------------- Changes requested by stefank (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18745#pullrequestreview-1994644693 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1561286933 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1561287120 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1561291740 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1561294869 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1561297501 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1561300014 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1561301420 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1561305069 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1561305892 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1561306136 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1561306502 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1561307268 From azafari at openjdk.org Thu Apr 11 16:43:42 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Thu, 11 Apr 2024 16:43:42 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v2] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> <3id02uEHGujrMSIg0lKlstLRL0x2yTsn7lPWrmEwGBU=.747fa9c2-e782-462f-95ba-bc567944a502@github.com> Message-ID: On Thu, 11 Apr 2024 16:19:08 GMT, Stefan 
Karlsson wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> fixed missing change. > > src/hotspot/os/linux/os_linux.cpp line 4684: > >> 4682: char* hint = (char*)(os::Linux::initial_thread_stack_bottom() - >> 4683: (StackOverflow::stack_guard_zone_size() + page_size)); >> 4684: char* codebuf = os::attempt_reserve_memory_at(hint, page_size, false, mtInternal); > > Should these be `mtInternal` or is there a `mtStack` that is more suitable? In line 4699, a few lines later, the original developer used `mtInternal`. I copied it here too. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1561315756 From ascarpino at openjdk.org Thu Apr 11 17:17:44 2024 From: ascarpino at openjdk.org (Anthony Scarpino) Date: Thu, 11 Apr 2024 17:17:44 GMT Subject: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v2] In-Reply-To: References: Message-ID: On Wed, 10 Apr 2024 18:02:38 GMT, Volodymyr Paprotski wrote: > > In `ECOperations.java`, if I understand this correctly, it is to replace the existing `PointMultiplier` with montgomery-based PointMuliplier. But when I look at the code, I see both are still options. If I read this correctly, it checks for the old `IntegerFieldModuloP`, then looks for the new `IntegerMontgomeryFieldModuloP`. It appears to use the new one always. Why doesn't it just replace the old implementation entry in the `fields` Map? Is there a reason to keep it around? > > Hmm, thats a good point I haven't fully considered; i.e. (if I read correctly) "for `CurveDB.P_256` remove the fallback path to non-montgomery entirely".. that might also help in cleaning a few things up in the construction. Maybe even get rid of this nested ECOperations inside ECOperations.. Perhaps nesting isnt a big deal, but all attempts to make the ECC stack clearer is positive! 
> > One functional reason that might justify keeping it as-is, is fuzz-testing; with the fallback available, I am able to write the included Fuzz tests and have them check the values against the existing implementation. While I also included a few KAT tests using openssl-generated values, the fuzz tests check millions of values and it does add a lot more certainty about correctness of this code. I hadn't looked at your fuzz test until you mentioned it. I see you are using reflection to change the values. Is that what you mean by "fallback"? I'm assuming there is no way to access the older implementation without reflection. > > Can it be removed? For the operations that do not involve multiplication (i.e. `setSum(*)`), montgomery is expensive. I think I did go through the uses of this code some time back (i.e. ECDHE, ECDSA and KeyGeneration) and existing IntegerPolynomialP256 is no longer used (I should verify that again) and only P256OrderField remains non-montgomery. So removing references to IntegerPolynomialP256 in ECOperations should be possible and cleaner. Removing IntegerPolynomialP256 from MontgomeryIntegerPolynomialP256 is harder (fromMontgomery() uses IntegerPolynomialP256) but perhaps also worth some thought.. > > I tend to like `ECOperationsFuzzTest.java` and would prefer to keep it, but it could also be chucked up as part of 'scaffolding' and removed in name of code quality? I wouldn't rip out the old implementation. I have been wondering if we should make the older implementation available, maybe via a security property. I was looking at the static Maps at the top of `ECOperations`, `forParameters`, and the constructors where it checks if the `montgomeryOps` was null or set. It would be nice if we could have one set of `fields` Maps by putting the montgomery entry into the `fields` to replace it. I think that should work because `IntegerMontgomeryFieldModuloP` extends `IntegerFieldModuloP`.
`instanceof` or other `montgomeryOps` checks would still need to exist because not all the `fields` support montgomery, and the older implementation would still be accessible for your fuzz tester. At least that is my theory. > > Thanks @ascarpino > > PS: Perhaps there is some middle ground, remove the `ECOperations montgomeryOps` nesting, and construct (somehow?? singleton makes most things inaccessible..) the reference ECOperations in the fuzz test instead.. not sure how yet, but perhaps worth a further thought.. It would be nice to remove the nesting and it would be nice to be a singleton. Maybe some combination of what I mentioned above can help with that. I haven't fully thought this out either. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18583#issuecomment-2050148475 From vladimir.kozlov at oracle.com Thu Apr 11 17:22:28 2024 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 11 Apr 2024 10:22:28 -0700 Subject: CFV: New HotSpot Group Member: Andrew Dinn In-Reply-To: References: Message-ID: <8b4591db-b60b-42e2-a179-628ff7c29a24@oracle.com> Vote: yes Thanks, Vladimir K On 4/11/24 6:24 AM, Thomas Stuefe wrote: > Hi, > > I hereby nominate Andrew Dinn (adinn) to Membership in the HotSpot Group. > > Andrew is a well-known and respected member of the OpenJDK community. He > has been a contributor since the early days of OpenJDK. > > The history of his contributions has been mangled by various SCM moves > and repo consolidations over the years [1], but he was one of the > original authors of the arm64 port ([2] shows 359 changes in the > mercurial hotspot sub repository alone), contributed JEP 352 (support > for NVM devices under byte buffers), and more recently has been active > in the Graal and the Leyden projects. > > Votes are due by April 25, 2024. > > Only current Members of the HotSpot Group [3] are eligible to vote on > this nomination. Votes must be cast in the open by replying to this > mailing list.
> > For Lazy Consensus voting instructions, see [4]. > > Cheers, Thomas > > [1]https://github.com/openjdk/jdk/commits/master/?author=adinn > > [2]https://hg.openjdk.org/aarch64-port/jdk7u/hotspot > > [3]https://openjdk.org/census#members > [4]https://openjdk.org/groups/#member-vote > > From kvn at openjdk.org Thu Apr 11 17:22:42 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 11 Apr 2024 17:22:42 GMT Subject: RFR: 8329757: Crash with fatal error: DEBUG MESSAGE: Fast Unlock lock on stack [v3] In-Reply-To: <14wFn2FPMeokML2Uo-xi1YVB0wHJcw4KXDvGojouU9Q=.8b108c98-4967-4445-8012-c1eaa742a75b@github.com> References: <14wFn2FPMeokML2Uo-xi1YVB0wHJcw4KXDvGojouU9Q=.8b108c98-4967-4445-8012-c1eaa742a75b@github.com> Message-ID: <85TtcY-JoehYo9zA_htLF9NJKZ_Zt-ex0sFajYuxzU8=.9d1f0611-fa45-4341-ac88-258f5be0eb85@github.com> On Thu, 11 Apr 2024 11:01:10 GMT, Axel Boldt-Christmas wrote: >> `Deoptimization::relock_objects` may reorder locks within in the `LockStack` which are added inside the same vframe. This can be handled by the interpreter but if OSR has occurred C2 may observe this invalid order in the `LockStack`, which breaks its assumption leading to incorrect behaviour. >> >> This patch functionally makes sure that the LockStack is always consistent by always inflating eliminated locks when `Deoptimization::relock_objects` is called. >> >> It also adds verification code which checks that the LockStack is consistent with the lock order observed inside the deoptimized vframes. >> >> Note: for leaf deoptimizations we have enough information to recreate a correct top of the LockStack with minimal inflations, however that should be a separate RFE. This only inflates eliminated locks so the worth of solving that may be minimal or even detrimental. >> >> Tests still running. Tier 1-7 done. 
> > Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: > > Include sort order Why is the lightweight locking verification code (old and new) under `#ifndef PRODUCT` instead of `#ifdef ASSERT`? Asserts are enabled only in the debug VM (when ASSERT is defined). ------------- PR Review: https://git.openjdk.org/jdk/pull/18715#pullrequestreview-1994760284 From azafari at openjdk.org Thu Apr 11 18:08:44 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Thu, 11 Apr 2024 18:08:44 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v2] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> <3id02uEHGujrMSIg0lKlstLRL0x2yTsn7lPWrmEwGBU=.747fa9c2-e782-462f-95ba-bc567944a502@github.com> Message-ID: On Thu, 11 Apr 2024 16:15:04 GMT, Stefan Karlsson wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> fixed missing change. > > src/hotspot/os/bsd/gc/x/xPhysicalMemoryBacking_bsd.cpp line 81: > >> 79: >> 80: // Reserve address space for backing memory >> 81: _base = (uintptr_t)os::reserve_memory(max_capacity, false, mtJavaHeap); > > I think this should be using !MemExec and not a raw 'false' argument. > > With that said, I really think it would be best if we actually split os::reserve_memory into two distinct functions: > 1) os::reserve_memory(max_capacity, mtJavaHeap) // Non-executable memory > 2) os::reserve_executable_memory(max_capacity, mtCode) // Executable > > Maybe we could think about that in separate RFE. `false/true` constants are not used in executable args. Separate reserve_memory functions can be left for another RFE.
> src/hotspot/os/bsd/gc/z/zPhysicalMemoryBacking_bsd.cpp line 82: > >> 80: >> 81: // Reserve address space for backing memory >> 82: _base = (uintptr_t)os::reserve_memory(max_capacity, false, mtJavaHeap); > > Use !MemExec - this goes for all other places in the patch that fills in the `exec` parameter. Fixed. > src/hotspot/os/windows/os_windows.cpp line 3137: > >> 3135: // If reservation failed, return null >> 3136: if (p_buf == nullptr) return nullptr; >> 3137: MemTracker::record_virtual_memory_reserve((address)p_buf, size_of_reserve, CALLER_PC, mtNone); > > Why is this (and the other places in this file) using `mtNone`? Shouldn't it at least be using `mtInternal`? Fixed. > src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 1414: > >> 1412: assert(SafepointSynchronize::is_at_safepoint(), "safe iteration is only available during safepoints"); >> 1413: >> 1414: if (!_aux_bitmap_region_special && !os::commit_memory((char*)_aux_bitmap_region.start(), _aux_bitmap_region.byte_size(), false, mtJavaHeap)) { > > Why is this `mtJavaHeap` and not `mtGC`? Fixed. > src/hotspot/share/gc/z/zMarkStackAllocator.cpp line 107: > >> 105: >> 106: const uintptr_t shrink_start = _end - shrink_size; >> 107: os::uncommit_memory((char*)shrink_start, shrink_size, mtGC, false /* executable */); > > `uncommit_memory` places the order of executable and flags differently to what we have for `commit_memory_or_exit`. We might want to consider doing something about the order here. Fixed. > src/hotspot/share/memory/virtualspace.cpp line 71: > >> 69: ReservedSpace::ReservedSpace(size_t size, >> 70: size_t alignment, >> 71: size_t page_size, MEMFLAGS flag, > > I think this function was written to have one argument per line. You should probably keep the style. > > I'm also unsure why this param is put as the next to last param instead of the last, as we do in many other places. Put on a separate line. The last param is optional, and the flag is meant to be mandatory.
> src/hotspot/share/memory/virtualspace.cpp line 693: > >> 691: _special = false; >> 692: _executable = false; >> 693: _nmt_flag = mtNone; > > Weird indentation. Fixed. > src/hotspot/share/memory/virtualspace.hpp line 45: > >> 43: bool _special; >> 44: int _fd_for_heap; >> 45: MEMFLAGS _nmt_flag; > > Indentation is now off. Fixed. > src/hotspot/share/memory/virtualspace.hpp line 72: > >> 70: >> 71: inline MEMFLAGS nmt_flag() { return _nmt_flag; } >> 72: inline void set_nmt_flag(MEMFLAGS flag) { _nmt_flag = flag; } > > No need for the inline specifier here. Fixed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1561408632 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1561409363 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1561410373 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1561412485 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1561413236 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1561418594 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1561421020 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1561423257 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1561424255 From jsjolen at openjdk.org Thu Apr 11 18:08:44 2024 From: jsjolen at openjdk.org (Johan Sjölen) Date: Thu, 11 Apr 2024 18:08:44 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v2] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> <3id02uEHGujrMSIg0lKlstLRL0x2yTsn7lPWrmEwGBU=.747fa9c2-e782-462f-95ba-bc567944a502@github.com> Message-ID: On Thu, 11 Apr 2024 17:58:37 GMT, Afshin Zafari wrote: >> src/hotspot/os/bsd/gc/x/xPhysicalMemoryBacking_bsd.cpp line 81: >> >>> 79: >>> 80: // Reserve address space for backing
memory >>> 81: _base = (uintptr_t)os::reserve_memory(max_capacity, false, mtJavaHeap); >> >> I think this should be using !MemExec and not a raw 'false' argument. >> >> With that said, I really think it would be best if we actually split os::reserve_memory into two distinct functions: >> 1) os::reserve_memory(max_capacity, mtJavaHeap) // Non-executable memory >> 2) os::reserve_executable_memory(max_capacity, mtCode) // Executable >> >> Maybe we could think about that in separate RFE. > > `false/true` constants are not used in executable args. > separate reserve_memory functions can be left for another RFE. The executable argument really is only false in the original. Can we avoid making any functional changes here and keep those for separate PRs? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1561415984 From aboldtch at openjdk.org Thu Apr 11 18:12:41 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 11 Apr 2024 18:12:41 GMT Subject: RFR: 8329757: Crash with fatal error: DEBUG MESSAGE: Fast Unlock lock on stack [v3] In-Reply-To: <85TtcY-JoehYo9zA_htLF9NJKZ_Zt-ex0sFajYuxzU8=.9d1f0611-fa45-4341-ac88-258f5be0eb85@github.com> References: <14wFn2FPMeokML2Uo-xi1YVB0wHJcw4KXDvGojouU9Q=.8b108c98-4967-4445-8012-c1eaa742a75b@github.com> <85TtcY-JoehYo9zA_htLF9NJKZ_Zt-ex0sFajYuxzU8=.9d1f0611-fa45-4341-ac88-258f5be0eb85@github.com> Message-ID: On Thu, 11 Apr 2024 17:19:56 GMT, Vladimir Kozlov wrote: > Why lightweight locking verification code (old and new) is under `#ifndef PRODUCT` instead of `#ifdef ASSERT`? Asserts are enabled only in debug VM (when ASSERT is defined). I think it was mostly convenience and an oversight on my part. I did not think about the exact interactions. But it does look strange to call a function that does nothing but spin around in a loop for some cycles. I will separate out the new verification code and put it behind `ASSERT` instead. As for the old LockStack verification, I do not know.
I was not involved with this back when it was integrated. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18715#issuecomment-2050241132 From sgibbons at openjdk.org Thu Apr 11 18:17:01 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Thu, 11 Apr 2024 18:17:01 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v9] In-Reply-To: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: <-8Y7DVQtgJh8lec7MbHRv1jYx3VhURh0n5mpwgs6eSw=.13972572-1542-4280-a2ea-acf3bdc77352@github.com> > This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. > > Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. > > Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). > > [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Set memory test (#16) Add framework for other platforms. Moved fill_to_memory_atomic back to the .cpp from the .hpp in order to get 32-bit fixed. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/18555/files - new: https://git.openjdk.org/jdk/pull/18555/files/b0ac8577..95230e29 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=07-08 Stats: 179 lines in 14 files changed: 115 ins; 49 del; 15 mod Patch: https://git.openjdk.org/jdk/pull/18555.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18555/head:pull/18555 PR: https://git.openjdk.org/jdk/pull/18555 From jsjolen at openjdk.org Thu Apr 11 18:18:45 2024 From: jsjolen at openjdk.org (Johan Sjölen) Date: Thu, 11 Apr 2024 18:18:45 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v3] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: On Thu, 11 Apr 2024 16:21:52 GMT, Afshin Zafari wrote: >> `MEMFLAGS flag` is used to hold/show the type of the memory regions in NMT. Each call of NMT API requires a search through the list of memory regions. >> The Hotspot code reserves/commits/uncommits memory regions and later calls explicitly NMT API with a specific memory type (e.g., `mtGC`, `mtJavaHeap`) for that region. Therefore, there are two search in the list of regions per reserve/commit/uncommit operations, one for the operation and another for setting the type of the region. >> When the memory type is passed in during reserve/commit/uncommit operations, NMT can use it and avoid the extra search for setting the memory type. >> >> Tests: tiers1-5 passed on linux-x64, macosx-aarch64 and windows-x64 for debug and non-debug builds. > > Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: > > fixed shenandoah missed changes. Hi Afshin, Thank you for this! I found a couple of things.
@tstuefe, would you mind having a look at the Metaspace changes? src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 2267: > 2265: char* start = (char*) _bitmap_region.start() + off; > 2266: > 2267: if (!os::commit_memory(start, len, false)) { I think this should probably be `mtGC`. @shipilev, you don't happen to know whether this should be accounted under the Java heap? Thank you. src/hotspot/share/nmt/virtualMemoryTracker.cpp line 460: > 458: assert(_reserved_regions != nullptr, "Sanity check"); > 459: > 460: ReservedMemoryRegion rgn(addr, size, NativeCallStack::empty_stack(), flag); Instead, change the constructor so that it takes a flag? ```c++ ReservedMemoryRegion(address base, size_t size, MEMFLAGS flag) : VirtualMemoryRegion(base, size), _stack(NativeCallStack::empty_stack()), _flag(flag) { } ``` Or does that break somewhere else? ------------- PR Review: https://git.openjdk.org/jdk/pull/18745#pullrequestreview-1994902004 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1561430517 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1561443098 From aboldtch at openjdk.org Thu Apr 11 18:22:08 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 11 Apr 2024 18:22:08 GMT Subject: RFR: 8329757: Crash with fatal error: DEBUG MESSAGE: Fast Unlock lock on stack [v4] In-Reply-To: References: Message-ID: > `Deoptimization::relock_objects` may reorder locks within in the `LockStack` which are added inside the same vframe. This can be handled by the interpreter but if OSR has occurred C2 may observe this invalid order in the `LockStack`, which breaks its assumption leading to incorrect behaviour. > > This patch functionally makes sure that the LockStack is always consistent by always inflating eliminated locks when `Deoptimization::relock_objects` is called. > > It also adds verification code which checks that the LockStack is consistent with the lock order observed inside the deoptimized vframes.
> > Note: for leaf deoptimizations we have enough information to recreate a correct top of the LockStack with minimal inflations, however that should be a separate RFE. This only inflates eliminated locks so the worth of solving that may be minimal or even detrimental. > > Tests still running. Tier 1-7 done. Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: Change to ASSERT ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18715/files - new: https://git.openjdk.org/jdk/pull/18715/files/12d112bc..077b62af Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18715&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18715&range=02-03 Stats: 10 lines in 3 files changed: 5 ins; 1 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/18715.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18715/head:pull/18715 PR: https://git.openjdk.org/jdk/pull/18715 From shade at openjdk.org Thu Apr 11 18:25:42 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 11 Apr 2024 18:25:42 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v3] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: On Thu, 11 Apr 2024 18:09:08 GMT, Johan Sjölen wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> fixed shenandoah missed changes. > > src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 2267: > >> 2265: char* start = (char*) _bitmap_region.start() + off; >> 2266: >> 2267: if (!os::commit_memory(start, len, false)) { > > I think this should probably be `mtGC`. @shipilev, you don't happen to know whether this should be accounted under the Java heap? Thank you. This is bitmap slice, so it is not Java heap, it is `mtGC`.
In the sister method, `uncommit_bitmap_slice`, we do the right thing: `mtGC`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1561459688 From sgibbons at openjdk.org Thu Apr 11 18:26:58 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Thu, 11 Apr 2024 18:26:58 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v10] In-Reply-To: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: > This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. > > Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. > > Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). > > [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) Scott Gibbons has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 19 commits: - Merge master - Set memory test (#16) Add framework for other platforms. Moved fill_to_memory_atomic back to the .cpp from the .hpp in order to get 32-bit fixed. - Address review comments (#15) * Address review comments - Add movq to locate_operand - Oops - Fixed generate_fill when count > 0x80000000 - Fix Windows - Addressing review comments. - Remove dead code - Use non-sse fill (old left in) - ... 
and 9 more: https://git.openjdk.org/jdk/compare/31ee5108...41ffcc32 ------------- Changes: https://git.openjdk.org/jdk/pull/18555/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=09 Stats: 734 lines in 38 files changed: 680 ins; 5 del; 49 mod Patch: https://git.openjdk.org/jdk/pull/18555.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18555/head:pull/18555 PR: https://git.openjdk.org/jdk/pull/18555 From sgibbons at openjdk.org Thu Apr 11 18:26:58 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Thu, 11 Apr 2024 18:26:58 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v9] In-Reply-To: <-8Y7DVQtgJh8lec7MbHRv1jYx3VhURh0n5mpwgs6eSw=.13972572-1542-4280-a2ea-acf3bdc77352@github.com> References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> <-8Y7DVQtgJh8lec7MbHRv1jYx3VhURh0n5mpwgs6eSw=.13972572-1542-4280-a2ea-acf3bdc77352@github.com> Message-ID: <2VYJw_lhaOBzrrWVPTtbFwdXQbcVzpqlFrY2dYJErSE=.2440284f-286b-4e23-bb3c-f2f2ae745f84@github.com> On Thu, 11 Apr 2024 18:17:01 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. >> >> Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). >> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Set memory test (#16) > > Add framework for other platforms. Moved fill_to_memory_atomic back to the .cpp from the .hpp in order to get 32-bit fixed. 
I added the framework for setting unsafe access marks within all platforms, and fixed a bug with the Linux 32-bit runtime tests. Adding stub intrinsic for setMemory0 for other platforms should be easier now. Passes CI testing. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18555#issuecomment-2050267842 From sgibbons at openjdk.org Thu Apr 11 18:42:56 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Thu, 11 Apr 2024 18:42:56 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v11] In-Reply-To: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: <7yXL0LWE8zWff1vL2JL9exL6IaHrD1yKsyTLGzJg4Eo=.15b0f724-54e5-4ab7-8c22-b35766eb0bca@github.com> > This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. > > Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. > > Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). > > [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Fix whitespace error. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/18555/files - new: https://git.openjdk.org/jdk/pull/18555/files/41ffcc32..b99499a9 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=10 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=09-10 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18555.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18555/head:pull/18555 PR: https://git.openjdk.org/jdk/pull/18555 From richard.reingruber at sap.com Thu Apr 11 19:10:58 2024 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Thu, 11 Apr 2024 19:10:58 +0000 Subject: CFV: New HotSpot Group Member: Andrew Dinn In-Reply-To: References: Message-ID: Vote: yes Richard. From: hotspot-dev on behalf of Thomas Stuefe Date: Thursday, 11. April 2024 at 15:25 To: hotspot-dev at openjdk.org Subject: CFV: New HotSpot Group Member: Andrew Dinn Hi, I hereby nominate Andrew Dinn (adinn) to Membership in the HotSpot Group. Andrew is a well-known and respected member of the OpenJDK community. He has been a contributor since the early days of OpenJDK. The history of his contributions has been mangled by various SCM moves and repo consolidations over the years [1], but he was one of the original authors of the arm64 port ([2] shows 359 changes in the mercurial hotspot sub repository alone), contributed JEP 352 (support for NVM devices under byte buffers), and more recently has been active in the Graal and the Leyden projects. Votes are due by April 25, 2024. Only current Members of the HotSpot Group [3] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. For Lazy Consensus voting instructions, see [4]. 
Cheers, Thomas [1] https://github.com/openjdk/jdk/commits/master/?author=adinn [2] https://hg.openjdk.org/aarch64-port/jdk7u/hotspot [3] https://openjdk.org/census#members [4] https://openjdk.org/groups/#member-vote From dean.long at oracle.com Thu Apr 11 19:15:51 2024 From: dean.long at oracle.com (dean.long at oracle.com) Date: Thu, 11 Apr 2024 12:15:51 -0700 Subject: CFV: New HotSpot Group Member: Andrew Dinn In-Reply-To: References: Message-ID: <96fe90c5-2b0b-416c-976d-ac686556e6ed@oracle.com> Vote: yes From kvn at openjdk.org Thu Apr 11 19:21:42 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 11 Apr 2024 19:21:42 GMT Subject: RFR: 8329757: Crash with fatal error: DEBUG MESSAGE: Fast Unlock lock on stack [v4] In-Reply-To: References: Message-ID: On Thu, 11 Apr 2024 18:22:08 GMT, Axel Boldt-Christmas wrote: >> `Deoptimization::relock_objects` may reorder locks within in the `LockStack` which are added inside the same vframe. This can be handled by the interpreter but if OSR has occurred C2 may observe this invalid order in the `LockStack`, which breaks its assumption leading to incorrect behaviour. >> >> This patch functionally makes sure that the LockStack is always consistent by always inflating eliminated locks when `Deoptimization::relock_objects` is called. >> >> It also adds verification code which checks that the LockStack is consistent with the lock order observed inside the deoptimized vframes. >> >> Note: for leaf deoptimizations we have enough information to recreate a correct top of the LockStack with minimal inflations, however that should be a separate RFE. This only inflates eliminated locks so the worth of solving that may be minimal or even detrimental. >> >> Tests still running. Tier 1-7 done. > > Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: > > Change to ASSERT Good..
------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18715#pullrequestreview-1995090696 From aph at openjdk.org Thu Apr 11 19:51:11 2024 From: aph at openjdk.org (Andrew Haley) Date: Thu, 11 Apr 2024 19:51:11 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v12] In-Reply-To: References: Message-ID: > This PR is a redesign of subtype checking. > > The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. > > So what's changed, so that the old design should be replaced? > > Firstly, the computers of today aren't the computers of twenty years ago. It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. > > The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces). Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. > > Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch. > > However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions. > > The solution > ------------ > > We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. 
The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. > > We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bitmap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. If the bit is set, we have to do a little arithmetic and then consult the hash table. > > It works like this: > > > mov sub_klass, [& sub_klass-... Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: JDK-8180450: secondary_super_cache does not scale well ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18309/files - new: https://git.openjdk.org/jdk/pull/18309/files/9b98662a..1518b028 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18309&range=11 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18309&range=10-11 Stats: 965 lines in 6 files changed: 622 ins; 336 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/18309.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18309/head:pull/18309 PR: https://git.openjdk.org/jdk/pull/18309 From vladimir.x.ivanov at oracle.com Thu Apr 11 20:03:08 2024 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Thu, 11 Apr 2024 13:03:08 -0700 Subject: CFV: New HotSpot Group Member: Andrew Dinn In-Reply-To: References: Message-ID: Vote: yes Best regards, Vladimir Ivanov On 4/11/24 06:24, Thomas Stuefe wrote: > I hereby nominate Andrew Dinn (adinn) to Membership in the HotSpot Group.
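The hashed secondary-supers lookup described in the 8180450 RFR above can be modelled in a few lines. This is only a sketch under stated assumptions, not HotSpot code: the names `SecondarySupers`, `slot_of`, `count_bits`, `build` and `is_secondary_super` are hypothetical, integer IDs stand in for Klass pointers, the hash function is an arbitrary illustrative choice, and probing is assumed to wrap around at slot 63. It shows the three pieces the description names: a 64-slot table with linear probing, compression by dropping empty slots, and an occupancy bitmap used both as a quick "definitely absent" filter and to compute the index into the packed array.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model: IDs stand in for Klass pointers.
struct SecondarySupers {
    uint64_t bitmap = 0;           // bit i set => hash slot i is occupied
    std::vector<uint32_t> packed;  // occupied slots in slot order, empties removed
};

// Illustrative hash only (an assumption): maps an ID to a slot in 0..63.
inline unsigned slot_of(uint32_t id) { return (id * 0x9E3779B1u) >> 26; }

// Portable bit count; real code would likely use a hardware popcount instruction.
inline unsigned count_bits(uint64_t x) {
    unsigned n = 0;
    for (; x != 0; x &= x - 1) ++n;  // clears the lowest set bit each round
    return n;
}

// Builds the compressed table. Assumes distinct IDs and at most 64 of them.
SecondarySupers build(const std::vector<uint32_t>& supers) {
    assert(supers.size() <= 64);
    std::array<uint32_t, 64> table{};
    SecondarySupers s;
    for (uint32_t id : supers) {
        unsigned i = slot_of(id);
        while ((s.bitmap >> i) & 1) i = (i + 1) & 63;  // linear probing
        table[i] = id;
        s.bitmap |= uint64_t{1} << i;
    }
    for (unsigned i = 0; i < 64; ++i) {
        if ((s.bitmap >> i) & 1) s.packed.push_back(table[i]);
    }
    return s;
}

bool is_secondary_super(const SecondarySupers& s, uint32_t id) {
    unsigned i = slot_of(id);
    for (unsigned probes = 0; probes < 64; ++probes, i = (i + 1) & 63) {
        // A clear bit ends the probe sequence: the ID is definitely absent.
        if (!((s.bitmap >> i) & 1)) return false;
        // Index in the packed array = number of occupied slots below slot i.
        uint64_t below = s.bitmap & ((uint64_t{1} << i) - 1);
        if (s.packed[count_bits(below)] == id) return true;
    }
    return false;  // table full and no match
}
```

In this model a miss usually costs one bit test, which is the property the description relies on for absent interfaces; how the actual patch arranges the arithmetic and the probing is not shown in the truncated listing above.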
From azafari at openjdk.org Thu Apr 11 20:34:42 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Thu, 11 Apr 2024 20:34:42 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v3] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: On Thu, 11 Apr 2024 18:21:58 GMT, Aleksey Shipilev wrote: >> src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 2267: >> >>> 2265: char* start = (char*) _bitmap_region.start() + off; >>> 2266: >>> 2267: if (!os::commit_memory(start, len, false)) { >> >> I think this should probably be `mtGC`. @shipilev, you don't happen to know whether this should be accounted under the Java heap? Thank you. > > This is bitmap slice, so it is not Java heap, it is `mtGC`. In the sister method, `uncommit_bitmap_slice`, we do the right thing: `mtGC`. Fixed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1561641931 From azafari at openjdk.org Thu Apr 11 20:34:44 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Thu, 11 Apr 2024 20:34:44 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v2] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> <3id02uEHGujrMSIg0lKlstLRL0x2yTsn7lPWrmEwGBU=.747fa9c2-e782-462f-95ba-bc567944a502@github.com> Message-ID: On Thu, 11 Apr 2024 16:27:41 GMT, Stefan Karlsson wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> fixed missing change. 
> > src/hotspot/share/memory/metaspace.cpp line 592: > >> 590: // Fallback: reserve anywhere >> 591: log_debug(metaspace, map)("Trying anywhere..."); >> 592: result = os::reserve_memory_aligned(size, Metaspace::reserve_alignment(), false, mtMetaspace); > > It's unclear to me if some of these `mtMetaspace` should be `mtClass`. This comment applies to other places where we're setting up memory for the compressed class space. Anywhere compressed class is used, the flag is set to `mtClass`. > src/hotspot/share/memory/virtualspace.cpp line 366: > >> 364: ReservedSpace space; >> 365: space.initialize_members(base, size, alignment, page_size, special, executable); >> 366: space.set_nmt_flag(flag); > > Why is this calling a set_nmt_flag instead of making initialize_member take a flag? Fixed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1561644315 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1561645352 From azafari at openjdk.org Thu Apr 11 20:40:44 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Thu, 11 Apr 2024 20:40:44 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v3] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: <5xywkAFrVzkc3ZPUaxxUf4vn4uEpqvppSDHPdjnnzbY=.cb922cee-aef8-49b5-9490-1315da1299c0@github.com> On Thu, 11 Apr 2024 18:13:55 GMT, Johan Sjölen wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> fixed shenandoah missed changes. > > src/hotspot/share/nmt/virtualMemoryTracker.cpp line 460: > >> 458: assert(_reserved_regions != nullptr, "Sanity check"); >> 459: >> 460: ReservedMemoryRegion rgn(addr, size, NativeCallStack::empty_stack(), flag); > > Instead, change the constructor so that it takes a flag?
> > ```c++ > ReservedMemoryRegion(address base, size_t size, MEMFLAGS flag) : > VirtualMemoryRegion(base, size), _stack(NativeCallStack::empty_stack()), _flag(flag) { } > ``` > > Or does that break somewhere else? Fixed. No problem. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1561653418 From sviswanathan at openjdk.org Thu Apr 11 20:40:46 2024 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Thu, 11 Apr 2024 20:40:46 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v11] In-Reply-To: <7yXL0LWE8zWff1vL2JL9exL6IaHrD1yKsyTLGzJg4Eo=.15b0f724-54e5-4ab7-8c22-b35766eb0bca@github.com> References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> <7yXL0LWE8zWff1vL2JL9exL6IaHrD1yKsyTLGzJg4Eo=.15b0f724-54e5-4ab7-8c22-b35766eb0bca@github.com> Message-ID: On Thu, 11 Apr 2024 18:42:56 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. >> >> Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). >> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Fix whitespace error. src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 8343: > 8341: UnsafeCopyMemory::create_table(8); > 8342: } > 8343: Did you mean to initialize UnsafeSetMemory::_table here instead? src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 155: > 153: StubRoutines::_arrayof_jint_fill = generate_fill(T_INT, true, "arrayof_jint_fill"); > 154: > 155: // #ifdef _LP64 We could remove the #ifdef _LP64, #endif commented pair.
src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 735: > 733: > 734: if (MaxVectorSize == 64) { > 735: UnsafeCopyMemoryMark ucmm(this, !is_oop && !aligned, false, ucme_exit_pc); This is not related to Unsafe::setMemory? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561587296 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561606554 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561620598 From sgibbons at openjdk.org Thu Apr 11 21:00:45 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Thu, 11 Apr 2024 21:00:45 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v11] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> <7yXL0LWE8zWff1vL2JL9exL6IaHrD1yKsyTLGzJg4Eo=.15b0f724-54e5-4ab7-8c22-b35766eb0bca@github.com> Message-ID: On Thu, 11 Apr 2024 20:08:18 GMT, Sandhya Viswanathan wrote: >> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix whitespace error. > > src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 8343: > >> 8341: UnsafeCopyMemory::create_table(8); >> 8342: } >> 8343: > > Did you mean to initialize UnsafeSetMemory::_table here instead? Yes. Good catch. > src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 155: > >> 153: StubRoutines::_arrayof_jint_fill = generate_fill(T_INT, true, "arrayof_jint_fill"); >> 154: >> 155: // #ifdef _LP64 > > We could remove the #ifdef _LP64, #endif commented pair. Done. > src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 735: > >> 733: >> 734: if (MaxVectorSize == 64) { >> 735: UnsafeCopyMemoryMark ucmm(this, !is_oop && !aligned, false, ucme_exit_pc); > > This is not related to Unsafe::setMemory? No. Reviewing the code I saw this as a potential error, as `arraycopy_avx3_large` could cause a SIGBUS which wouldn't be caught. 
It conforms to the other instances of copy in the code. I think it was missed by the original developer. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561687577 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561688018 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561695561 From iklam at openjdk.org Thu Apr 11 21:17:43 2024 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 11 Apr 2024 21:17:43 GMT Subject: RFR: 8323900: Avoid calling os::init_random() in CDS static dump In-Reply-To: References: Message-ID: On Wed, 10 Apr 2024 16:31:08 GMT, Ioi Lam wrote: > The purpose of the PR is to avoid modifying the global JVM state while dumping the CDS archive. > > When updating the identity hashcode for archived Symbols, call `ArchiveBuilder::current()->entropy()` instead of `os::random()`. As a result, CDS no longer needs to call `os::init_random()` with a deterministic seed. > Thinking about this, since global entropy (archived object ihashes) sneak into archives whether we use local seeds or not, maybe we should not bother with such a patch. > > In other words, if global state affects the archive anyway, we may just as well roll with it. > > See #18735 In CDS, we intend to be as independent of the global JVM state as possible. For example, since [JDK-8296344](https://bugs.openjdk.org/browse/JDK-8296344), we no longer make a copy of the archived heap objects in the actual Java heap. The intention of this PR is the same -- the contents of archived Symbols should not depend on the value of the os::random() seed. In #18735 you found that some other contents of the CDS archive depend on the JVM's os::random() seed. That may be something we want to fix separately. In any case, that's not a reason to not proceed with this PR.
------------- PR Comment: https://git.openjdk.org/jdk/pull/18728#issuecomment-2050564497 From azafari at openjdk.org Thu Apr 11 21:25:55 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Thu, 11 Apr 2024 21:25:55 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v4] In-Reply-To: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: > `MEMFLAGS flag` is used to hold/show the type of the memory regions in NMT. Each call to the NMT API requires a search through the list of memory regions. > The Hotspot code reserves/commits/uncommits memory regions and later explicitly calls the NMT API with a specific memory type (e.g., `mtGC`, `mtJavaHeap`) for that region. Therefore, there are two searches in the list of regions per reserve/commit/uncommit operation, one for the operation and another for setting the type of the region. > When the memory type is passed in during reserve/commit/uncommit operations, NMT can use it and avoid the extra search for setting the memory type. > > Tests: tiers1-5 passed on linux-x64, macosx-aarch64 and windows-x64 for debug and non-debug builds. Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: review comments applied.
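The "two searches versus one" argument in the PR description above can be sketched in C++ as follows. This is only a toy model under stated assumptions: `ToyTracker`, `Region` and `MemType` are hypothetical names invented for the example, and the sorted vector stands in for whatever region list NMT actually maintains.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for memory type tags such as mtNone, mtGC, mtJavaHeap.
enum class MemType { None, GC, JavaHeap };

struct Region {
  std::uintptr_t base;
  std::size_t size;
  MemType type;
};

// Toy region tracker that counts how many list searches it performs.
class ToyTracker {
  std::vector<Region> _regions;  // kept sorted by base address
  int _searches = 0;

  // Index of the first region whose base is >= the given base.
  std::size_t find(std::uintptr_t base) {
    ++_searches;
    std::size_t i = 0;
    while (i < _regions.size() && _regions[i].base < base) ++i;
    return i;
  }

public:
  // Reserve with the type supplied up front: one search.
  void reserve(std::uintptr_t base, std::size_t size, MemType type) {
    std::size_t i = find(base);
    _regions.insert(_regions.begin() + static_cast<std::ptrdiff_t>(i),
                    Region{base, size, type});
  }

  // Tagging a region after the fact costs a second search.
  bool set_type(std::uintptr_t base, MemType type) {
    std::size_t i = find(base);
    if (i == _regions.size() || _regions[i].base != base) return false;
    _regions[i].type = type;
    return true;
  }

  int searches() const { return _searches; }
  const std::vector<Region>& regions() const { return _regions; }
};
```

Calling `reserve(..., MemType::None)` followed by `set_type(...)` performs two searches, while a single `reserve(..., MemType::GC)` reaches the same final state with one.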
------------- Changes: - all: https://git.openjdk.org/jdk/pull/18745/files - new: https://git.openjdk.org/jdk/pull/18745/files/b009556e..9d66735f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18745&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18745&range=02-03 Stats: 122 lines in 35 files changed: 5 ins; 1 del; 116 mod Patch: https://git.openjdk.org/jdk/pull/18745.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18745/head:pull/18745 PR: https://git.openjdk.org/jdk/pull/18745 From azafari at openjdk.org Thu Apr 11 21:25:55 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Thu, 11 Apr 2024 21:25:55 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v4] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: On Thu, 11 Apr 2024 21:23:08 GMT, Afshin Zafari wrote: >> `MEMFLAGS flag` is used to hold/show the type of the memory regions in NMT. Each call to the NMT API requires a search through the list of memory regions. >> The Hotspot code reserves/commits/uncommits memory regions and later explicitly calls the NMT API with a specific memory type (e.g., `mtGC`, `mtJavaHeap`) for that region. Therefore, there are two searches in the list of regions per reserve/commit/uncommit operation, one for the operation and another for setting the type of the region. >> When the memory type is passed in during reserve/commit/uncommit operations, NMT can use it and avoid the extra search for setting the memory type. >> >> Tests: tiers1-5 passed on linux-x64, macosx-aarch64 and windows-x64 for debug and non-debug builds. > > Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: > > review comments applied. All comments applied. Ready for another round of reviews.
------------- PR Review: https://git.openjdk.org/jdk/pull/18745#pullrequestreview-1995509280 From sgibbons at openjdk.org Thu Apr 11 21:47:01 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Thu, 11 Apr 2024 21:47:01 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v12] In-Reply-To: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: > This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. > > Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. > > Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). > > [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Addressing more review comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18555/files - new: https://git.openjdk.org/jdk/pull/18555/files/b99499a9..89db3eb6 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=11 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=10-11 Stats: 250 lines in 2 files changed: 85 ins; 97 del; 68 mod Patch: https://git.openjdk.org/jdk/pull/18555.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18555/head:pull/18555 PR: https://git.openjdk.org/jdk/pull/18555 From sspitsyn at openjdk.org Thu Apr 11 22:19:41 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 11 Apr 2024 22:19:41 GMT Subject: RFR: 8329674: JvmtiEnvThreadState::reset_current_location function should use JvmtiHandshake [v2] In-Reply-To: 
<10EU-jvOhZaur5uqCtnBJVodhqV8MKLzfI7IGBfo0cg=.348e71d1-0394-41fe-b511-3f3d7a35713c@github.com> References: <10EU-jvOhZaur5uqCtnBJVodhqV8MKLzfI7IGBfo0cg=.348e71d1-0394-41fe-b511-3f3d7a35713c@github.com> Message-ID: On Thu, 11 Apr 2024 16:22:44 GMT, Patricio Chilano Mateo wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> review: refactored to get rid of overloaded doit functions > > src/hotspot/share/prims/jvmtiEnvThreadState.cpp line 307: > >> 305: if (!JvmtiEnvBase::is_vthread_alive(target_h())) { >> 306: return; // _completed remains false. >> 307: } > > Do we need this? We already do this check in JvmtiHandshake::execute(). Good suggestion, thanks. Will remove it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18630#discussion_r1561812235 From sspitsyn at openjdk.org Thu Apr 11 22:27:42 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 11 Apr 2024 22:27:42 GMT Subject: RFR: 8329674: JvmtiEnvThreadState::reset_current_location function should use JvmtiHandshake [v2] In-Reply-To: <10EU-jvOhZaur5uqCtnBJVodhqV8MKLzfI7IGBfo0cg=.348e71d1-0394-41fe-b511-3f3d7a35713c@github.com> References: <10EU-jvOhZaur5uqCtnBJVodhqV8MKLzfI7IGBfo0cg=.348e71d1-0394-41fe-b511-3f3d7a35713c@github.com> Message-ID: On Thu, 11 Apr 2024 16:25:30 GMT, Patricio Chilano Mateo wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> review: refactored to get rid of overloaded doit functions > > src/hotspot/share/prims/jvmtiEnvThreadState.cpp line 309: > >> 307: } >> 308: ResourceMark rm; >> 309: javaVFrame *jvf = JvmtiEnvBase::get_vthread_jvf(target_h()); > > This method already handles both mounted and unmounted case, so do we need the first conditional above? Good suggestion, thanks. 
I was also thinking about it but decided to avoid the risk because of this check in `do_thread()`: if (!jt->is_exiting() && jt->has_last_Java_frame()) { It may be that this check is not important, or I can add an assert for this condition. Let me try and test it first. This kind of simplification looks important. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18630#discussion_r1561816846 From sspitsyn at openjdk.org Thu Apr 11 22:56:42 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 11 Apr 2024 22:56:42 GMT Subject: RFR: 8329674: JvmtiEnvThreadState::reset_current_location function should use JvmtiHandshake [v2] In-Reply-To: <10EU-jvOhZaur5uqCtnBJVodhqV8MKLzfI7IGBfo0cg=.348e71d1-0394-41fe-b511-3f3d7a35713c@github.com> References: <10EU-jvOhZaur5uqCtnBJVodhqV8MKLzfI7IGBfo0cg=.348e71d1-0394-41fe-b511-3f3d7a35713c@github.com> Message-ID: On Thu, 11 Apr 2024 16:26:51 GMT, Patricio Chilano Mateo wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> review: refactored to get rid of overloaded doit functions > > src/hotspot/share/prims/jvmtiEnvThreadState.cpp line 367: > >> 365: GetCurrentLocationClosure op; >> 366: JvmtiHandshake::execute(&op, &tlh, thread, thread_h); >> 367: > > Seems we are missing a JvmtiVTMSTransitionDisabler. Good question, thanks. The `JvmtiVTMSTransitionDisabler` is supposed to be installed in the caller's context if needed. However, it is not easy to make sure it is always the case. At least, I see a couple of contexts where the `JvmtiVTMSTransitionDisabler` is not being installed. But it is not clear if it is really needed there. Let me do some extra analysis there.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18630#discussion_r1561834120 From sviswanathan at openjdk.org Thu Apr 11 23:34:43 2024 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Thu, 11 Apr 2024 23:34:43 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v11] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> <7yXL0LWE8zWff1vL2JL9exL6IaHrD1yKsyTLGzJg4Eo=.15b0f724-54e5-4ab7-8c22-b35766eb0bca@github.com> Message-ID: On Thu, 11 Apr 2024 20:58:00 GMT, Scott Gibbons wrote: >> src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 735: >> >>> 733: >>> 734: if (MaxVectorSize == 64) { >>> 735: UnsafeCopyMemoryMark ucmm(this, !is_oop && !aligned, false, ucme_exit_pc); >> >> This is not related to Unsafe::setMemory? > > No. Reviewing the code I saw this as a potential error, as `arraycopy_avx3_large` could cause a SIGBUS which wouldn't be caught. It conforms to the other instances of copy in the code. I think it was missed by the original developer. Would be good to do it in a separate PR then. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561801825 From sviswanathan at openjdk.org Thu Apr 11 23:34:46 2024 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Thu, 11 Apr 2024 23:34:46 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v12] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: On Thu, 11 Apr 2024 21:47:01 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. >> >> Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). 
I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). >> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Addressing more review comments src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2751: > 2749: UnsafeSetMemoryMark usmm(this, true, true); > 2750: > 2751: __ generate_fill(T_BYTE, false, c_rarg0, c_rarg1, r11, rax, xmm0); We will be duplicating the code gen for generate_fill here? Could we not do a tail call to _jbyte_fill here and add UnsafeSetMemoryMark inside _jbyte_fill? src/hotspot/share/opto/library_call.cpp line 4952: > 4950: } > 4951: > 4952: bool LibraryCallKit::inline_unsafe_setMemory() { It will be good to add the signature of Unsafe.setMemory0 as a comment above line 4952. src/hotspot/share/opto/runtime.cpp line 783: > 781: fields[argp++] = TypeLong::LONG; // size > 782: fields[argp++] = Type::HALF; // size > 783: fields[argp++] = TypeInt::INT; // bytevalue Should this be TypeInt::BYTE? src/hotspot/share/runtime/sharedRuntime.cpp line 181: > 179: > 180: uint SharedRuntime::_unsafe_set_memory_ctr=0; > 181: Extra blank line before line 180 could be removed. src/hotspot/share/runtime/sharedRuntime.cpp line 1994: > 1992: if (_rethrow_ctr) tty->print_cr("%5u rethrow handler", _rethrow_ctr); > 1993: > 1994: if (_unsafe_set_memory_ctr) tty->print_cr("%5u unsafe set memorys", _unsafe_set_memory_ctr); Extra blank line before line 1994 could be removed. src/hotspot/share/runtime/sharedRuntime.hpp line 546: > 544: > 545: static uint _unsafe_set_memory_ctr; // Slow-path includes alignment checks > 546: Extra blank line before line 545 could be removed. test/jdk/sun/misc/CopyMemory.java line 214: > 212: random.setSeed(seed); > 213: System.out.println("Seed set to "+ seed); > 214: Looks like these lines were added for debugging, could be removed. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561853120 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561831702 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561820596 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561822279 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561822556 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561823861 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561828829 From sgibbons at openjdk.org Fri Apr 12 00:07:56 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Fri, 12 Apr 2024 00:07:56 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v13] In-Reply-To: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: <2W-3EZqHS-07qzZ4RS72u33Hav0LMRfeIG4QPAyvk10=.8a35e043-6803-42d5-8ea0-bff5378a8c50@github.com> > This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. > > Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. > > Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). 
> > [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Addressing yet more review comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18555/files - new: https://git.openjdk.org/jdk/pull/18555/files/89db3eb6..970c5751 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=12 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=11-12 Stats: 21 lines in 6 files changed: 6 ins; 11 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/18555.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18555/head:pull/18555 PR: https://git.openjdk.org/jdk/pull/18555 From sgibbons at openjdk.org Fri Apr 12 00:07:57 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Fri, 12 Apr 2024 00:07:57 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v11] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> <7yXL0LWE8zWff1vL2JL9exL6IaHrD1yKsyTLGzJg4Eo=.15b0f724-54e5-4ab7-8c22-b35766eb0bca@github.com> Message-ID: On Thu, 11 Apr 2024 22:00:01 GMT, Sandhya Viswanathan wrote: >> No. Reviewing the code I saw this as a potential error, as `arraycopy_avx3_large` could cause a SIGBUS which wouldn't be caught. It conforms to the other instances of copy in the code. I think it was missed by the original developer. > > Would be good to do it in a separate PR then. Removed. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561866645 From sgibbons at openjdk.org Fri Apr 12 00:07:58 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Fri, 12 Apr 2024 00:07:58 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v12] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: On Thu, 11 Apr 2024 23:30:07 GMT, Sandhya Viswanathan wrote: >> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: >> >> Addressing more review comments > > src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2751: > >> 2749: UnsafeSetMemoryMark usmm(this, true, true); >> 2750: >> 2751: __ generate_fill(T_BYTE, false, c_rarg0, c_rarg1, r11, rax, xmm0); > > We will be duplicating the code gen for generate_fill here? Could we not do a tail call to _jbyte_fill here and add UnsafeSetMemoryMark inside _jbyte_fill? It would not be appropriate to add set memory marks to the existing _jbyte_fill as it is being used by other routines, and the effect of the mark will be very hard to track down (if any). Are you *sure* we want to do that? > src/hotspot/share/opto/library_call.cpp line 4952: > >> 4950: } >> 4951: >> 4952: bool LibraryCallKit::inline_unsafe_setMemory() { > > It will be good to add the signature of Unsafe.setMemory0 as a comment above line 4952. Done > src/hotspot/share/opto/runtime.cpp line 783: > >> 781: fields[argp++] = TypeLong::LONG; // size >> 782: fields[argp++] = Type::HALF; // size >> 783: fields[argp++] = TypeInt::INT; // bytevalue > > Should this be TypeInt::BYTE? Should be TypeInt::UBYTE. Changed. > src/hotspot/share/runtime/sharedRuntime.cpp line 181: > >> 179: >> 180: uint SharedRuntime::_unsafe_set_memory_ctr=0; >> 181: > > Extra blank line before line 180 could be removed. 
Done > src/hotspot/share/runtime/sharedRuntime.cpp line 1994: > >> 1992: if (_rethrow_ctr) tty->print_cr("%5u rethrow handler", _rethrow_ctr); >> 1993: >> 1994: if (_unsafe_set_memory_ctr) tty->print_cr("%5u unsafe set memorys", _unsafe_set_memory_ctr); > > Extra blank line before line 1994 could be removed. Done. > src/hotspot/share/runtime/sharedRuntime.hpp line 546: > >> 544: >> 545: static uint _unsafe_set_memory_ctr; // Slow-path includes alignment checks >> 546: > > Extra blank line before line 545 could be removed. Done. > test/jdk/sun/misc/CopyMemory.java line 214: > >> 212: random.setSeed(seed); >> 213: System.out.println("Seed set to "+ seed); >> 214: > > Looks like these lines were added for debugging, could be removed. Yes, but I believe we should adopt this for the future since reproducing random test failures is extremely difficult without knowing the seed of the RNG. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561867359 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561867575 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561867857 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561867923 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561868084 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561868158 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561868630 From sgibbons at openjdk.org Fri Apr 12 00:07:58 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Fri, 12 Apr 2024 00:07:58 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v12] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: On Fri, 12 Apr 2024 00:03:10 GMT, Scott Gibbons wrote: >> test/jdk/sun/misc/CopyMemory.java line 214: >> >>> 212: random.setSeed(seed); >>> 213: System.out.println("Seed set to "+ seed); >>> 214: 
>> >> Looks like these lines were added for debugging, could be removed. > > Yes, but I believe we should adopt this for the future since reproducing random test failures is extremely difficult without knowing the seed of the RNG. Removed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561868727 From sviswanathan at openjdk.org Fri Apr 12 00:28:50 2024 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Fri, 12 Apr 2024 00:28:50 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v13] In-Reply-To: <2W-3EZqHS-07qzZ4RS72u33Hav0LMRfeIG4QPAyvk10=.8a35e043-6803-42d5-8ea0-bff5378a8c50@github.com> References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> <2W-3EZqHS-07qzZ4RS72u33Hav0LMRfeIG4QPAyvk10=.8a35e043-6803-42d5-8ea0-bff5378a8c50@github.com> Message-ID: On Fri, 12 Apr 2024 00:07:56 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. >> >> Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). >> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Addressing yet more review comments src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2504: > 2502: Label L_exit, L_fillQuadwords, L_fillDwords, L_fillBytes; > 2503: > 2504: setup_arg_regs(3); A comment stating the placement of dest, size, and byteVal after call to setup_arg_regs() would be very helpful. 
src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2521: > 2519: const Register byteVal = rdx; > 2520: > 2521: // Propagate byte to full Register The comment refers to lines 2524-2526, please move it down. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561873770 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561871872 From sviswanathan at openjdk.org Fri Apr 12 00:28:50 2024 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Fri, 12 Apr 2024 00:28:50 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v13] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> <2W-3EZqHS-07qzZ4RS72u33Hav0LMRfeIG4QPAyvk10=.8a35e043-6803-42d5-8ea0-bff5378a8c50@github.com> Message-ID: On Fri, 12 Apr 2024 00:10:22 GMT, Sandhya Viswanathan wrote: >> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: >> >> Addressing yet more review comments > > src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2521: > >> 2519: const Register byteVal = rdx; >> 2520: >> 2521: // Propagate byte to full Register > > The comment refers to lines 2524-2526, please move it down. Still continuing to look through StubGenerator::generate_unsafe_setmemory(), more comments to come. Thank you for your patience. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561878666 From kim.barrett at oracle.com Fri Apr 12 00:29:22 2024 From: kim.barrett at oracle.com (Kim Barrett) Date: Fri, 12 Apr 2024 00:29:22 +0000 Subject: CFV: New HotSpot Group Member: Andrew Dinn In-Reply-To: References: Message-ID: <6704C583-30A7-4251-A6C7-AF89E9504B1C@oracle.com> vote: yes > On Apr 11, 2024, at 9:24 AM, Thomas Stuefe wrote: > > Hi, > > I hereby nominate Andrew Dinn (adinn) to Membership in the HotSpot Group. > > Andrew is a well-known and respected member of the OpenJDK community. 
He has been a contributor since the early days of OpenJDK. > > The history of his contributions has been mangled by various SCM moves and repo consolidations over the years [1], but he was one of the original authors of the arm64 port ([2] shows 359 changes in the mercurial hotspot sub repository alone), contributed JEP 352 (support for NVM devices under byte buffers), and more recently has been active in the Graal and the Leyden projects. > > Votes are due by April 25, 2024. > > Only current Members of the HotSpot Group [3] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [4]. > > Cheers, Thomas > > [1] https://github.com/openjdk/jdk/commits/master/?author=adinn > [2] https://hg.openjdk.org/aarch64-port/jdk7u/hotspot > [3] https://openjdk.org/census#members > [4] https://openjdk.org/groups/#member-vote > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From sviswanathan at openjdk.org Fri Apr 12 00:28:51 2024 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Fri, 12 Apr 2024 00:28:51 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v12] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: On Fri, 12 Apr 2024 00:00:38 GMT, Scott Gibbons wrote: >> src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2751: >> >>> 2749: UnsafeSetMemoryMark usmm(this, true, true); >>> 2750: >>> 2751: __ generate_fill(T_BYTE, false, c_rarg0, c_rarg1, r11, rax, xmm0); >> >> We will be duplicating the code gen for generate_fill here? Could we not do a tail call to _jbyte_fill here and add UnsafeSetMemoryMark inside _jbyte_fill? 
> > It would not be appropriate to add set memory marks to the existing _jbyte_fill as it is being used by other routines, and the effect of the mark will be very hard to track down (if any). > > Are you *sure* we want to do that? Yes we want to do that. It is all guarded by thread->doing_unsafe_access() which is only true when we are getting to this code from unsafe. Similar technique is used in copyMemory as well. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1561877858 From yangfei at iscas.ac.cn Fri Apr 12 00:34:01 2024 From: yangfei at iscas.ac.cn (yangfei at iscas.ac.cn) Date: Fri, 12 Apr 2024 08:34:01 +0800 (GMT+08:00) Subject: CFV: New HotSpot Group Member: Andrew Dinn In-Reply-To: References: Message-ID: <5ff63c90.4045.18ecfb9c349.Coremail.yangfei@iscas.ac.cn> Vote: yes -----Original Messages----- From:"Thomas Stuefe" Sent Time:2024-04-11 21:24:18 (Thursday) To: hotspot-dev at openjdk.org Cc: Subject: CFV: New HotSpot Group Member: Andrew Dinn Hi, I hereby nominate Andrew Dinn (adinn) to Membership in the HotSpot Group. Andrew is a well-known and respected member of the OpenJDK community. He has been a contributor since the early days of OpenJDK. The history of his contributions has been mangled by various SCM moves and repo consolidations over the years [1], but he was one of the original authors of the arm64 port ([2] shows 359 changes in the mercurial hotspot sub repository alone), contributed JEP 352 (support for NVM devices under byte buffers), and more recently has been active in the Graal and the Leyden projects. Votes are due by April 25, 2024. Only current Members of the HotSpot Group [3] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. For Lazy Consensus voting instructions, see [4]. 
Cheers, Thomas [1] https://github.com/openjdk/jdk/commits/master/?author=adinn [2] https://hg.openjdk.org/aarch64-port/jdk7u/hotspot [3] https://openjdk.org/census#members [4] https://openjdk.org/groups/#member-vote From sspitsyn at openjdk.org Fri Apr 12 01:24:52 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 12 Apr 2024 01:24:52 GMT Subject: RFR: 8329674: JvmtiEnvThreadState::reset_current_location function should use JvmtiHandshake [v2] In-Reply-To: References: <10EU-jvOhZaur5uqCtnBJVodhqV8MKLzfI7IGBfo0cg=.348e71d1-0394-41fe-b511-3f3d7a35713c@github.com> Message-ID: <6hM-85WUal6tHr5lcHP021vbHFdWG-RAeJrRv-dxBw0=.914dc93d-bea2-40f0-b853-b5b8c8009c41@github.com> On Thu, 11 Apr 2024 22:54:16 GMT, Serguei Spitsyn wrote: >> src/hotspot/share/prims/jvmtiEnvThreadState.cpp line 367: >> >>> 365: GetCurrentLocationClosure op; >>> 366: JvmtiHandshake::execute(&op, &tlh, thread, thread_h); >>> 367: >> >> Seems we are missing a JvmtiVTMSTransitionDisabler. > > Good question, thanks. > The `JvmtiVTMSTransitionDisabler` is supposed to be installed in the caller's context if needed. > However, it is not easy to make sure it is always the case. > At least, I see a couple of contexts when the `JvmtiVTMSTransitionDisabler` is not being installed. > But it is not clear if it is really needed there. Let me do some extra analysis there. Okay. The class `GetCurrentLocationClosure` is used only by `reset_current_location`. It is called for the SINGLE_STEP and BREAKPOINT event types, as the following assert is placed at the function start: void JvmtiEnvThreadState::reset_current_location(jvmtiEvent event_type, bool enabled) { assert(event_type == JVMTI_EVENT_SINGLE_STEP || event_type == JVMTI_EVENT_BREAKPOINT, "must be single-step or breakpoint event"); . . . Also, these are the only two places where this function is called: JvmtiEventControllerPrivate::recompute_env_thread_enabled(JvmtiEnvThreadState* ets, JvmtiThreadState* state) { . . .
if (changed & SINGLE_STEP_BIT) { ets->reset_current_location(JVMTI_EVENT_SINGLE_STEP, (now_enabled & SINGLE_STEP_BIT) != 0); } if (changed & BREAKPOINT_BIT) { ets->reset_current_location(JVMTI_EVENT_BREAKPOINT, (now_enabled & BREAKPOINT_BIT) != 0); } The `reset_current_location` function is called in the context of `SetEventNotificationMode`, where a `JvmtiVTMSTransitionDisabler` is present. Theoretically, it can also be triggered by `SetEventCallbacks` (if the callbacks are for the SINGLE_STEP or BREAKPOINT event type). But it also has a `JvmtiVTMSTransitionDisabler` in place: JvmtiEnv::SetEventCallbacks(const jvmtiEventCallbacks* callbacks, jint size_of_callbacks) { JvmtiVTMSTransitionDisabler disabler; JvmtiEventController::set_event_callbacks(this, callbacks, size_of_callbacks); return JVMTI_ERROR_NONE; } /* end SetEventCallbacks */ ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18630#discussion_r1561903953 From vlivanov at openjdk.org Fri Apr 12 02:26:44 2024 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Fri, 12 Apr 2024 02:26:44 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v12] In-Reply-To: References: Message-ID: On Thu, 11 Apr 2024 19:51:11 GMT, Andrew Haley wrote: >> This PR is a redesign of subtype checking. >> >> The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. >> >> So what's changed, so that the old design should be replaced? >> >> Firstly, the computers of today aren't the computers of twenty years ago. It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider.
>> >> The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces). Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. >> >> Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch. >> >> However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions. >> >> The solution >> ------------ >> >> We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. >> >> We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bitmap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. If the bit is set, we have to do a little arithmetic and then consult the hash table. >> >> It works like th...
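Since the quoted description is cut off at this point, the following is a minimal C++ sketch of the general idea it describes: a 64-slot table with linear probing, stored compressed, plus an occupancy bitmap that gives a fast negative answer. It is an illustration under assumptions, not the code from the PR: `CompressedTable`, `home_slot()` and the multiplicative hash are hypothetical, keys are plain integers rather than `Klass*`, and the probing details of the real implementation may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Portable population count.
static int popcount64(uint64_t x) {
  int n = 0;
  while (x != 0) { x &= x - 1; ++n; }
  return n;
}

// Hypothetical hash: top 6 bits of a multiplicative hash, so a slot in [0, 64).
static unsigned home_slot(uint32_t key) {
  return (key * 2654435761u) >> 26;
}

struct CompressedTable {
  uint64_t bitmap = 0;           // bit s set <=> slot s of the 64-slot table is occupied
  std::vector<uint32_t> packed;  // occupied slots in slot order, empty slots removed

  // Position of slot s in the packed array: number of occupied slots below s.
  static int packed_index(uint64_t bitmap, unsigned s) {
    return popcount64(bitmap & ((uint64_t(1) << s) - 1));
  }

  // Build from at most 64 distinct keys using linear probing with wrap-around.
  static CompressedTable build(const std::vector<uint32_t>& keys) {
    assert(keys.size() <= 64);
    uint32_t slots[64] = {0};
    CompressedTable t;
    for (uint32_t k : keys) {
      unsigned s = home_slot(k);
      while (t.bitmap & (uint64_t(1) << s)) s = (s + 1) & 63;
      slots[s] = k;
      t.bitmap |= uint64_t(1) << s;
    }
    for (unsigned s = 0; s < 64; ++s) {
      if (t.bitmap & (uint64_t(1) << s)) t.packed.push_back(slots[s]);
    }
    return t;
  }

  bool contains(uint32_t key) const {
    unsigned s = home_slot(key);
    for (int probes = 0; probes < 64; ++probes) {
      // A clear bit ends the probe sequence: the key cannot be further along.
      if ((bitmap & (uint64_t(1) << s)) == 0) return false;
      if (packed[packed_index(bitmap, s)] == key) return true;
      s = (s + 1) & 63;
    }
    return false;  // table completely full and key not found
  }
};
```

In this sketch the common negative case costs one bit test on `bitmap`, and `packed` remains an ordinary gap-free array that a linear scan could still walk, which mirrors the two properties the description emphasizes.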
> > Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: > > JDK-8180450: secondary_super_cache does not scale well Testing results (hs-tier1 - hs-tier6) are clean (w/ `-XX:-InlineSecondarySupersTest` and `-XX:+InlineSecondarySupersTest`). There's one build failure in GHA (minimal VM build on linux-x64) because InlineSecondarySupersTest is a C2-only flag. Also, since the stubs are for compiler usage, any particular reason to generate them in `StubGenerator::generate_final_stubs()`? `StubGenerator::generate_compiler_stubs()` looks like a better fit for the job. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18309#issuecomment-2050850776 From jzhu at openjdk.org Fri Apr 12 02:31:43 2024 From: jzhu at openjdk.org (Joshua Zhu) Date: Fri, 12 Apr 2024 02:31:43 GMT Subject: RFR: 8326541: [AArch64] ZGC C2 load barrier stub considers the length of live registers when spilling registers [v4] In-Reply-To: References: Message-ID: On Thu, 11 Apr 2024 14:54:09 GMT, Erik Österlund wrote: >> Joshua Zhu has updated the pull request incrementally with one additional commit since the last revision: >> >> Add more output for easy debugging once the jtreg test case fails > > This looks good to me and seems to follow a similar design to what I did on x86_64 vectors. Thanks for doing this! Thanks a lot for the review! @fisk ------------- PR Comment: https://git.openjdk.org/jdk/pull/17977#issuecomment-2050854420 From gli at openjdk.org Fri Apr 12 03:39:43 2024 From: gli at openjdk.org (Guoxiong Li) Date: Fri, 12 Apr 2024 03:39:43 GMT Subject: RFR: 8329962: Remove CardTable::invalidate In-Reply-To: References: Message-ID: On Tue, 9 Apr 2024 13:44:22 GMT, Albert Mingkun Yang wrote: > Simple converting redundant if-check to assert. Looks good. ------------- Marked as reviewed by gli (Committer).
PR Review: https://git.openjdk.org/jdk/pull/18696#pullrequestreview-1995810344 From gli at openjdk.org Fri Apr 12 03:50:43 2024 From: gli at openjdk.org (Guoxiong Li) Date: Fri, 12 Apr 2024 03:50:43 GMT Subject: RFR: 8329962: Remove CardTable::invalidate In-Reply-To: References: Message-ID: On Tue, 9 Apr 2024 13:44:22 GMT, Albert Mingkun Yang wrote: > Simple converting redundant if-check to assert. The `CardTable::invalidate` can dirty a memory region cross generational boundary. But actually, we always only want to dirty the regions of one of the two generations instead of crossing them. So it is good to remove `CardTable::invalidate`. Is my understanding above right? ------------- PR Comment: https://git.openjdk.org/jdk/pull/18696#issuecomment-2050915367 From duke at openjdk.org Fri Apr 12 05:45:12 2024 From: duke at openjdk.org (kuaiwei) Date: Fri, 12 Apr 2024 05:45:12 GMT Subject: RFR: 8325821: [REDO] use "dmb.ishst+dmb.ishld" for release barrier [v6] In-Reply-To: References: Message-ID: <5pZbMyH0BDsJHyJe5JAfM111ZUf9u80YQCbc0n_aKWg=.733e6302-1abb-4866-9409-86526dd144c4@github.com> > The origin patch for https://bugs.openjdk.org/browse/JDK-8324186 has 2 issues: > 1 It show regression in some platform, like Apple silicon in mac os > 2 Can not handle instruction sequence like "dmb.ishld; dmb.ishst; dmb.ishld; dmb.ishld" > > It can be fixed by: > 1 Enable AlwaysMergeDMB by default, only disable it in architecture we can see performance improvement (N1 or N2) > 2 Check the special pattern and merge the subsequent dmb. > > It also fix a bug when code buffer is expanding, st/ld/dmb can not be merged. I added unit tests for these. > > This patch still has a unhandled case. Insts like "dmb.ishld; dmb.ishst; dmb.ish", it will merge the last 2 instructions and can not merge all three. Because when emitting dmb.ish, if merge all previous dmbs, the code buffer will shrink the size. I think it may break some resumption and think it's not a common pattern. 
> > - Update: > After discussion, I made a new implementation based on finite state machine for merging instruction. The mergeable instruction will be pending in fsm until next unmergeable instruction. kuaiwei has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains six commits: - Cleanup unused _last_label_code - Simplify code - Fix cross build error - Move fsm to CodeBuffer - Add fsm for merging - 8328876: Rework [AArch64] Use "dmb.ishst + dmb.ishld" for release barrier ------------- Changes: https://git.openjdk.org/jdk/pull/18467/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18467&range=05 Stats: 625 lines in 21 files changed: 590 ins; 6 del; 29 mod Patch: https://git.openjdk.org/jdk/pull/18467.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18467/head:pull/18467 PR: https://git.openjdk.org/jdk/pull/18467 From stuefe at openjdk.org Fri Apr 12 05:55:41 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 12 Apr 2024 05:55:41 GMT Subject: RFR: 8323900: Avoid calling os::init_random() in CDS static dump In-Reply-To: References: Message-ID: On Thu, 11 Apr 2024 21:14:39 GMT, Ioi Lam wrote: > > Thinking about this, since global entropy (archived object ihashes) sneak into archives whether we use local seeds or not, maybe we should not bother with such a patch. > > In other words, if global state affects the archive anyway, we may just as well roll with it. > > See #18735 > > In CDS, we intend to be as much independent of the global JVM state as possible. For example, since [JDK-8296344](https://bugs.openjdk.org/browse/JDK-8296344), we no longer make a copy of the the archived heap objects in the actual Java heap. > > The intention of this PR is the same -- the contents of archived Symbols should not depend on the value of the os::random() seed. > > In #18735 you found that some other contents of the CDS archive depend on the JVM's os::random() seed. 
That may be something we want to fix separately. In any case, that's not a reason to not proceed with this PR. Okay. Thinking about this, an isolated seed is always better, since it provides safety against concurrent uses of os::random (which can happen even at initialization time). So my approval stands. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18728#issuecomment-2051028842 From aboldtch at openjdk.org Fri Apr 12 06:06:47 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Fri, 12 Apr 2024 06:06:47 GMT Subject: RFR: 8329757: Crash with fatal error: DEBUG MESSAGE: Fast Unlock lock on stack [v4] In-Reply-To: References: Message-ID: On Thu, 11 Apr 2024 18:22:08 GMT, Axel Boldt-Christmas wrote: >> `Deoptimization::relock_objects` may reorder locks within in the `LockStack` which are added inside the same vframe. This can be handled by the interpreter but if OSR has occurred C2 may observe this invalid order in the `LockStack`, which breaks its assumption leading to incorrect behaviour. >> >> This patch functionally makes sure that the LockStack is always consistent by always inflating eliminated locks when `Deoptimization::relock_objects` is called. >> >> It also adds verification code which checks that the LockStack is consistent with the lock order observed inside the deoptimized vframes. >> >> Note: for leaf deoptimizations we have enough information to recreate a correct top of the LockStack with minimal inflations, however that should be a separate RFE. This only inflates eliminated locks so the worth of solving that may be minimal or even detrimental. >> >> Tests still running. Tier 1-7 done. > > Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: > > Change to ASSERT Thanks for the reviews. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18715#issuecomment-2051039646 From aboldtch at openjdk.org Fri Apr 12 06:06:48 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Fri, 12 Apr 2024 06:06:48 GMT Subject: Integrated: 8329757: Crash with fatal error: DEBUG MESSAGE: Fast Unlock lock on stack In-Reply-To: References: Message-ID: On Wed, 10 Apr 2024 12:11:17 GMT, Axel Boldt-Christmas wrote: > `Deoptimization::relock_objects` may reorder locks within in the `LockStack` which are added inside the same vframe. This can be handled by the interpreter but if OSR has occurred C2 may observe this invalid order in the `LockStack`, which breaks its assumption leading to incorrect behaviour. > > This patch functionally makes sure that the LockStack is always consistent by always inflating eliminated locks when `Deoptimization::relock_objects` is called. > > It also adds verification code which checks that the LockStack is consistent with the lock order observed inside the deoptimized vframes. > > Note: for leaf deoptimizations we have enough information to recreate a correct top of the LockStack with minimal inflations, however that should be a separate RFE. This only inflates eliminated locks so the worth of solving that may be minimal or even detrimental. > > Tests still running. Tier 1-7 done. This pull request has now been integrated. 
Changeset: e45fea5a Author: Axel Boldt-Christmas URL: https://git.openjdk.org/jdk/commit/e45fea5a801ac09c3d572ac07d6179e80c422942 Stats: 149 lines in 4 files changed: 148 ins; 0 del; 1 mod 8329757: Crash with fatal error: DEBUG MESSAGE: Fast Unlock lock on stack Reviewed-by: pchilanomate, kvn ------------- PR: https://git.openjdk.org/jdk/pull/18715 From mbaesken at openjdk.org Fri Apr 12 06:50:42 2024 From: mbaesken at openjdk.org (Matthias Baesken) Date: Fri, 12 Apr 2024 06:50:42 GMT Subject: RFR: 8329605: hs errfile generic events - introduce sections for Frequent/NotFrequent Events [v3] In-Reply-To: References: <5GN6AKI0ud3DgU7-RX2-12eu87Me8jhzKXA-L8BwR04=.384ddd36-1a8f-40ac-9387-5d8d97c37fe3@github.com> Message-ID: On Thu, 11 Apr 2024 16:00:05 GMT, Thomas Stuefe wrote: >My proposal would be either to drop these memory protection events (do we need them? or are they remnants of some old >support issues?) or to put them into a 'memprot' section or similar. A separate memprotect section would be good I can add this. I think usually these protection events appear around/after threads are added. I think we had some issues with these mem protections in the past so I would be cautious to completely remove them. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18626#issuecomment-2051097922 From stefan.karlsson at oracle.com Fri Apr 12 06:53:16 2024 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Fri, 12 Apr 2024 08:53:16 +0200 Subject: CFV: New HotSpot Group Member: Andrew Dinn In-Reply-To: References: Message-ID: <3202d6ef-ef6f-45d5-97eb-833cf415d93f@oracle.com> Vote: yes StefanK On 2024-04-11 15:24, Thomas Stuefe wrote: > Hi, > > I hereby nominate Andrew Dinn (adinn) to Membership in the HotSpot Group. > > Andrew is a well-known and respected member of the OpenJDK community. > He has been a contributor since the early days of OpenJDK. 
> > The history of his contributions has been mangled by various SCM moves > and repo consolidations over the years [1], but he was one of the > original authors of the arm64 port ([2] shows 359 changes in the > mercurial hotspot sub repository alone), contributed JEP 352 (support > for NVM devices under byte buffers), and more recently has been active > in the Graal and the Leyden projects. > > Votes are due by April 25, 2024. > > Only current Members of the HotSpot Group [3] are eligible to vote on > this nomination. Votes must be cast in the open by replying to this > mailing list. > > For Lazy Consensus voting instructions, see [4]. > > Cheers, Thomas > > [1] https://github.com/openjdk/jdk/commits/master/?author=adinn > [2] https://hg.openjdk.org/aarch64-port/jdk7u/hotspot > [3] https://openjdk.org/census#members > [4] https://openjdk.org/groups/#member-vote -------------- next part -------------- An HTML attachment was scrubbed... URL: From ayang at openjdk.org Fri Apr 12 07:00:42 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 12 Apr 2024 07:00:42 GMT Subject: RFR: 8329962: Remove CardTable::invalidate In-Reply-To: References: Message-ID: <84k6Vc2GkKCnsVLDHvg7wBRkjt8vMteTFTr_A_a60Lg=.51d860cc-393d-44d5-b2bb-69ecb4b6eeda@github.com> On Tue, 9 Apr 2024 13:44:22 GMT, Albert Mingkun Yang wrote: > Simple converting redundant if-check to assert. True; the passed-in mem-region corresponds to an obj, which should never cross the gen-boundary.
------------- PR Comment: https://git.openjdk.org/jdk/pull/18696#issuecomment-2051110756 From stefank at openjdk.org Fri Apr 12 07:04:43 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 12 Apr 2024 07:04:43 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v2] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> <3id02uEHGujrMSIg0lKlstLRL0x2yTsn7lPWrmEwGBU=.747fa9c2-e782-462f-95ba-bc567944a502@github.com> Message-ID: On Thu, 11 Apr 2024 18:02:05 GMT, Johan Sjölen wrote: >> `false/true` constants are not used in executable args. >> separate reserve_memory functions can be left for another RFE. > > The executable argument really is only false in the original, can we keep this from doing any functional changes here and keep that to separate PR:s? I'm not sure I understand. Are you proposing something else than what I proposed that we could do in a separate RFE?
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1562106153 From stefank at openjdk.org Fri Apr 12 07:04:44 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 12 Apr 2024 07:04:44 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v2] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> <3id02uEHGujrMSIg0lKlstLRL0x2yTsn7lPWrmEwGBU=.747fa9c2-e782-462f-95ba-bc567944a502@github.com> Message-ID: On Thu, 11 Apr 2024 16:40:59 GMT, Afshin Zafari wrote: >> src/hotspot/os/linux/os_linux.cpp line 4684: >> >>> 4682: char* hint = (char*)(os::Linux::initial_thread_stack_bottom() - >>> 4683: (StackOverflow::stack_guard_zone_size() + page_size)); >>> 4684: char* codebuf = os::attempt_reserve_memory_at(hint, page_size, false, mtInternal); >> >> Should these be `mtInternal` or is there a `mtStack` that is more suitable? > > In line 4699, a few lines later, the original developer used `mtInternal`. I copied it here too. OK. Then this is fine for now. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1562106768 From gli at openjdk.org Fri Apr 12 07:33:44 2024 From: gli at openjdk.org (Guoxiong Li) Date: Fri, 12 Apr 2024 07:33:44 GMT Subject: RFR: 8329962: Remove CardTable::invalidate In-Reply-To: References: Message-ID: <-xLgCNXd5LRjsrQyE83wp_gIuWibQiWc4LF2Tg6Vkco=.17f6822f-4833-43f1-83bc-c8c8c225426d@github.com> On Tue, 9 Apr 2024 13:44:22 GMT, Albert Mingkun Yang wrote: > Simple converting redundant if-check to assert. Marked as reviewed by gli (Committer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/18696#pullrequestreview-1996085236 From ayang at openjdk.org Fri Apr 12 07:40:48 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 12 Apr 2024 07:40:48 GMT Subject: RFR: 8329962: Remove CardTable::invalidate In-Reply-To: References: Message-ID: On Tue, 9 Apr 2024 13:44:22 GMT, Albert Mingkun Yang wrote: > Simple converting redundant if-check to assert. Thanks for review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18696#issuecomment-2051189211 From ayang at openjdk.org Fri Apr 12 07:40:48 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 12 Apr 2024 07:40:48 GMT Subject: Integrated: 8329962: Remove CardTable::invalidate In-Reply-To: References: Message-ID: On Tue, 9 Apr 2024 13:44:22 GMT, Albert Mingkun Yang wrote: > Simple converting redundant if-check to assert. This pull request has now been integrated. Changeset: 006a516a Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/006a516aa0e10d74ffafca2e2da2ae89faf47457 Stats: 13 lines in 3 files changed: 1 ins; 11 del; 1 mod 8329962: Remove CardTable::invalidate Reviewed-by: tschatzl, gli ------------- PR: https://git.openjdk.org/jdk/pull/18696 From stuefe at openjdk.org Fri Apr 12 07:42:42 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 12 Apr 2024 07:42:42 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: <2cBgaKxlWyMBVMct-_dRFueGfbEVsSNnrYfNAt8w82E=.f39d1048-574a-473d-8979-b15494364491@github.com> On Thu, 11 Apr 2024 16:27:51 GMT, Afshin Zafari wrote: > > Another idea: To alleviate the need to pass MEMFLAGS all the time, could we have something like a "active MEMFLAGS" state per Thread, and set that stack-based with a XXMark object? 
That way, one could say at the entrance of Metaspace, for instance, "whatever is allocated under the scope of this function, please mark with mtMetaspace". > > Not sure if I understood your idea, the question is if a thread always uses only ONE type of memory and not mix of them? For example, CDS uses both mtClass and mtClassShared. If a Thread has an active MEMFLAGS, it has to switch this flag between A and B whenever it uses type A or B. No, the idea was to do it stack-based with a chained mark object. But never mind, we can do this in a follow up PR. Maybe it has no merit. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18745#issuecomment-2051192960 From tobias.hartmann at oracle.com Fri Apr 12 07:47:37 2024 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Fri, 12 Apr 2024 09:47:37 +0200 Subject: CFV: New HotSpot Group Member: Andrew Dinn In-Reply-To: References: Message-ID: <7a8c20c7-7bd8-41fd-9c1b-343d829b698e@oracle.com> Vote: yes Best regards, Tobias On 11.04.24 15:24, Thomas Stuefe wrote: > Hi, > > I hereby nominate Andrew Dinn (adinn) to Membership in the HotSpot Group. > > Andrew is a well-known and respected member of the OpenJDK community. He has been a contributor > since the early days of OpenJDK. > > The history of his contributions has been mangled by various SCM moves and repo consolidations over > the years [1], but he was one of the original authors of the arm64 port ([2] shows 359 changes in > the mercurial hotspot sub repository alone), contributed JEP 352 (support for NVM devices under byte > buffers), and more recently has been active in the Graal and the Leyden projects. > > Votes are due by April 25, 2024. > > Only current Members of the HotSpot Group [3] are eligible to vote on this nomination. Votes must > be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [4].
> > Cheers, Thomas > > [1] https://github.com/openjdk/jdk/commits/master/?author=adinn > > [2] https://hg.openjdk.org/aarch64-port/jdk7u/hotspot > > [3] https://openjdk.org/census#members > [4] https://openjdk.org/groups/#member-vote > From tobias.hartmann at oracle.com Fri Apr 12 07:49:39 2024 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Fri, 12 Apr 2024 09:49:39 +0200 Subject: CFV: New HotSpot Group Member: Afshin Zafari In-Reply-To: <5088FFE6-F5E5-4B57-8FB9-B5F6672C7D7F@oracle.com> References: <5088FFE6-F5E5-4B57-8FB9-B5F6672C7D7F@oracle.com> Message-ID: <10500248-7157-44c6-9486-0b9aa77d2190@oracle.com> Vote: yes Best regards, Tobias On 10.04.24 14:24, Jesper Wilhelmsson wrote: > I hereby nominate Afshin Zafari (azafari) to Membership in the HotSpot Group. > > Afshin is a Committer in the JDK project, and a member of the Oracle JVM Runtime team. He has fixed 42 issues including several significant changes in various parts of the JVM runtime and has lately focused on NMT improvements. > > Votes are due by April 24, 2024. > > Only current Members of the HotSpot Group [1] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [2]. > > Thanks, > /Jesper > > [1] https://openjdk.org/census > [2] https://openjdk.org/groups/#member-vote From tobias.hartmann at oracle.com Fri Apr 12 07:49:47 2024 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Fri, 12 Apr 2024 09:49:47 +0200 Subject: CFV: New HotSpot Group Member: Fredrik Bredberg In-Reply-To: <0291F74B-D724-4B97-B9D0-5FC57FA0F302@oracle.com> References: <0291F74B-D724-4B97-B9D0-5FC57FA0F302@oracle.com> Message-ID: Vote: yes Best regards, Tobias On 10.04.24 14:24, Jesper Wilhelmsson wrote: > I hereby nominate Fredrik Bredberg (fbredberg) to Membership in the HotSpot Group. > > Fredrik is a Committer in the JDK project, and a member of the Oracle JVM Runtime team.
Fredrik has mainly focused his efforts in the Loom area and is frequently helping out with platform specific (including assembler) code for other areas as well. > > Votes are due by April 24, 2024. > > Only current Members of the HotSpot Group [1] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [2]. > > Thanks, > /Jesper > > [1] https://openjdk.org/census > [2] https://openjdk.org/groups/#member-vote From dlong at openjdk.org Fri Apr 12 07:50:47 2024 From: dlong at openjdk.org (Dean Long) Date: Fri, 12 Apr 2024 07:50:47 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v12] In-Reply-To: References: Message-ID: On Thu, 11 Apr 2024 19:51:11 GMT, Andrew Haley wrote: >> This PR is a redesign of subtype checking. >> >> The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. >> >> So what's changed, so that the old design should be replaced? >> >> Firstly, the computers of today aren't the computers of twenty years ago. It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. >> >> The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces). Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. >> >> Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. 
This plays badly with newer language features such as JEP 406, pattern matching for switch. >> >> However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions. >> >> The solution >> ------------ >> >> We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. >> >> We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bitmap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. If the bit is set, we have to do a little arithmetic and then consult the hash table. >> >> It works like th... > > Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: > > JDK-8180450: secondary_super_cache does not scale well src/hotspot/cpu/aarch64/aarch64.ad line 3666: > 3664: Label miss; > 3665: C2_MacroAssembler _masm(&cbuf); > 3666: __ mov(result_reg, 1); I couldn't figure out what this is doing. It looks like result_reg is always R5 and will always be non-zero after check_klass_subtype_slow_path.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18309#discussion_r1562166666 From stefank at openjdk.org Fri Apr 12 07:52:52 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 12 Apr 2024 07:52:52 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v4] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: On Thu, 11 Apr 2024 21:25:55 GMT, Afshin Zafari wrote: >> `MEMFLAGS flag` is used to hold/show the type of the memory regions in NMT. Each call of NMT API requires a search through the list of memory regions. >> The Hotspot code reserves/commits/uncommits memory regions and later calls explicitly NMT API with a specific memory type (e.g., `mtGC`, `mtJavaHeap`) for that region. Therefore, there are two search in the list of regions per reserve/commit/uncommit operations, one for the operation and another for setting the type of the region. >> When the memory type is passed in during reserve/commit/uncommit operations, NMT can use it and avoid the extra search for setting the memory type. >> >> Tests: tiers1-5 passed on linux-x64, macosx-aarch64 and windows-x64 for debug and non-debug builds. > > Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: > > review comments applied. A few more comments. src/hotspot/os/windows/os_windows.cpp line 3137: > 3135: // If reservation failed, return null > 3136: if (p_buf == nullptr) return nullptr; > 3137: MemTracker::record_virtual_memory_reserve((address)p_buf, size_of_reserve, CALLER_PC, mtInternal); I think that allocate_pages_individually should take a MEMFLAGS argument instead of using mtInternal here. src/hotspot/os/windows/os_windows.cpp line 3198: > 3196: // the release. 
> 3197: MemTracker::record_virtual_memory_reserve((address)p_buf, > 3198: bytes_to_release, CALLER_PC, mtNone); I don't think we should ever use `mtNone` in code outside of the NMT code. If you follow my suggestion above that allocate_pages_individually should take a MEMFLAG arg, then it could be used here. src/hotspot/os/windows/os_windows.cpp line 3218: > 3216: MemTracker::record_virtual_memory_reserve_and_commit((address)p_buf, bytes, CALLER_PC); > 3217: } else { > 3218: MemTracker::record_virtual_memory_reserve((address)p_buf, bytes, CALLER_PC, mtNone); Use the correct MEMFLAG here instead of mtNone. src/hotspot/os/windows/os_windows.cpp line 3771: > 3769: if (!is_committed) { > 3770: commit_memory_or_exit(addr, bytes, prot == MEM_PROT_RWX, > 3771: "cannot commit protection page", mtNone); This should probably be something else than mtNone. src/hotspot/share/jfr/recorder/storage/jfrVirtualMemory.cpp line 107: > 105: _rs = ReservedSpace(reservation_size_request_bytes, > 106: os::vm_allocation_granularity(), > 107: os::vm_page_size(), mtTracing); The mtTracing should probably be on a separate line, so that it follows the style of the surrounding code. src/hotspot/share/memory/virtualspace.cpp line 45: > 43: // Dummy constructor > 44: ReservedSpace::ReservedSpace() : _base(nullptr), _size(0), _noaccess_prefix(0), > 45: _alignment(0), _special(false), _fd_for_heap(-1), _nmt_flag(mtNone), _executable(false) { In almost all code we pass in the executable before the flag, but in ReservedSpace the flag is located before the executable. I think it would be nice to flip the order in this class. I understand that _executable is in the private section, while the other members are protected, but I don't think that it needs to be that way. The _executable could probably just be moved together with the rest of the members. OTOH, I think the entire class needs some cleanups. Let's leave this for a separate RFE. 
src/hotspot/share/memory/virtualspace.cpp line 615: > 613: > 614: ReservedHeapSpace::ReservedHeapSpace(size_t size, size_t alignment, size_t page_size, const char* heap_allocation_directory) : ReservedSpace() { > 615: set_nmt_flag(mtJavaHeap); It seems odd that we only initialize the _nmt_flag when `size == 0`. Could this be done after that check? If not, why not? There's also a call to record_virtual_memory_type further down in the code. Why is that needed? Why isn't it enough to pass in the correct type to the os::reserve_memory call in the initialize function? src/hotspot/share/memory/virtualspace.cpp line 672: > 670: size_t rs_align, > 671: size_t rs_page_size) : ReservedSpace() { > 672: set_nmt_flag(mtCode); Why isn't this a part of the initialize call? This looks like a bug to me. `initialize` will call clear_members, which will undo this setting. src/hotspot/share/memory/virtualspace.cpp line 708: > 706: assert(max_commit_granularity > 0, "Granularity must be non-zero."); > 707: > 708: _nmt_flag = rs.nmt_flag(); The code seems to be written with blank lines to separate various members that belong together. Please add a blank line after this line. src/hotspot/share/memory/virtualspace.hpp line 72: > 70: > 71: MEMFLAGS nmt_flag() { return _nmt_flag; } > 72: void set_nmt_flag(MEMFLAGS flag) { _nmt_flag = flag; } I have a feeling that set_nmt_flag should not exist and be replaced by updated initialize functions. src/hotspot/share/memory/virtualspace.hpp line 199: > 197: size_t _upper_alignment; > 198: > 199: MEMFLAGS _nmt_flag; The VirtualSpace::initialize functions used to initialize these members in the order that they are specified here. That is now messed up by adding the _nmt_flag at the end here, but in the beginning in the initialize function. I would propose that you move it to after _executable, both here and in the initialize function. 
src/hotspot/share/nmt/virtualMemoryTracker.hpp line 307: > 305: > 306: ReservedMemoryRegion(address base, size_t size, MEMFLAGS flag) : > 307: VirtualMemoryRegion(base, size), _stack(NativeCallStack::empty_stack()), _flag(flag) { } The function above uses mtNone. I find that a bit dubious, but I understand that it is done to be able to write code like this: ReservedMemoryRegion* rmr = VirtualMemoryTracker::_reserved_regions->find(ReservedMemoryRegion(addr, size)); Unfortunately, it opens up the door for people to accidentally use that version instead of this new version that you have written. Could we get rid of the version using mtNone somehow? The same question goes for the version above that, which has a "MEMFLAGS flag = mtNone". (GH doesn't allow me to comment on lines that you haven't changed) src/hotspot/share/runtime/os.hpp line 511: > 509: // and is added to be used for implementation of -XX:AllocateHeapAt > 510: static char* map_memory_to_file(size_t size, int fd, MEMFLAGS flag = mtNone); > 511: static char* map_memory_to_file_aligned(size_t size, size_t alignment, int fd, MEMFLAGS flag); There are still a few mtNone usages in this file. test/hotspot/gtest/gc/g1/test_freeRegionList.cpp line 53: > 51: size_t bot_size = G1BlockOffsetTable::compute_size(heap.word_size()); > 52: HeapWord* bot_data = NEW_C_HEAP_ARRAY(HeapWord, bot_size, mtGC); > 53: ReservedSpace bot_rs(G1BlockOffsetTable::compute_size(heap.word_size()), mtGC); mtGC => mtTest? test/hotspot/gtest/gc/z/test_zForwarding.cpp line 103: > 101: _reserved = reserved; > 102: > 103: os::commit_memory((char*)_reserved, ZGranuleSize, !ExecMem /* executable */, mtGC); mtGC => mtTest? test/hotspot/gtest/gc/z/test_zForwarding.cpp line 114: > 112: ZGeneration::_young = _old_young; > 113: if (_reserved != nullptr) { > 114: os::uncommit_memory((char*)_reserved, ZGranuleSize, !ExecMem, mtGC); mtGC => mtTest? 
test/hotspot/gtest/memory/test_virtualspace.cpp line 223: > 221: return ReservedSpace(reserve_size_aligned, > 222: os::vm_allocation_granularity(), > 223: os::vm_page_size(), mtTest); newline before mtTest. ------------- Changes requested by stefank (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18745#pullrequestreview-1996032947 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1562112282 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1562114435 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1562114730 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1562116176 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1562126580 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1562134024 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1562150698 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1562152813 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1562154150 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1562155538 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1562158292 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1562163590 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1562165152 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1562165753 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1562166089 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1562166154 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1562166709 From stefank at openjdk.org Fri Apr 12 07:52:53 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 12 Apr 2024 07:52:53 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v2] In-Reply-To: 
References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> <3id02uEHGujrMSIg0lKlstLRL0x2yTsn7lPWrmEwGBU=.747fa9c2-e782-462f-95ba-bc567944a502@github.com>
Message-ID:

On Thu, 11 Apr 2024 18:04:48 GMT, Afshin Zafari wrote:

>> src/hotspot/share/memory/virtualspace.cpp line 693:
>>
>>> 691: _special = false;
>>> 692: _executable = false;
>>> 693: _nmt_flag = mtNone;
>>
>> Weird indentation.
>
> Fixed.

Still looks weird to me.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1562153193

From matthias.baesken at sap.com Fri Apr 12 07:58:42 2024
From: matthias.baesken at sap.com (Baesken, Matthias)
Date: Fri, 12 Apr 2024 07:58:42 +0000
Subject: RFO: a tool to analyze HotSpot fatal error logs
In-Reply-To:
References:
Message-ID:

Hello Maxim, this sounds like an interesting project.

Does the tool work both with hserr files ("HotSpot fatal error logs") and also with the output of jcmd VM.info?
How well can it handle incomplete hserr files (we sometimes see those in case of bad crashes)?
Can the tool "mix in" / augment additional information into the views of the error log (like a bit of source code or links into the stack traces, for example)?

I thought about creating a similar tool myself in the past, but it did not happen so far.

> If there is sufficient interest in creating a public and/or open-source variant of this internal plugin, I will pitch the idea to my employer.
> It shouldn't be too much work to create a public version.

Sounds like a great idea!

Best regards, Matthias

From: hotspot-dev on behalf of Maxim Kartashev
Date: Thursday, 11 April 2024 at 16:06
To: discuss at openjdk.org, hotspot-dev at openjdk.org
Subject: RFO: a tool to analyze HotSpot fatal error logs

Hello,

I am writing to inquire about the potential interest of the people involved in inspecting HotSpot crashes in a tool aimed at facilitating that inspection. We at JetBrains have developed an internal plugin that helps both with filtering through dozens of reports quickly in order to find a pattern and with diving deep into a particular crash. In addition to the "standard" features such as syntax highlighting, folding, and structural navigation, it will

* highlight potential problems such as overloaded CPU, low physical memory, the presence of OOME in the recent exceptions, LD_LIBRARY_PATH being set, etc,
* generate an "executive summary" for a high-level overview, for example, by front-line support,
* pop up a tooltip for any recognized address describing its origin (for example, if it belongs to some thread's stack, the Java heap, a register, or a memory-mapped region),
* provide the ability to highlight all addresses "near" the selected address, including registers, threads, and memory-mapped regions.

If there is sufficient interest in creating a public and/or open-source variant of this internal plugin, I will pitch the idea to my employer. It shouldn't be too much work to create a public version.

Kind regards,
Maxim.

References:
* https://docs.oracle.com/javase/10/troubleshoot/fatal-error-log.htm

From duke at openjdk.org Fri Apr 12 08:00:20 2024
From: duke at openjdk.org (kuaiwei)
Date: Fri, 12 Apr 2024 08:00:20 GMT
Subject: RFR: 8325821: [REDO] use "dmb.ishst+dmb.ishld" for release barrier [v7]
In-Reply-To:
References:
Message-ID:

> The original patch for https://bugs.openjdk.org/browse/JDK-8324186 has 2 issues:
> 1 It shows a regression on some platforms, like Apple silicon on macOS
> 2 It cannot handle instruction sequences like "dmb.ishld; dmb.ishst; dmb.ishld; dmb.ishld"
>
> It can be fixed by:
> 1 Enabling AlwaysMergeDMB by default and disabling it only on architectures where we can see a performance improvement (N1 or N2)
> 2 Checking for the special pattern and merging the subsequent dmb.
>
> It also fixes a bug where st/ld/dmb cannot be merged while the code buffer is expanding. I added unit tests for these.
>
> This patch still has an unhandled case. For instructions like "dmb.ishld; dmb.ishst; dmb.ish", it will merge the last 2 instructions and cannot merge all three, because when emitting dmb.ish, merging all previous dmbs would shrink the code buffer. I think that may break some assumptions, and I think it's not a common pattern.
>
> - Update:
> After discussion, I made a new implementation based on a finite state machine for merging instructions. A mergeable instruction stays pending in the FSM until the next unmergeable instruction.
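
The deferred-emission idea in the update above can be sketched roughly as follows. This is a toy model, not the code from the PR: `ToyAssembler`, `Barrier` and the emitted strings are hypothetical. It also assumes that a run mixing `ishld` and `ishst` is emitted as one full `dmb ish`, which is at least as strong as either; the actual patch may choose differently (for example depending on AlwaysMergeDMB).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only; hypothetical names, not the HotSpot assembler.
enum Barrier : unsigned { NONE = 0, ISHLD = 1, ISHST = 2, ISH = 3 };

class ToyAssembler {
  std::vector<std::string> _code;
  unsigned _pending = NONE;  // FSM state: barrier kinds seen but not yet emitted

  // Emit at most one barrier for everything that is pending.
  void flush() {
    switch (_pending) {
      case NONE:  return;
      case ISHLD: _code.push_back("dmb ishld"); break;
      case ISHST: _code.push_back("dmb ishst"); break;
      default:    _code.push_back("dmb ish");   break;  // assumption: mixed kinds become a full barrier
    }
    _pending = NONE;
  }

public:
  // Mergeable instruction: only update the state, emit nothing yet.
  void dmb(Barrier kind) { _pending |= kind; }

  // Unmergeable instruction: flush the pending barrier first, then emit.
  void emit(const std::string& insn) { flush(); _code.push_back(insn); }

  const std::vector<std::string>& finish() { flush(); return _code; }
};

// Usage: the four-barrier sequence from the description collapses to one.
inline void demo_dmb_merge() {
  ToyAssembler masm;
  masm.dmb(ISHLD); masm.dmb(ISHST); masm.dmb(ISHLD); masm.dmb(ISHLD);
  masm.emit("ldr x0, [x1]");
  const std::vector<std::string>& code = masm.finish();
  assert(code.size() == 2);
  assert(code[0] == "dmb ish");
}
```

Because nothing is written until the run of barriers ends, the toy never has to go back and shrink already-emitted code, which is the concern mentioned for the "dmb.ishld; dmb.ishst; dmb.ish" case.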
kuaiwei has updated the pull request incrementally with one additional commit since the last revision: Fix arm build error ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18467/files - new: https://git.openjdk.org/jdk/pull/18467/files/e2d3e1e4..4bd183fb Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18467&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18467&range=05-06 Stats: 5 lines in 1 file changed: 1 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/18467.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18467/head:pull/18467 PR: https://git.openjdk.org/jdk/pull/18467 From stuefe at openjdk.org Fri Apr 12 08:03:46 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 12 Apr 2024 08:03:46 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v4] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: <4eN_yJUIi_0MTBROX0yxeIZIYo4W3KNlBGGOSA3glI4=.8e6ec837-1cb3-414f-959c-86fb3e3c9907@github.com> On Thu, 11 Apr 2024 21:25:55 GMT, Afshin Zafari wrote: >> `MEMFLAGS flag` is used to hold/show the type of the memory regions in NMT. Each call of NMT API requires a search through the list of memory regions. >> The Hotspot code reserves/commits/uncommits memory regions and later calls explicitly NMT API with a specific memory type (e.g., `mtGC`, `mtJavaHeap`) for that region. Therefore, there are two search in the list of regions per reserve/commit/uncommit operations, one for the operation and another for setting the type of the region. >> When the memory type is passed in during reserve/commit/uncommit operations, NMT can use it and avoid the extra search for setting the memory type. >> >> Tests: tiers1-5 passed on linux-x64, macosx-aarch64 and windows-x64 for debug and non-debug builds. 
>
> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision:
>
> review comments applied.

Mostly good. Small nits inside.

src/hotspot/share/memory/metaspace/testHelpers.cpp line 81:

> 79: if (reserve_limit > 0) {
> 80: // have reserve limit -> non-expandable context
> 81: _rs = ReservedSpace(reserve_limit * BytesPerWord, Metaspace::reserve_alignment(), os::vm_page_size(), mtTest);

mtMetaspace

src/hotspot/share/memory/metaspace/virtualSpaceNode.cpp line 112:

> 110:
> 111: // Commit...
> 112: if (os::commit_memory((char*)p, word_size * BytesPerWord, !ExecMem, _rs.nmt_flag()) == false) {

Just use mtMetaspace here, it's easier.

src/hotspot/share/memory/metaspace/virtualSpaceNode.cpp line 191:

> 189:
> 190: // Uncommit...
> 191: if (os::uncommit_memory((char*)p, word_size * BytesPerWord, !ExecMem, _rs.nmt_flag()) == false) {

mtMetaspace

src/hotspot/share/runtime/os.hpp line 521:

> 519: bool allow_exec = false, MEMFLAGS flags = mtNone);
> 520: static bool unmap_memory(char *addr, size_t bytes);
> 521: static void free_memory(char *addr, size_t bytes, size_t alignment_hint, MEMFLAGS flag);

While looking at this, I noticed a couple of odd things about this function. I think it should be revised and I opened https://bugs.openjdk.org/browse/JDK-8330144. The result of that revision will be that we don't need MEMFLAGS, nor would we need the alignment hint. But leave the MEMFLAGS in for now. If I happen to push that change first, you can adapt the change; if you push first, I'll manage.
------------- PR Review: https://git.openjdk.org/jdk/pull/18745#pullrequestreview-1996103064 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1562159668 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1562159155 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1562159372 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1562178816 From stefank at openjdk.org Fri Apr 12 08:18:43 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 12 Apr 2024 08:18:43 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v2] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> <3id02uEHGujrMSIg0lKlstLRL0x2yTsn7lPWrmEwGBU=.747fa9c2-e782-462f-95ba-bc567944a502@github.com> Message-ID: On Thu, 11 Apr 2024 18:05:57 GMT, Afshin Zafari wrote: >> src/hotspot/share/memory/virtualspace.hpp line 45: >> >>> 43: bool _special; >>> 44: int _fd_for_heap; >>> 45: MEMFLAGS _nmt_flag; >> >> Indentation is now off. > > Fixed. It still looks wrong. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1562197284 From syan at openjdk.org Fri Apr 12 08:28:54 2024 From: syan at openjdk.org (SendaoYan) Date: Fri, 12 Apr 2024 08:28:54 GMT Subject: RFR: 8327946: containers/docker/TestJFREvents.java fails when host kernel config vm.swappiness=0 after JDK-8325139 [v3] In-Reply-To: <5Q0X-rxAg9WKCnK-Qluu5hvyffsGwVgGJGRoA8XlBGs=.923c1bf8-e008-4af9-9929-6e5c1f2d5271@github.com> References: <5Q0X-rxAg9WKCnK-Qluu5hvyffsGwVgGJGRoA8XlBGs=.923c1bf8-e008-4af9-9929-6e5c1f2d5271@github.com> Message-ID: > Hi, > > According to the [docker document](https://docs.docker.com/config/containers/resource_constraints/#--memory-swappiness-details), the default value of --memory-swappiness is inherited from the host machine. 
So, when the kernel config vm.swappiness=0 is set on the host machine, this test case will fail, because the docker container cannot use swap memory: the default value of --memory-swappiness is 0.
>
> When the host kernel config is "vm.swappiness = 0", there are three ways to make this test case pass:
>
> 1. Change `.shouldContain("totalSize = " + expectedTotalValue)` to `.shouldContain("totalSize = ")`, which ignores the `expectedTotalValue`, because the `expectedTotalValue` could be 0 (swap memory is disabled when --memory-swappiness=0) or could be 104857600 (300MB-200MB=100MB), depending on the host machine config `vm.swappiness`.
> 2. Change the default `--memory-swappiness` from 0 to a non-zero value, such as 60.
> 3. Change the host kernel config from `vm.swappiness=0` to `vm.swappiness=60`. I think that's not a good idea.
>
> Maybe the 2nd method seems more reasonable.
>
>
> Thanks,
> -sendao

SendaoYan has updated the pull request incrementally with one additional commit since the last revision:

If isCgroupV1() return true, add docker run opts --memory-swappiness=60

-------------

Changes:
- all: https://git.openjdk.org/jdk/pull/18225/files
- new: https://git.openjdk.org/jdk/pull/18225/files/4a9f3881..ecb71597

Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=18225&range=02
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=18225&range=01-02

Stats: 34 lines in 1 file changed: 25 ins; 0 del; 9 mod
Patch: https://git.openjdk.org/jdk/pull/18225.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/18225/head:pull/18225

PR: https://git.openjdk.org/jdk/pull/18225

From shade at openjdk.org Fri Apr 12 08:37:16 2024
From: shade at openjdk.org (Aleksey Shipilev)
Date: Fri, 12 Apr 2024 08:37:16 GMT
Subject: RFR: 8328934: Assert that ABS input and output are legal
Message-ID:

This should protect us from future accidents around `abs` misuse. We have fixed a few separately. I plan to use this as the litmus test in update releases to detect missing backports for actual fixes.
I am running more tests to see if we have any other sightings in current codebase, but this can be reviewed for sanity meanwhile. Additional testing: - [x] MacOS AArch64 server fastdebug build passes - [ ] Linux x86_64 server fastdebug, `all` - [ ] Linux AArch64 server fastdebug, `all` ------------- Commit messages: - Fix Changes: https://git.openjdk.org/jdk/pull/18751/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18751&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8328934 Stats: 6 lines in 1 file changed: 5 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18751.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18751/head:pull/18751 PR: https://git.openjdk.org/jdk/pull/18751 From shade at openjdk.org Fri Apr 12 08:40:54 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 12 Apr 2024 08:40:54 GMT Subject: RFR: 8328934: Assert that ABS input and output are legal [v2] In-Reply-To: References: Message-ID: > This should protect us from future accidents around `abs` misuse. We have fixed a few separately. I plan to use this as the litmus test in update releases to detect missing backports for actual fixes. I am running more tests to see if we have any other sightings in current codebase, but this can be reviewed for sanity meanwhile. 
> > Additional testing: > - [x] MacOS AArch64 server fastdebug build passes > - [ ] Linux x86_64 server fastdebug, `all` > - [ ] Linux AArch64 server fastdebug, `all` Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: Need explicit include as well ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18751/files - new: https://git.openjdk.org/jdk/pull/18751/files/ff074a6a..59f3aaab Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18751&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18751&range=00-01 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18751.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18751/head:pull/18751 PR: https://git.openjdk.org/jdk/pull/18751 From syan at openjdk.org Fri Apr 12 08:42:45 2024 From: syan at openjdk.org (SendaoYan) Date: Fri, 12 Apr 2024 08:42:45 GMT Subject: RFR: 8327946: containers/docker/TestJFREvents.java fails when host kernel config vm.swappiness=0 after JDK-8325139 [v3] In-Reply-To: <3lEEmi-4SOpmVy_SqInD8q1BReMYdmhBqszkskNnqbk=.602870f7-d94b-4b33-9b8a-35002bfef4f3@github.com> References: <5Q0X-rxAg9WKCnK-Qluu5hvyffsGwVgGJGRoA8XlBGs=.923c1bf8-e008-4af9-9929-6e5c1f2d5271@github.com> <3lEEmi-4SOpmVy_SqInD8q1BReMYdmhBqszkskNnqbk=.602870f7-d94b-4b33-9b8a-35002bfef4f3@github.com> Message-ID: On Thu, 11 Apr 2024 15:49:42 GMT, Severin Gehwolf wrote: >> Thanks for your review. The space after `//` has been added. >> >> I can't reproduce the "OCI runtime error" failure on mine ubuntu22 environment. >> It seems that ubuntu22 use cgroups v2 by default. >> ![image](https://github.com/openjdk/jdk/assets/24123821/d41934fe-afb4-45a7-abd4-df4070123bb2) >> >> Can you show your host machine enviroment information, so I can reproduce the same failure. After that I will try to find a solution with cgroupv2. > > It seems to be podman runtime specific. `crun` fails, `runc` doesn't seem to be. 
Either way, the corresponding interface file, `memory.swappiness`, doesn't exist for cgroup v2. Try `podman run --runtime /usr/bin/crun --rm -ti --memory-swappiness=60 fedora:39` provided the `crun` runtime is installed in `/usr/bin`.

@jerboaa Thanks for the review. The `--memory-swappiness=60` option is only added when the cgroup version is v1.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18225#discussion_r1562232109

From maxim.kartashev at jetbrains.com Fri Apr 12 08:47:47 2024
From: maxim.kartashev at jetbrains.com (Maxim Kartashev)
Date: Fri, 12 Apr 2024 12:47:47 +0400
Subject: RFO: a tool to analyze HotSpot fatal error logs
In-Reply-To:
References:
Message-ID:

> Does the tool work both with hserr files ("HotSpot fatal error logs") and also the output of jcmd VM.info?

Yes, since the latter is more or less a short version of the former.

> How well can it handle incomplete hserr files (we sometimes see those in case of bad crashes)?

As well as can be expected; some of the crashes are naturally redacted because of induced crashes and we're taking whatever information is there. We're also processing crashes from many JVM versions, each of which introduces its own variance to the content and format of the log. There are many heuristics in the parsing, so it can yield incorrect results (rarely), but it also makes parsing quite stable in the sense that practically any log a human can read the tool can also read.

> Can the tool "mix in" / augment additional information into the views of the error log (like a bit of source code or links into the stack traces for example)?

You can go to the declaration of the PasswordAuthentication class when looking at things like

Event: 112.303 loading class java/net/PasswordAuthentication

I haven't thought of opening source files via links from the log itself since I rarely see them there (mostly assertion failures), but it's easy enough to implement.

> I thought about creating a similar tool myself in the past, but did not happen so far

I think this is the story for many on the list. I've seen enough people on YouTube turning to ad hoc scripts to make sense of addresses in the logs.

From aph at openjdk.org Fri Apr 12 09:00:47 2024
From: aph at openjdk.org (Andrew Haley)
Date: Fri, 12 Apr 2024 09:00:47 GMT
Subject: RFR: 8180450: secondary_super_cache does not scale well [v12]
In-Reply-To:
References:
Message-ID:

On Fri, 12 Apr 2024 07:48:30 GMT, Dean Long wrote:

>> Andrew Haley has updated the pull request incrementally with one additional commit since the last revision:
>>
>> JDK-8180450: secondary_super_cache does not scale well
>
> src/hotspot/cpu/aarch64/aarch64.ad line 3666:
>
>> 3664: Label miss;
>> 3665: C2_MacroAssembler _masm(&cbuf);
>> 3666: __ mov(result_reg, 1);
>
> I couldn't figure out what this is doing. It looks like result_reg is always R5 and will always be non-zero after check_klass_subtype_slow_path.

Ah yes, that's true. This is a change I made during some experiments to reorganize check_klass_subtype_slow_path. My experiments were failing tests in apparently random ways, and eventually I discovered that there was an almost-undocumented assumption that R5 would always be set to something nonzero. My change did a check on the bitmap, then branched to the failure label. Doing that didn't work if the contents of R5 were previously zero.

The code is misleading: result_reg (R5) is passed to check_klass_subtype_slow_path as temp2_reg, with no indication that the caller _requires_ that `temp2_reg` must be set. So `temp2_reg` isn't just a temp, it's also an output. I should add a couple of comments and remove this mov instruction.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18309#discussion_r1562257598 From sgehwolf at openjdk.org Fri Apr 12 09:09:43 2024 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Fri, 12 Apr 2024 09:09:43 GMT Subject: RFR: 8327946: containers/docker/TestJFREvents.java fails when host kernel config vm.swappiness=0 after JDK-8325139 [v3] In-Reply-To: References: <5Q0X-rxAg9WKCnK-Qluu5hvyffsGwVgGJGRoA8XlBGs=.923c1bf8-e008-4af9-9929-6e5c1f2d5271@github.com> Message-ID: On Fri, 12 Apr 2024 08:28:54 GMT, SendaoYan wrote: >> Hi, >> >> According to the [docker document](https://docs.docker.com/config/containers/resource_constraints/#--memory-swappiness-details), the default value of --memory-swappiness is inherited from the host machine. So, when the the kernel config vm.swappiness=0 on the host machine, this testcase will fail, because of docker container can not use swap memory, the deafult value of --memory-swappiness is 0. >> >> When the host kernel config "vm.swappiness = 0", In order to run this testcase passed , there are three methods: >> >> 1. change `.shouldContain("totalSize = " + expectedTotalValue)` to `.shouldContain("totalSize = "`, which ignored the `expectedTotalValue`, because the `expectedTotalValue` could be 0(swap memroy is disable when --memory-swappiness=0) or could be 104857600(300MB-200MB=100MB), it depends on the host machine config `vm.swappiness` >> 2. Change the default `--memory-swappiness` 0 to non-zero, such as 60. >> 3. Change the host kernel config `vm.swappiness=0` to `vm.swappiness=60`. I think it's not a good idea. >> >> Maybe the 2rd method seems more resonable. >> >> >> Thanks, >> -sendao > > SendaoYan has updated the pull request incrementally with one additional commit since the last revision: > > If isCgroupV1() return true, add docker run opts --memory-swappiness=60 I don't think we need `Whitebox`. A simpler check would be to use the `Metrics` class. 
See for example `test/hotspot/jtreg/containers/docker/TestMemoryWithCgroupV1.java`. ------------- PR Review: https://git.openjdk.org/jdk/pull/18225#pullrequestreview-1996282528 From aph at openjdk.org Fri Apr 12 09:11:47 2024 From: aph at openjdk.org (Andrew Haley) Date: Fri, 12 Apr 2024 09:11:47 GMT Subject: RFR: 8328934: Assert that ABS input and output are legal [v2] In-Reply-To: References: Message-ID: On Fri, 12 Apr 2024 08:40:54 GMT, Aleksey Shipilev wrote: >> This should protect us from future accidents around `abs` misuse. We have fixed a few separately. I plan to use this as the litmus test in update releases to detect missing backports for actual fixes. I am running more tests to see if we have any other sightings in current codebase, but this can be reviewed for sanity meanwhile. >> >> Additional testing: >> - [x] MacOS AArch64 server fastdebug build passes >> - [ ] Linux x86_64 server fastdebug, `all` >> - [ ] Linux x86_64 server fastdebug, 100K Fuzzer tests >> - [ ] Linux x86_64 server fastdebug, Maven CTW >> - [ ] Linux AArch64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Need explicit include as well src/hotspot/share/utilities/globalDefinitions.hpp line 1110: > 1108: > 1109: template inline T ABS(T x) { > 1110: assert(x != std::numeric_limits::min(), "ABS: argument should not allow overflow"); Shouldn't this check for an integral type? This code makes no sense for floating-point types. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18751#discussion_r1562275062 From shade at openjdk.org Fri Apr 12 09:21:54 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 12 Apr 2024 09:21:54 GMT Subject: RFR: 8328934: Assert that ABS input and output are legal [v3] In-Reply-To: References: Message-ID: > This should protect us from future accidents around `abs` misuse. We have fixed a few separately. 
I plan to use this as the litmus test in update releases to detect missing backports for actual fixes. I am running more tests to see if we have any other sightings in current codebase, but this can be reviewed for sanity meanwhile. > > Additional testing: > - [x] MacOS AArch64 server fastdebug build passes > - [ ] Linux x86_64 server fastdebug, `all` > - [ ] Linux x86_64 server fastdebug, 100K Fuzzer tests > - [ ] Linux x86_64 server fastdebug, Maven CTW > - [ ] Linux AArch64 server fastdebug, `all` Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: Only assert integral type arguments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18751/files - new: https://git.openjdk.org/jdk/pull/18751/files/59f3aaab..730b18af Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18751&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18751&range=01-02 Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18751.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18751/head:pull/18751 PR: https://git.openjdk.org/jdk/pull/18751 From shade at openjdk.org Fri Apr 12 09:21:55 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 12 Apr 2024 09:21:55 GMT Subject: RFR: 8328934: Assert that ABS input and output are legal [v2] In-Reply-To: References: Message-ID: On Fri, 12 Apr 2024 09:09:12 GMT, Andrew Haley wrote: >> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: >> >> Need explicit include as well > > src/hotspot/share/utilities/globalDefinitions.hpp line 1110: > >> 1108: >> 1109: template inline T ABS(T x) { >> 1110: assert(x != std::numeric_limits::min(), "ABS: argument should not allow overflow"); > > Shouldn't this check for an integral type? This code makes no sense for floating-point types. Yeah, true. Amended in new commit. 
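
For illustration, a minimal sketch of an integral-only guard of the kind discussed above. It is not the amended code from the PR (that is not shown in this message): `checked_abs` is a hypothetical name, and restricting the check to signed integral types is an assumption of this sketch. The reason the unrestricted check makes no sense for floating point is that `std::numeric_limits<T>::min()` is the smallest positive normalized value for floating-point types, not the most negative one.

```cpp
#include <cassert>
#include <limits>
#include <type_traits>

// Hypothetical helper for illustration; NOT the ABS() from globalDefinitions.hpp.
template <typename T>
inline T checked_abs(T x) {
  // Assumption of this sketch: only signed integral types are guarded.
  // Their min() has no positive counterpart, so negating it overflows.
  // Floating-point min() is a small positive value and unsigned min() is 0,
  // so neither of those is a problematic input.
  const bool guarded = std::is_integral<T>::value && std::is_signed<T>::value;
  assert((!guarded || x != std::numeric_limits<T>::min()) &&
         "checked_abs: argument would overflow");
  return (x > 0) ? x : -x;
}
```

With this shape, `checked_abs(-2.5)` and `checked_abs(std::numeric_limits<double>::min())` pass the guard, while `checked_abs(std::numeric_limits<int>::min())` trips the assert in a debug build.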
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18751#discussion_r1562286178 From aph at openjdk.org Fri Apr 12 09:44:41 2024 From: aph at openjdk.org (Andrew Haley) Date: Fri, 12 Apr 2024 09:44:41 GMT Subject: RFR: 8328934: Assert that ABS input and output are legal [v3] In-Reply-To: References: Message-ID: On Fri, 12 Apr 2024 09:21:54 GMT, Aleksey Shipilev wrote: >> This should protect us from future accidents around `abs` misuse. We have fixed a few separately. I plan to use this as the litmus test in update releases to detect missing backports for actual fixes. I am running more tests to see if we have any other sightings in current codebase, but this can be reviewed for sanity meanwhile. >> >> Additional testing: >> - [x] MacOS AArch64 server fastdebug build passes >> - [ ] Linux x86_64 server fastdebug, `all` >> - [ ] Linux x86_64 server fastdebug, 100K Fuzzer tests >> - [ ] Linux x86_64 server fastdebug, Maven CTW >> - [ ] Linux AArch64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Only assert integral type arguments Looks good. ------------- Marked as reviewed by aph (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18751#pullrequestreview-1996350017 From syan at openjdk.org Fri Apr 12 09:47:55 2024 From: syan at openjdk.org (SendaoYan) Date: Fri, 12 Apr 2024 09:47:55 GMT Subject: RFR: 8327946: containers/docker/TestJFREvents.java fails when host kernel config vm.swappiness=0 after JDK-8325139 [v4] In-Reply-To: <5Q0X-rxAg9WKCnK-Qluu5hvyffsGwVgGJGRoA8XlBGs=.923c1bf8-e008-4af9-9929-6e5c1f2d5271@github.com> References: <5Q0X-rxAg9WKCnK-Qluu5hvyffsGwVgGJGRoA8XlBGs=.923c1bf8-e008-4af9-9929-6e5c1f2d5271@github.com> Message-ID: > Hi, > > According to the [docker document](https://docs.docker.com/config/containers/resource_constraints/#--memory-swappiness-details), the default value of --memory-swappiness is inherited from the host machine. 
So, when the kernel config vm.swappiness=0 is set on the host machine, this test case will fail, because the docker container cannot use swap memory: the default value of --memory-swappiness is 0.
>
> When the host kernel config is "vm.swappiness = 0", there are three ways to make this test case pass:
>
> 1. Change `.shouldContain("totalSize = " + expectedTotalValue)` to `.shouldContain("totalSize = ")`, which ignores the `expectedTotalValue`, because the `expectedTotalValue` could be 0 (swap memory is disabled when --memory-swappiness=0) or could be 104857600 (300MB-200MB=100MB), depending on the host machine config `vm.swappiness`.
> 2. Change the default `--memory-swappiness` from 0 to a non-zero value, such as 60.
> 3. Change the host kernel config from `vm.swappiness=0` to `vm.swappiness=60`. I think that's not a good idea.
>
> Maybe the 2nd method seems more reasonable.
>
>
> Thanks,
> -sendao

SendaoYan has updated the pull request incrementally with one additional commit since the last revision:

use Metrics.class get the CgroupV1 information

-------------

Changes:
- all: https://git.openjdk.org/jdk/pull/18225/files
- new: https://git.openjdk.org/jdk/pull/18225/files/ecb71597..bd144b73

Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=18225&range=03
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=18225&range=02-03

Stats: 28 lines in 1 file changed: 11 ins; 15 del; 2 mod
Patch: https://git.openjdk.org/jdk/pull/18225.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/18225/head:pull/18225

PR: https://git.openjdk.org/jdk/pull/18225

From syan at openjdk.org Fri Apr 12 09:52:44 2024
From: syan at openjdk.org (SendaoYan)
Date: Fri, 12 Apr 2024 09:52:44 GMT
Subject: RFR: 8327946: containers/docker/TestJFREvents.java fails when host kernel config vm.swappiness=0 after JDK-8325139 [v3]
In-Reply-To:
References: <5Q0X-rxAg9WKCnK-Qluu5hvyffsGwVgGJGRoA8XlBGs=.923c1bf8-e008-4af9-9929-6e5c1f2d5271@github.com>
Message-ID:

On Fri, 12 Apr 2024 09:06:46 GMT, Severin Gehwolf wrote:

> I don't think we need `Whitebox`. A simpler check would be to use the `Metrics` class. See for example `test/hotspot/jtreg/containers/docker/TestMemoryWithCgroupV1.java`.

Thanks for your advice. `Whitebox` has been removed, and the `Metrics` class is now used to get the cgroup version information.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18225#issuecomment-2051434029

From shade at openjdk.org Fri Apr 12 10:05:55 2024
From: shade at openjdk.org (Aleksey Shipilev)
Date: Fri, 12 Apr 2024 10:05:55 GMT
Subject: RFR: 8328934: Assert that ABS input and output are legal [v4]
In-Reply-To:
References:
Message-ID:

> This should protect us from future accidents around `abs` misuse. We have fixed a few separately. I plan to use this as the litmus test in update releases to detect missing backports for actual fixes. I am running more tests to see if we have any other sightings in current codebase, but this can be reviewed for sanity meanwhile.
>
> Additional testing:
> - [x] MacOS AArch64 server fastdebug build passes
> - [ ] Linux x86_64 server fastdebug, `all`
> - [ ] Linux x86_64 server fastdebug, 100K Fuzzer tests
> - [ ] Linux x86_64 server fastdebug, Maven CTW
> - [ ] Linux AArch64 server fastdebug, `all`

Aleksey Shipilev has updated the pull request incrementally with two additional commits since the last revision:

- More straightforward
- Richer error reporting

-------------

Changes:
- all: https://git.openjdk.org/jdk/pull/18751/files
- new: https://git.openjdk.org/jdk/pull/18751/files/730b18af..f3d75b39

Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=18751&range=03
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=18751&range=02-03

Stats: 13 lines in 1 file changed: 9 ins; 0 del; 4 mod
Patch: https://git.openjdk.org/jdk/pull/18751.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/18751/head:pull/18751

PR: https://git.openjdk.org/jdk/pull/18751

From shade at openjdk.org Fri Apr 12 10:05:56 2024
From: shade at openjdk.org (Aleksey Shipilev)
Date: Fri, 12 Apr 2024 10:05:56 GMT Subject: RFR: 8328934: Assert that ABS input and output are legal [v3] In-Reply-To: References: Message-ID: On Fri, 12 Apr 2024 09:21:54 GMT, Aleksey Shipilev wrote: >> This should protect us from future accidents around `abs` misuse. We have fixed a few separately. I plan to use this as the litmus test in update releases to detect missing backports for actual fixes. I am running more tests to see if we have any other sightings in current codebase, but this can be reviewed for sanity meanwhile. >> >> Additional testing: >> - [x] MacOS AArch64 server fastdebug build passes >> - [ ] Linux x86_64 server fastdebug, `all` >> - [ ] Linux x86_64 server fastdebug, 100K Fuzzer tests >> - [ ] Linux x86_64 server fastdebug, Maven CTW >> - [ ] Linux AArch64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Only assert integral type arguments Caught some failures, which made me think we want richer diagnostics around this. With new version, we print stuff like: # Internal Error (/Users/shipilev/Work/shipilev-jdk/src/hotspot/share/opto/loopnode.cpp:2965), pid=32195, tid=27139 # Error: ABS: argument should not allow overflow ------------- PR Comment: https://git.openjdk.org/jdk/pull/18751#issuecomment-2051449335 From aph at openjdk.org Fri Apr 12 10:05:56 2024 From: aph at openjdk.org (Andrew Haley) Date: Fri, 12 Apr 2024 10:05:56 GMT Subject: RFR: 8328934: Assert that ABS input and output are legal [v3] In-Reply-To: References: Message-ID: On Fri, 12 Apr 2024 09:21:54 GMT, Aleksey Shipilev wrote: >> This should protect us from future accidents around `abs` misuse. We have fixed a few separately. I plan to use this as the litmus test in update releases to detect missing backports for actual fixes. I am running more tests to see if we have any other sightings in current codebase, but this can be reviewed for sanity meanwhile. 
>> >> Additional testing: >> - [x] MacOS AArch64 server fastdebug build passes >> - [ ] Linux x86_64 server fastdebug, `all` >> - [ ] Linux x86_64 server fastdebug, 100K Fuzzer tests >> - [ ] Linux x86_64 server fastdebug, Maven CTW >> - [ ] Linux AArch64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Only assert integral type arguments src/hotspot/share/utilities/globalDefinitions.hpp line 1112: > 1110: assert(!std::is_integral::value || x != std::numeric_limits::min(), > 1111: "ABS: argument should not allow overflow"); > 1112: T res = (x > 0) ? x : -x; Beware! If x is MIN_INT, GCC might delete the assertion because the negation below is UB. I'd only do the negation if it won't overflow. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18751#discussion_r1562334907 From shade at openjdk.org Fri Apr 12 10:05:56 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 12 Apr 2024 10:05:56 GMT Subject: RFR: 8328934: Assert that ABS input and output are legal [v3] In-Reply-To: References: Message-ID: <8-hCiROYvQrGzO4liJ083d3M_6RSHLvjr4oTbml66ME=.30c3eb9a-e309-4d7a-927c-aea5c180c0ea@github.com> On Fri, 12 Apr 2024 09:59:09 GMT, Andrew Haley wrote: >> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: >> >> Only assert integral type arguments > > src/hotspot/share/utilities/globalDefinitions.hpp line 1112: > >> 1110: assert(!std::is_integral::value || x != std::numeric_limits::min(), >> 1111: "ABS: argument should not allow overflow"); >> 1112: T res = (x > 0) ? x : -x; > > Beware! If x is MIN_INT, GCC might delete the assertion because the negation below is UB. I'd only do the negation if it won't overflow. This is why we can't have nice things. So the fix would be to check for overflow first, and the do the abs, like this? 

template<typename T> inline T asserted_abs(T x, const char* file, int line) {
  if (std::is_integral<T>::value && x == std::numeric_limits<T>::min()) {
#ifdef ASSERT
    report_vm_error(file, line, "ABS: argument should not allow overflow");
#endif
    // Do not allow UB, return the overflowed value.
    return x;
  } else {
    T res = (x > 0) ? x : -x;
#ifdef ASSERT
    if (res < 0) {
      report_vm_error(file, line, "ABS: result should be non-negative");
    }
#endif
    return res;
  }
}

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18751#discussion_r1562337745

From aph at openjdk.org Fri Apr 12 10:05:56 2024
From: aph at openjdk.org (Andrew Haley)
Date: Fri, 12 Apr 2024 10:05:56 GMT
Subject: RFR: 8328934: Assert that ABS input and output are legal [v3]
In-Reply-To: <8-hCiROYvQrGzO4liJ083d3M_6RSHLvjr4oTbml66ME=.30c3eb9a-e309-4d7a-927c-aea5c180c0ea@github.com>
References: <8-hCiROYvQrGzO4liJ083d3M_6RSHLvjr4oTbml66ME=.30c3eb9a-e309-4d7a-927c-aea5c180c0ea@github.com>
Message-ID:

On Fri, 12 Apr 2024 10:01:46 GMT, Aleksey Shipilev wrote:

>> src/hotspot/share/utilities/globalDefinitions.hpp line 1112:
>>
>>> 1110: assert(!std::is_integral<T>::value || x != std::numeric_limits<T>::min(),
>>> 1111: "ABS: argument should not allow overflow");
>>> 1112: T res = (x > 0) ? x : -x;
>>
>> Beware! If x is MIN_INT, GCC might delete the assertion because the negation below is UB. I'd only do the negation if it won't overflow.
>
> This is why we can't have nice things. So the fix would be to check for overflow first, and the do the abs, like this?
>
>
> template<typename T> inline T asserted_abs(T x, const char* file, int line) {
>   if (std::is_integral<T>::value && x == std::numeric_limits<T>::min()) {
> #ifdef ASSERT
>     report_vm_error(file, line, "ABS: argument should not allow overflow");
> #endif
>     // Do not allow UB, return the overflowed value.
>     return x;
>   } else {
>     T res = (x > 0) ? x : -x;
> #ifdef ASSERT
>     if (res < 0) {
>       report_vm_error(file, line, "ABS: result should be non-negative");
>     }
> #endif
>     return res;
>   }
> }

Or not, because GCC knows that assert may be `noreturn`? In that case never mind.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18751#discussion_r1562338233

From shade at openjdk.org Fri Apr 12 10:10:42 2024
From: shade at openjdk.org (Aleksey Shipilev)
Date: Fri, 12 Apr 2024 10:10:42 GMT
Subject: RFR: 8328934: Assert that ABS input and output are legal [v3]
In-Reply-To:
References: <8-hCiROYvQrGzO4liJ083d3M_6RSHLvjr4oTbml66ME=.30c3eb9a-e309-4d7a-927c-aea5c180c0ea@github.com>
Message-ID:

On Fri, 12 Apr 2024 10:02:06 GMT, Andrew Haley wrote:

>> This is why we can't have nice things. So the fix would be to check for overflow first, and the do the abs, like this?
>>
>>
>> template<typename T> inline T asserted_abs(T x, const char* file, int line) {
>>   if (std::is_integral<T>::value && x == std::numeric_limits<T>::min()) {
>> #ifdef ASSERT
>>     report_vm_error(file, line, "ABS: argument should not allow overflow");
>> #endif
>>     // Do not allow UB, return the overflowed value.
>>     return x;
>>   } else {
>>     T res = (x > 0) ? x : -x;
>> #ifdef ASSERT
>>     if (res < 0) {
>>       report_vm_error(file, line, "ABS: result should be non-negative");
>>     }
>> #endif
>>     return res;
>>   }
>> }
>
> Or not, because GCC knows that assert may be `noreturn`? In that case never mind.

Yeah, this is ugly, especially since we have to decide what do we return on overflow in product bits. I'd prefer not to do this.
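
For readers following the thread, the check-before-negate shape under discussion can be sketched in isolation. This is a minimal sketch and not the code from the PR: the name `checked_abs` and the `overflowed` out-parameter are hypothetical, and HotSpot's `report_vm_error` is replaced by a plain flag so that the example compiles on its own. The point it illustrates is only that `-x` is never evaluated for the most negative value, so the compiler has no undefined behavior to reason from.

```cpp
#include <cassert>
#include <limits>
#include <type_traits>

// Hypothetical stand-in for the helper discussed above; it is not the code
// from the PR. It returns |x| and never evaluates -x when that negation is
// not representable in T.
template <typename T>
inline T checked_abs(T x, bool* overflowed) {
  *overflowed = false;
  if (std::is_integral<T>::value && std::is_signed<T>::value &&
      x == std::numeric_limits<T>::min()) {
    // Most negative value of a signed integral type: -x would overflow.
    // Report it through the flag and hand the input back unchanged.
    *overflowed = true;
    return x;
  }
  return (x > 0) ? x : -x;
}

// Usage example: the ordinary cases and the overflow case.
inline void checked_abs_example() {
  bool ovf = false;
  assert(checked_abs(-5, &ovf) == 5 && !ovf);
  assert(checked_abs(7L, &ovf) == 7L && !ovf);
  assert(checked_abs(-2.5, &ovf) == 2.5 && !ovf);
  const int min_int = std::numeric_limits<int>::min();
  assert(checked_abs(min_int, &ovf) == min_int && ovf);
}
```

What to return in the overflow case remains the open design question raised above; the sketch simply returns the input and leaves the decision to the caller.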
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18751#discussion_r1562344322 From sgehwolf at openjdk.org Fri Apr 12 10:16:41 2024 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Fri, 12 Apr 2024 10:16:41 GMT Subject: RFR: 8327946: containers/docker/TestJFREvents.java fails when host kernel config vm.swappiness=0 after JDK-8325139 [v4] In-Reply-To: References: <5Q0X-rxAg9WKCnK-Qluu5hvyffsGwVgGJGRoA8XlBGs=.923c1bf8-e008-4af9-9929-6e5c1f2d5271@github.com> Message-ID: <7UiHS0UfgGBQs0w44kFgJKmGKbslpKyeALQxqN2MH1M=.ac447975-9ffb-4d19-8d02-3db27eb286ec@github.com> On Fri, 12 Apr 2024 09:47:55 GMT, SendaoYan wrote: >> Hi, >> >> According to the [docker document](https://docs.docker.com/config/containers/resource_constraints/#--memory-swappiness-details), the default value of --memory-swappiness is inherited from the host machine. So, when the the kernel config vm.swappiness=0 on the host machine, this testcase will fail, because of docker container can not use swap memory, the deafult value of --memory-swappiness is 0. >> >> When the host kernel config "vm.swappiness = 0", In order to run this testcase passed , there are three methods: >> >> 1. change `.shouldContain("totalSize = " + expectedTotalValue)` to `.shouldContain("totalSize = "`, which ignored the `expectedTotalValue`, because the `expectedTotalValue` could be 0(swap memroy is disable when --memory-swappiness=0) or could be 104857600(300MB-200MB=100MB), it depends on the host machine config `vm.swappiness` >> 2. Change the default `--memory-swappiness` 0 to non-zero, such as 60. >> 3. Change the host kernel config `vm.swappiness=0` to `vm.swappiness=60`. I think it's not a good idea. >> >> Maybe the 2rd method seems more resonable. >> >> >> Thanks, >> -sendao > > SendaoYan has updated the pull request incrementally with one additional commit since the last revision: > > use Metrics.class get the CgroupV1 information Better. I don't think we need to duplicate that much of the test. 
All we'd need to do is to add the `--memory-swappiness` option only if we are on cgroup v1:

opts.addDockerOpts("--memory=" + memValueToSet)
    .addDockerOpts("--memory-swap=" + swapValueToSet)
    .addClassOptions("jdk.SwapSpace");
if (isCgroupV1) {
    // With cgroup v1, the default memory-swappiness value is inherited
    // from the host machine, which may be 0.
    opts.addDockerOpts("--memory-swappiness=60");
}
out = DockerTestUtils.dockerRunJava(opts);
...

-------------

PR Review: https://git.openjdk.org/jdk/pull/18225#pullrequestreview-1996412404

From matthias.baesken at sap.com Fri Apr 12 10:33:51 2024
From: matthias.baesken at sap.com (Baesken, Matthias)
Date: Fri, 12 Apr 2024 10:33:51 +0000
Subject: RFO: a tool to analyze HotSpot fatal error logs
In-Reply-To:
References:
Message-ID:

>You can go to the declaration of the PasswordAuthentication class when looking at things like
>Event: 112.303 loading class java/net/PasswordAuthentication

That sounds good! A colleague just today (in another context) pointed out the idea to have an option to select all the hserr event log sections into a single log in chronological order. That would probably also be something this tool could do (or is it already implemented?).

>I haven't thought of opening source files via links from the log itself since I rarely see them there (mostly assertion failures), but it's easy enough to implement.

You see them in the assertion failures or guarantees. But also in the native stacks (even with line numbers on some platforms).
See for example:

# Internal Error (/openjdk-22u-linux_aarch64-dbg/jdk/src/hotspot/share/prims/jvmtiRawMonitor.cpp:174), pid=3474004, tid=3474024
# guarantee(w ->_t_state == QNode::TS_ENTER) failed: invariant
#
# JRE version: OpenJDK Runtime Environment (22.0.1) (fastdebug build 22.0.1-internal-adhoc.jenkinsi.jdk)
# Java VM: OpenJDK 64-Bit Server VM (fastdebug 22.0.1-internal-adhoc.jenkinsi.jdk, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-aarch64)
# Problematic frame:
# V [libjvm.so+0x10d1520] JvmtiRawMonitor::simple_exit(Thread*)+0x18c
#

--------------- T H R E A D ---------------

Current thread (0x0000ffff38001310): JavaThread "JDWP Command Reader" daemon [_thread_in_native, id=3474024, stack(0x0000ffff56d92000,0x0000ffff56f90000) (2040K)]

Stack: [0x0000ffff56d92000,0x0000ffff56f90000], sp=0x0000ffff56f8e600, free space=2033k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0x10d1520] JvmtiRawMonitor::simple_exit(Thread*)+0x18c (jvmtiRawMonitor.cpp:174)
V [libjvm.so+0x10d2b84] JvmtiRawMonitor::raw_exit(Thread*)+0x44 (jvmtiRawMonitor.cpp:367)
V [libjvm.so+0x10993ac] JvmtiEnv::RawMonitorExit(JvmtiRawMonitor*)+0xf8 (jvmtiEnv.cpp:3629)
C [libjdwp.so+0x2c418] debugMonitorExit+0x38 (util.c:1023)
C [libjdwp.so+0x13840] reader+0xe0 (debugLoop.c:288)
V [libjvm.so+0x10c9ee0] JvmtiAgentThread::call_start_function()+0x60 (jvmtiImpl.cpp:89)
V [libjvm.so+0xddd6f0] JavaThread::thread_main_inner()+0xec (javaThread.cpp:721)
V [libjvm.so+0x1756bc4] Thread::call_run()+0xb0 (thread.cpp:221)
V [libjvm.so+0x13aea68] thread_native_entry(Thread*)+0x138 (os_linux.cpp:789)
C [libc.so.6+0x82a38] start_thread+0x2d4

For the failing assertions / guarantees it might also be helpful to augment some info about the assertion.
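
The `(file:line)` suffix on the native frames above is easy to pick out mechanically, which is what a source link would need. A minimal sketch, assuming frames end with exactly that suffix when it is present; `SourceLocation` and `parse_frame_location` are hypothetical names and are not taken from the tool being discussed:

```cpp
#include <cassert>
#include <cctype>
#include <cstdlib>
#include <string>

// Source position taken from the end of a native frame line.
struct SourceLocation {
  std::string file;
  int line;
};

// Extracts a trailing "(file:line)" from a frame line such as
//   V [libjvm.so+0x10d1520] JvmtiRawMonitor::simple_exit(Thread*)+0x18c (jvmtiRawMonitor.cpp:174)
// Returns false when the line carries no such suffix.
inline bool parse_frame_location(const std::string& frame, SourceLocation* out) {
  if (frame.empty() || frame.back() != ')') return false;
  const std::string::size_type open = frame.rfind('(');
  const std::string::size_type colon = frame.rfind(':');
  if (open == std::string::npos || colon == std::string::npos || colon < open) {
    return false;  // e.g. "A::f(Thread*)": the last ':' precedes the last '('
  }
  const std::string file = frame.substr(open + 1, colon - open - 1);
  // Everything between the ':' and the closing ')'.
  const std::string digits = frame.substr(colon + 1, frame.size() - colon - 2);
  if (file.empty() || digits.empty() || digits.size() > 9) return false;
  for (char c : digits) {
    if (!std::isdigit(static_cast<unsigned char>(c))) return false;
  }
  out->file = file;
  out->line = std::atoi(digits.c_str());
  return true;
}

// Usage example with two frames from the log above.
inline void parse_frame_location_example() {
  SourceLocation loc;
  assert(parse_frame_location(
      "C [libjdwp.so+0x2c418] debugMonitorExit+0x38 (util.c:1023)", &loc));
  assert(loc.file == "util.c" && loc.line == 1023);
  assert(!parse_frame_location("C [libc.so.6+0x82a38] start_thread+0x2d4", &loc));
}
```

Only the bare file name is available in the frame, so mapping it back to a path in the source tree would still need a lookup on the tool's side.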

Best regards, Matthias

From: Maxim Kartashev
Sent: Friday, 12 April 2024 10:48
To: Baesken, Matthias
Cc: discuss at openjdk.org; hotspot-dev at openjdk.org; Doerr, Martin; Langer, Christoph
Subject: Re: RFO: a tool to analyze HotSpot fatal error logs

> Does the tool work both with hserr files ("HotSpot fatal error logs") and also the output of jcmd VM.info?

Yes, since the latter is more or less a short version of the former.

> How well can it handle incomplete hserr files (we sometimes see those in case of bad crashes)?

As well as can be expected; some of the crashes are naturally redacted because of induced crashes and we're taking whatever information is there. We're also processing crashes from many JVM versions, each of which introduces its own variance to the content and format of the log. There are many heuristics in the parsing, so it can yield incorrect results (rarely), but it also makes parsing quite stable in the sense that practically any log a human can read the tool can also read.

> Can the tool "mix in" / augment additional information into the views of the error log (like a bit of source code or links into the stack traces for example)?

You can go to the declaration of the PasswordAuthentication class when looking at things like
Event: 112.303 loading class java/net/PasswordAuthentication
I haven't thought of opening source files via links from the log itself since I rarely see them there (mostly assertion failures), but it's easy enough to implement.

> I thought about creating a similar tool myself in the past, but it did not happen so far.

I think this is the story for many on the list. I've seen enough people on YouTube turning to ad hoc scripts to make sense of addresses in the logs.

From rehn at openjdk.org Fri Apr 12 10:46:01 2024
From: rehn at openjdk.org (Robbin Ehn)
Date: Fri, 12 Apr 2024 10:46:01 GMT
Subject: RFR: 8330156: RISC-V: Range check auipc + signed 12 imm instruction
Message-ID:

Hi, please consider!

Today we check whether the distance fits in a signed 32-bit value. As the second instruction has a sign bit + 11 bits, the maximum reach of such a pair is shorter.

Sanity tested

-------------

Commit messages:
- Check true range

Changes: https://git.openjdk.org/jdk/pull/18755/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18755&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8330156
Stats: 11 lines in 2 files changed: 6 ins; 0 del; 5 mod
Patch: https://git.openjdk.org/jdk/pull/18755.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/18755/head:pull/18755

PR: https://git.openjdk.org/jdk/pull/18755

From mbaesken at openjdk.org Fri Apr 12 11:04:54 2024
From: mbaesken at openjdk.org (Matthias Baesken)
Date: Fri, 12 Apr 2024 11:04:54 GMT
Subject: RFR: 8329605: hs errfile generic events - introduce sections for Frequent/NotFrequent Events [v4]
In-Reply-To: <5GN6AKI0ud3DgU7-RX2-12eu87Me8jhzKXA-L8BwR04=.384ddd36-1a8f-40ac-9387-5d8d97c37fe3@github.com>
References: <5GN6AKI0ud3DgU7-RX2-12eu87Me8jhzKXA-L8BwR04=.384ddd36-1a8f-40ac-9387-5d8d97c37fe3@github.com>
Message-ID:

> Currently the 'generic' hs_errfile Events message log (filled by Events::log) is rather flooded by messages for memory protection operations. Those seem to occur quite often and move out other less frequent events, because the number of entries in the log is limited.
> It might be better to separate the frequent and less frequent events into 2 sections. The memory protection events would go into the frequent events section.
> The mentioned memory protection operations related entries look like this : > Event: 0.178 Protecting memory [0x000000016ebf0000,0x000000016ebfc000] with protection modes 0 Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: add memprotect section ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18626/files - new: https://git.openjdk.org/jdk/pull/18626/files/7de2f9e5..912afc31 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18626&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18626&range=02-03 Stats: 13 lines in 5 files changed: 0 ins; 0 del; 13 mod Patch: https://git.openjdk.org/jdk/pull/18626.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18626/head:pull/18626 PR: https://git.openjdk.org/jdk/pull/18626 From mbaesken at openjdk.org Fri Apr 12 11:18:56 2024 From: mbaesken at openjdk.org (Matthias Baesken) Date: Fri, 12 Apr 2024 11:18:56 GMT Subject: RFR: 8329605: hs errfile generic events - move memory protections and nmethod flushes to separate sections [v5] In-Reply-To: <5GN6AKI0ud3DgU7-RX2-12eu87Me8jhzKXA-L8BwR04=.384ddd36-1a8f-40ac-9387-5d8d97c37fe3@github.com> References: <5GN6AKI0ud3DgU7-RX2-12eu87Me8jhzKXA-L8BwR04=.384ddd36-1a8f-40ac-9387-5d8d97c37fe3@github.com> Message-ID: > Currently the 'generic' hs_errfile Events message log (filled by Events::log) is sometimes rather full by messages for memory protection operations and nmethod flushes. Those seem to occur quite often and potentially move out other less frequent events, because the number of entries in the log is limited. > It might be better to separate the events into separate sections. 
> > The mentioned memory protection operations related entries look like this : > Event: 0.178 Protecting memory [0x000000016ebf0000,0x000000016ebfc000] with protection modes 0 Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: use nmethodflushes not nmethodflushs ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18626/files - new: https://git.openjdk.org/jdk/pull/18626/files/912afc31..77f64c89 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18626&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18626&range=03-04 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18626.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18626/head:pull/18626 PR: https://git.openjdk.org/jdk/pull/18626 From stefank at openjdk.org Fri Apr 12 11:18:56 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 12 Apr 2024 11:18:56 GMT Subject: RFR: 8329605: hs errfile generic events - move memory protections and nmethod flushes to separate sections [v4] In-Reply-To: References: <5GN6AKI0ud3DgU7-RX2-12eu87Me8jhzKXA-L8BwR04=.384ddd36-1a8f-40ac-9387-5d8d97c37fe3@github.com> Message-ID: <2eTkjO46WUlzYLRQatfZnYoBEPevgecKyZX5P1RBsPc=.2275236b-5c7c-4f48-a524-9eef45fe450f@github.com> On Fri, 12 Apr 2024 11:04:54 GMT, Matthias Baesken wrote: >> Currently the 'generic' hs_errfile Events message log (filled by Events::log) is sometimes rather full by messages for memory protection operations and nmethod flushes. Those seem to occur quite often and potentially move out other less frequent events, because the number of entries in the log is limited. >> It might be better to separate the events into separate sections. 
>> >> The mentioned memory protection operations related entries look like this : >> Event: 0.178 Protecting memory [0x000000016ebf0000,0x000000016ebfc000] with protection modes 0 > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > add memprotect section Marked as reviewed by stefank (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18626#pullrequestreview-1996504831 From mbaesken at openjdk.org Fri Apr 12 11:18:56 2024 From: mbaesken at openjdk.org (Matthias Baesken) Date: Fri, 12 Apr 2024 11:18:56 GMT Subject: RFR: 8329605: hs errfile generic events - move memory protections and nmethod flushes to separate sections [v4] In-Reply-To: References: <5GN6AKI0ud3DgU7-RX2-12eu87Me8jhzKXA-L8BwR04=.384ddd36-1a8f-40ac-9387-5d8d97c37fe3@github.com> Message-ID: On Fri, 12 Apr 2024 11:04:54 GMT, Matthias Baesken wrote: >> Currently the 'generic' hs_errfile Events message log (filled by Events::log) is sometimes rather full by messages for memory protection operations and nmethod flushes. Those seem to occur quite often and potentially move out other less frequent events, because the number of entries in the log is limited. >> It might be better to separate the events into separate sections. >> >> The mentioned memory protection operations related entries look like this : >> Event: 0.178 Protecting memory [0x000000016ebf0000,0x000000016ebfc000] with protection modes 0 > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > add memprotect section I added a memprotect section. Hi Stefan, thanks for the review ! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18626#issuecomment-2051562503 PR Comment: https://git.openjdk.org/jdk/pull/18626#issuecomment-2051562772 From stuefe at openjdk.org Fri Apr 12 11:44:43 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 12 Apr 2024 11:44:43 GMT Subject: RFR: 8329605: hs errfile generic events - move memory protections and nmethod flushes to separate sections [v5] In-Reply-To: References: <5GN6AKI0ud3DgU7-RX2-12eu87Me8jhzKXA-L8BwR04=.384ddd36-1a8f-40ac-9387-5d8d97c37fe3@github.com> Message-ID: On Fri, 12 Apr 2024 11:18:56 GMT, Matthias Baesken wrote: >> Currently the 'generic' hs_errfile Events message log (filled by Events::log) is sometimes rather full by messages for memory protection operations and nmethod flushes. Those seem to occur quite often and potentially move out other less frequent events, because the number of entries in the log is limited. >> It might be better to separate the events into separate sections. >> >> The mentioned memory protection operations related entries look like this : >> Event: 0.178 Protecting memory [0x000000016ebf0000,0x000000016ebfc000] with protection modes 0 > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > use nmethodflushes not nmethodflushs Good, thank you ------------- Marked as reviewed by stuefe (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18626#pullrequestreview-1996577069 From sspitsyn at openjdk.org Fri Apr 12 12:00:55 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 12 Apr 2024 12:00:55 GMT Subject: RFR: 8329674: JvmtiEnvThreadState::reset_current_location function should use JvmtiHandshake [v3] In-Reply-To: References: Message-ID: > The internal JVM TI JvmtiHandshake and JvmtiUnitedHandshakeClosure classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. 
This enhancement is to refactor the JVM TI internal functions JvmtiEnvThreadState::reset_current_location on the base of JvmtiHandshake and JvmtiUnitedHandshakeClosure classes. > > Testing: > - Ran mach5 tiers 1-6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: remove unneded check for is_vthread_alive; do not call do_thread from do_vthread ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18630/files - new: https://git.openjdk.org/jdk/pull/18630/files/39717f37..86775376 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18630&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18630&range=01-02 Stats: 8 lines in 1 file changed: 0 ins; 5 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/18630.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18630/head:pull/18630 PR: https://git.openjdk.org/jdk/pull/18630 From coleenp at openjdk.org Fri Apr 12 12:19:47 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 12 Apr 2024 12:19:47 GMT Subject: RFR: 8329488: Move OopStorage code from safepoint cleanup and remove safepoint cleanup code [v4] In-Reply-To: References: Message-ID: On Tue, 9 Apr 2024 17:40:14 GMT, Coleen Phillimore wrote: >> This patch gives the ServiceThread a periodic wakeup (same as GuaranteedSafepointInterval) to check if it needs to clean out OopStorage blocks, and move the triggering of this cleaning out of the safepoint cleanup tasks. Since ICBuffer, StringTable and SymbolTable rehashing have moved, there's nothing that actually triggers the nop safepoint to do cleaning (except SafepointALot), so the OopStorage cleanup won't be triggered. >> >> With moving all of these out of the safepoint cleanup tasks, we can remove the code that sets up multiple threads to do safepoint cleanup. We can also remove the JFR events and logging that times safepoint cleanup, and a logging test. >> >> Tested with tier1-4. 
> > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > More comment updates. Thank you for your review and comments, Erik and David, and your review and help with this PR, Kim. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18375#issuecomment-2051648308 From coleenp at openjdk.org Fri Apr 12 12:19:48 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 12 Apr 2024 12:19:48 GMT Subject: Integrated: 8329488: Move OopStorage code from safepoint cleanup and remove safepoint cleanup code In-Reply-To: References: Message-ID: <4QkNq4g1mnGebHx2bObvaCna4BjxD9U0Jlbuf58qBkA=.6c63d1ab-da93-40a9-bfd6-7d86d085be19@github.com> On Tue, 19 Mar 2024 12:19:44 GMT, Coleen Phillimore wrote: > This patch gives the ServiceThread a periodic wakeup (same as GuaranteedSafepointInterval) to check if it needs to clean out OopStorage blocks, and move the triggering of this cleaning out of the safepoint cleanup tasks. Since ICBuffer, StringTable and SymbolTable rehashing have moved, there's nothing that actually triggers the nop safepoint to do cleaning (except SafepointALot), so the OopStorage cleanup won't be triggered. > > With moving all of these out of the safepoint cleanup tasks, we can remove the code that sets up multiple threads to do safepoint cleanup. We can also remove the JFR events and logging that times safepoint cleanup, and a logging test. > > Tested with tier1-4. This pull request has now been integrated. 
Changeset: 3e9c3811 Author: Coleen Phillimore URL: https://git.openjdk.org/jdk/commit/3e9c3811669196945d7227affc28728670a256c5 Stats: 291 lines in 14 files changed: 10 ins; 237 del; 44 mod 8329488: Move OopStorage code from safepoint cleanup and remove safepoint cleanup code Reviewed-by: kbarrett, eosterlund ------------- PR: https://git.openjdk.org/jdk/pull/18375 From mli at openjdk.org Fri Apr 12 12:21:02 2024 From: mli at openjdk.org (Hamlin Li) Date: Fri, 12 Apr 2024 12:21:02 GMT Subject: RFR: 8330094: RISC-V: Save and restore FCSR in the call stub Message-ID: Hi, Can you help to review this patch? As discussed at https://github.com/openjdk/jdk/pull/17745#discussion_r1558783467, we should do the similar thing as [JDK-8319973](https://bugs.openjdk.org/browse/JDK-8319973) on aarch64. Thanks! Tests running ... ------------- Commit messages: - remove csrwi(..., rne) - Initial commit Changes: https://git.openjdk.org/jdk/pull/18758/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18758&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8330094 Stats: 26 lines in 3 files changed: 13 ins; 9 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/18758.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18758/head:pull/18758 PR: https://git.openjdk.org/jdk/pull/18758 From maxim.kartashev at jetbrains.com Fri Apr 12 12:45:45 2024 From: maxim.kartashev at jetbrains.com (Maxim Kartashev) Date: Fri, 12 Apr 2024 16:45:45 +0400 Subject: RFO: a tool to analyze HotSpot fatal error logs In-Reply-To: References: Message-ID: > A colleague just today (in another context) pointed out the idea to have an option to select all the hserr event log sections into a single > Log with chronological order . That would probably also something this tool could do (or is it already implemented) . Not implemented as such, but certainly possible with some effort. If the tool is open-sourced such customization will be a lot easier on everybody. 
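
As a rough illustration of that chronological merge, here is a minimal sketch. It assumes every event section uses the same `Event: <seconds> <message>` line shape seen earlier in this thread and that the timestamps share one clock; the names `Event`, `parse_event` and `merge_sections` are hypothetical and are not taken from the tool.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <string>
#include <vector>

// One parsed event line: timestamp in seconds plus the original text.
struct Event {
  double time;
  std::string line;
};

// Parses "Event: <seconds> <message>"; returns false for any other line,
// for example a section header.
inline bool parse_event(const std::string& line, Event* out) {
  static const std::string prefix = "Event: ";
  if (line.compare(0, prefix.size(), prefix) != 0) return false;
  const char* begin = line.c_str() + prefix.size();
  char* end = nullptr;
  const double t = std::strtod(begin, &end);
  if (end == begin) return false;  // no number after the prefix
  out->time = t;
  out->line = line;
  return true;
}

// Collects the events of all sections and orders them by timestamp.
// stable_sort keeps the input order for events with equal timestamps.
inline std::vector<Event> merge_sections(
    const std::vector<std::vector<std::string>>& sections) {
  std::vector<Event> all;
  for (const auto& section : sections) {
    for (const auto& line : section) {
      Event e;
      if (parse_event(line, &e)) all.push_back(e);
    }
  }
  std::stable_sort(all.begin(), all.end(),
                   [](const Event& a, const Event& b) { return a.time < b.time; });
  return all;
}

// Usage example: two single-entry sections given in reverse time order.
inline void merge_sections_example() {
  const std::vector<std::vector<std::string>> sections = {
      {"Event: 112.303 loading class java/net/PasswordAuthentication"},
      {"Event: 0.178 Protecting memory [0x000000016ebf0000,0x000000016ebfc000] with protection modes 0"},
  };
  const std::vector<Event> merged = merge_sections(sections);
  assert(merged.size() == 2);
  assert(merged[0].time < merged[1].time);
}
```

A real implementation would probably also want to remember which section each entry came from, since that context is lost once the lines are interleaved.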
> You see them in the assertion failures or guarantees . But also in that native stacks (even with line numbers on some platforms) . Right. Having that in stacks is a relatively recent development, so we simply haven't caught up yet. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fyang at openjdk.org Fri Apr 12 12:56:44 2024 From: fyang at openjdk.org (Fei Yang) Date: Fri, 12 Apr 2024 12:56:44 GMT Subject: RFR: 8330156: RISC-V: Range check auipc + signed 12 imm instruction In-Reply-To: References: Message-ID: <2P25cnaLF4Xc3i8CY-RmNcQPKMCuArKx6krvqtQuvs4=.f5ad5f9a-94bc-4975-8f4e-b522fd5b76ca@github.com> On Fri, 12 Apr 2024 10:41:39 GMT, Robbin Ehn wrote: > Hi please consider! > > Today we check if the distance is a signed 32. > As the second instruction have sign bit + 11 bits the, max of such pair is shorter. > > Sanity tested Thanks for fixing this. I have a question about the range checking. src/hotspot/cpu/riscv/macroAssembler_riscv.hpp line 684: > 682: int64_t twoG = (2 * G); > 683: int64_t twoK = (2 * K); > 684: return x <= (twoG - twoK) && x >= (-twoG + twoK); As I remembered, the true range of RISC-V PC-relative addressing should be: [-2^31 - 2^11, 2^31 - 2^11). See [1]. [1] https://patchwork.kernel.org/project/linux-riscv/patch/20220131182145.236005-3-kernel at esmil.dk/ ------------- PR Review: https://git.openjdk.org/jdk/pull/18755#pullrequestreview-1996854823 PR Review Comment: https://git.openjdk.org/jdk/pull/18755#discussion_r1562508950 From rehn at openjdk.org Fri Apr 12 13:07:54 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 12 Apr 2024 13:07:54 GMT Subject: RFR: 8330161: RISC-V: Don't use C for Labels jumps Message-ID: Hi please consider! jal do not have C switch, we always use the full length instructions. But jalr have, in case of an unbound Label which is to far for jal we can emit c_jalr. When we bind the Label we can't patch the c_jalr. Sanity tested. 
------------- Commit messages: - Use IncompressibleRegion Changes: https://git.openjdk.org/jdk/pull/18761/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18761&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8330161 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18761.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18761/head:pull/18761 PR: https://git.openjdk.org/jdk/pull/18761 From ddong at openjdk.org Fri Apr 12 13:08:06 2024 From: ddong at openjdk.org (Denghui Dong) Date: Fri, 12 Apr 2024 13:08:06 GMT Subject: RFR: 8326012: JFR: Event for time to safepoint [v11] In-Reply-To: <68hS0kQgtDIk4ioAJj_r0_GLT6h0lcif6Daj6WRwxlI=.40c2a6e7-70a8-4954-bcde-9318ee311028@github.com> References: <68hS0kQgtDIk4ioAJj_r0_GLT6h0lcif6Daj6WRwxlI=.40c2a6e7-70a8-4954-bcde-9318ee311028@github.com> Message-ID: > There are now some JFR events related to safepoint. When time-to-safepoint (aka ttsp) is too long, these events could not be very helpful since based on them we cannot know which threads cause it and what those threads are doing. > > Users can use `-XX:+SafepointTimeout -XX:SafepointTimeoutDelay=100` to see the threads that don't reach safepoint in time but without stack traces. Using `-XX:+ AbortVMOnSafepointTimeout` can capture the stack traces but it crashes the process, hence it's not sensible to enable the flag in production. > > ~~This patch adds a new JFR event `EventSafepointTimeout` to record the threads that cause ttsp too long.~~ > > ~~This event includes two fields:~~ > > ~~- safepointId: the relevant safepoint id~~ > ~~- timeExceeded: the amount of time exceeding `SafepointTimeoutDelay` used by the thread to reach safepoint~~ > > ~~In the current version, this event records the stack of those problematic threads when they finally reach safepoint. 
Hence, there is a bias, but it's still helpful to deduce the root place.~~ > > A better implementation is to record a more accurate stack, but this will increase complexity. At the same time, the native stack may also be important for this problem, but it is not currently supported by JFR. > > Any input would be greatly appreciated. > > Testing: jdk/jdk/jfr Denghui Dong has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 21 commits: - Merge branch 'master' into JDK-8326012 - update - delete _entries when disabled - fix test failures - update - refactor - update - update - update - update - ... and 11 more: https://git.openjdk.org/jdk/compare/0f78d017...df58b055 ------------- Changes: https://git.openjdk.org/jdk/pull/17888/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17888&range=10 Stats: 360 lines in 12 files changed: 353 ins; 0 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/17888.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17888/head:pull/17888 PR: https://git.openjdk.org/jdk/pull/17888 From sgibbons at openjdk.org Fri Apr 12 13:46:46 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Fri, 12 Apr 2024 13:46:46 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v13] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> <2W-3EZqHS-07qzZ4RS72u33Hav0LMRfeIG4QPAyvk10=.8a35e043-6803-42d5-8ea0-bff5378a8c50@github.com> Message-ID: On Fri, 12 Apr 2024 00:14:29 GMT, Sandhya Viswanathan wrote: >> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: >> >> Addressing yet more review comments > > src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2504: > >> 2502: Label L_exit, L_fillQuadwords, L_fillDwords, L_fillBytes; >> 2503: >> 2504: setup_arg_regs(3); > > A comment stating the placement of dest, size, and byteVal after call to setup_arg_regs() 
would be very helpful. Done. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1562562649 From sgibbons at openjdk.org Fri Apr 12 13:46:47 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Fri, 12 Apr 2024 13:46:47 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v13] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> <2W-3EZqHS-07qzZ4RS72u33Hav0LMRfeIG4QPAyvk10=.8a35e043-6803-42d5-8ea0-bff5378a8c50@github.com> Message-ID: On Fri, 12 Apr 2024 00:25:34 GMT, Sandhya Viswanathan wrote: >> src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2521: >> >>> 2519: const Register byteVal = rdx; >>> 2520: >>> 2521: // Propagate byte to full Register >> >> The comment refers to lines 2524-2526, please move it down. > > Still continuing to look through StubGenerator::generate_unsafe_setmemory(), more comments to come. Thank you for your patience. Comment moved. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1562563155 From sgibbons at openjdk.org Fri Apr 12 13:46:48 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Fri, 12 Apr 2024 13:46:48 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v12] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: On Fri, 12 Apr 2024 00:23:33 GMT, Sandhya Viswanathan wrote: >> It would not be appropriate to add set memory marks to the existing _jbyte_fill as it is being used by other routines, and the effect of the mark will be very hard to track down (if any). >> >> Are you *sure* we want to do that? > > Yes we want to do that. It is all guarded by thread->doing_unsafe_access() which is only true when we are getting to this code from unsafe. Similar technique is used in copyMemory as well. Done. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1562573673 From aph at openjdk.org Fri Apr 12 14:03:43 2024 From: aph at openjdk.org (Andrew Haley) Date: Fri, 12 Apr 2024 14:03:43 GMT Subject: RFR: 8328934: Assert that ABS input and output are legal [v3] In-Reply-To: References: Message-ID: On Fri, 12 Apr 2024 09:59:42 GMT, Aleksey Shipilev wrote: > Caught some failures, which made me think we want richer diagnostics around this. With new version, we print stuff like: > > ``` > # Internal Error (/Users/shipilev/Work/shipilev-jdk/src/hotspot/share/opto/loopnode.cpp:2965), pid=32195, tid=27139 > # Error: ABS: argument should not allow overflow > ``` LOL, don't say you weren't warned! ;-) T res = (x < 0 && x != std::numeric_limits<T>::min()) ? -x : x; ------------- PR Comment: https://git.openjdk.org/jdk/pull/18751#issuecomment-2051820875 From mbaesken at openjdk.org Fri Apr 12 14:11:46 2024 From: mbaesken at openjdk.org (Matthias Baesken) Date: Fri, 12 Apr 2024 14:11:46 GMT Subject: Integrated: 8329605: hs errfile generic events - move memory protections and nmethod flushes to separate sections In-Reply-To: <5GN6AKI0ud3DgU7-RX2-12eu87Me8jhzKXA-L8BwR04=.384ddd36-1a8f-40ac-9387-5d8d97c37fe3@github.com> References: <5GN6AKI0ud3DgU7-RX2-12eu87Me8jhzKXA-L8BwR04=.384ddd36-1a8f-40ac-9387-5d8d97c37fe3@github.com> Message-ID: <2qFI6TfrgJgqpYgIcA-q0buFtZ5J6_sDtsrQ1SWypp8=.fe5bcbc5-47ad-4d97-a758-b386cd821210@github.com> On Thu, 4 Apr 2024 12:34:19 GMT, Matthias Baesken wrote: > Currently the 'generic' hs_errfile Events message log (filled by Events::log) is sometimes rather full of messages for memory protection operations and nmethod flushes. Those seem to occur quite often and potentially push out other less frequent events, because the number of entries in the log is limited. > It might be better to separate the events into separate sections.
> > The memory-protection entries mentioned look like this: > Event: 0.178 Protecting memory [0x000000016ebf0000,0x000000016ebfc000] with protection modes 0 This pull request has now been integrated. Changeset: 397d9483 Author: Matthias Baesken URL: https://git.openjdk.org/jdk/commit/397d94831033e91c7a849774bf4e80d8f1c8ec66 Stats: 40 lines in 6 files changed: 32 ins; 0 del; 8 mod 8329605: hs errfile generic events - move memory protections and nmethod flushes to separate sections Reviewed-by: lucy, stefank, stuefe ------------- PR: https://git.openjdk.org/jdk/pull/18626 From rehn at openjdk.org Fri Apr 12 14:16:42 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 12 Apr 2024 14:16:42 GMT Subject: RFR: 8330156: RISC-V: Range check auipc + signed 12 imm instruction In-Reply-To: <2P25cnaLF4Xc3i8CY-RmNcQPKMCuArKx6krvqtQuvs4=.f5ad5f9a-94bc-4975-8f4e-b522fd5b76ca@github.com> References: <2P25cnaLF4Xc3i8CY-RmNcQPKMCuArKx6krvqtQuvs4=.f5ad5f9a-94bc-4975-8f4e-b522fd5b76ca@github.com> Message-ID: <0UFJFDR70MZHXuZGNTZGnuKbC3wa-7wNUbULa9ZV7N8=.ea17a14f-f7ff-430e-a6fb-97570afa40af@github.com> On Fri, 12 Apr 2024 12:52:12 GMT, Fei Yang wrote: >> Hi please consider! >> >> Today we check if the distance is a signed 32. >> As the second instruction has a sign bit + 11 bits, the max of such a pair is shorter. >> >> Sanity tested > > src/hotspot/cpu/riscv/macroAssembler_riscv.hpp line 684: > >> 682: int64_t twoG = (2 * G); >> 683: int64_t twoK = (2 * K); >> 684: return x <= (twoG - twoK) && x >= (-twoG + twoK); > > As I remember, the true range of RISC-V PC-relative addressing should be: [-2^31 - 2^11, 2^31 - 2^11). See [1]. > > [1] https://patchwork.kernel.org/project/linux-riscv/patch/20220131182145.236005-3-kernel at esmil.dk/ Ah, yes, thanks. I missed flipping the sign and was thinking in semi-32-bit. So the maximum auipc is 0x80..0, which is -2^31; if we subtract from this we would overflow on 32-bit.
But since we are subtracting from a 64-bit register, the reach is actually larger than min int32. I'll fix! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18755#discussion_r1562614201 From aph at openjdk.org Fri Apr 12 14:38:17 2024 From: aph at openjdk.org (Andrew Haley) Date: Fri, 12 Apr 2024 14:38:17 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v13] In-Reply-To: References: Message-ID: > This PR is a redesign of subtype checking. > > The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. > > So what's changed, so that the old design should be replaced? > > Firstly, the computers of today aren't the computers of twenty years ago. It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. > > The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces). Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. > > Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch. > > However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions. > > The solution > ------------ > > We use a hashed lookup of secondary supers.
This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. > > We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bitmap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. If the bit is set, we have to do a little arithmetic and then consult the hash table. > > It works like this: > > > mov sub_klass, [& sub_klass-... Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: JDK-8180450: secondary_super_cache does not scale well ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18309/files - new: https://git.openjdk.org/jdk/pull/18309/files/1518b028..404d99a1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18309&range=12 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18309&range=11-12 Stats: 5 lines in 2 files changed: 3 ins; 1 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18309.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18309/head:pull/18309 PR: https://git.openjdk.org/jdk/pull/18309 From snazarki at openjdk.org Fri Apr 12 14:44:52 2024 From: snazarki at openjdk.org (Sergey Nazarkin) Date: Fri, 12 Apr 2024 14:44:52 GMT Subject: RFR: 8330171: Lazy W^X swtich implementation Message-ID: <9eymaXovxUNFdkAkzojFQP5trwl_yyY0jE2GzcMEjR4=.02ee2ef9-c476-4c7c-9e4a-e021425c38bc@github.com> An alternative for preemptively switching the W^X thread mode
on macOS with an AArch64 CPU. This implementation triggers the switch in response to the SIGBUS signal if the *si_addr* belongs to the CodeCache area. With this approach, it is now feasible to eliminate all WX guards and avoid potentially costly operations. However, no significant improvement or degradation in performance has been observed. Additionally, considering the issue with AsyncGetCallTrace, the patched JVM has been successfully operated with [asgct_bottom](https://github.com/parttimenerd/asgct_bottom) and [async-profiler](https://github.com/async-profiler/async-profiler). Additional testing: - [x] MacOS AArch64 server fastdebug *gtest* - [ ] MacOS AArch64 server fastdebug *jtreg:hotspot:tier4* - [ ] Benchmarking @apangin and @parttimenerd could you please check the patch on your scenarios? ------------- Commit messages: - Fix non-macos builds - Revert "8304725: AsyncGetCallTrace can cause SIGBUS on M1" - Remove ThreadWXEnable guard - Lazy W^X state switch Changes: https://git.openjdk.org/jdk/pull/18762/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18762&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8330171 Stats: 325 lines in 46 files changed: 17 ins; 305 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/18762.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18762/head:pull/18762 PR: https://git.openjdk.org/jdk/pull/18762 From vkempik at openjdk.org Fri Apr 12 14:53:44 2024 From: vkempik at openjdk.org (Vladimir Kempik) Date: Fri, 12 Apr 2024 14:53:44 GMT Subject: RFR: 8330171: Lazy W^X swtich implementation In-Reply-To: <9eymaXovxUNFdkAkzojFQP5trwl_yyY0jE2GzcMEjR4=.02ee2ef9-c476-4c7c-9e4a-e021425c38bc@github.com> References: <9eymaXovxUNFdkAkzojFQP5trwl_yyY0jE2GzcMEjR4=.02ee2ef9-c476-4c7c-9e4a-e021425c38bc@github.com> Message-ID: On Fri, 12 Apr 2024 14:40:05 GMT, Sergey Nazarkin wrote: > An alternative for preemptively switching the W^X thread mode on macOS with an AArch64 CPU.
This implementation triggers the switch in response to the SIGBUS signal if the *si_addr* belongs to the CodeCache area. With this approach, it is now feasible to eliminate all WX guards and avoid potentially costly operations. However, no significant improvement or degradation in performance has been observed. Additionally, considering the issue with AsyncGetCallTrace, the patched JVM has been successfully operated with [asgct_bottom](https://github.com/parttimenerd/asgct_bottom) and [async-profiler](https://github.com/async-profiler/async-profiler). > > Additional testing: > - [x] MacOS AArch64 server fastdebug *gtest* > - [ ] MacOS AArch64 server fastdebug *jtreg:hotspot:tier4* > - [ ] Benchmarking > > @apangin and @parttimenerd could you please check the patch on your scenarios? Hello Sergey. W^X mode was initially forced by Apple to prevent writing to executable memory, as a security feature. This change just eliminates this security feature altogether, doesn't it? Basically: "want to write to executable memory? ok, here you go" ------------- PR Comment: https://git.openjdk.org/jdk/pull/18762#issuecomment-2051910824 From aph at openjdk.org Fri Apr 12 14:58:47 2024 From: aph at openjdk.org (Andrew Haley) Date: Fri, 12 Apr 2024 14:58:47 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v12] In-Reply-To: References: Message-ID: On Fri, 12 Apr 2024 02:23:50 GMT, Vladimir Ivanov wrote: > Testing results (hs-tier1 - hs-tier6) are clean (w/ `-XX:-InlineSecondarySupersTest` and `-XX:+InlineSecondarySupersTest`). > > There's one build failure in GHA (minimal VM build on linux-x64) because InlineSecondarySupersTest is a C2-only flag. Guarding the code with `#ifdef COMPILER2` should fix it. OK > Also, since the stubs are for compiler usage, any particular reason to generate them in `StubGenerator::generate_final_stubs()`? `StubGenerator::generate_compiler_stubs()` looks like a better fit for the job.
At some point in the hopefully-not-very-distant future I'd like to kill secondary_super_cache altogether, and then everyone will use the stubs. But I can move the stubs for now, if you like? ------------- PR Comment: https://git.openjdk.org/jdk/pull/18309#issuecomment-2051923475 From yzheng at openjdk.org Fri Apr 12 15:06:50 2024 From: yzheng at openjdk.org (Yudi Zheng) Date: Fri, 12 Apr 2024 15:06:50 GMT Subject: RFR: 8330105: SharedRuntime::resolve* should respect interpreter-only mode Message-ID: JavaThread::set_interp_only_mode may be called while a thread is blocked waiting for a JIT compilation to complete. When interpreter-only mode is set, we should dispatch to interpreter instead of the returned compiled code. ------------- Commit messages: - SharedRuntime::resolve* should respect interpreter-only mode Changes: https://git.openjdk.org/jdk/pull/18741/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18741&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8330105 Stats: 47 lines in 2 files changed: 12 ins; 30 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/18741.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18741/head:pull/18741 PR: https://git.openjdk.org/jdk/pull/18741 From never at openjdk.org Fri Apr 12 15:06:50 2024 From: never at openjdk.org (Tom Rodriguez) Date: Fri, 12 Apr 2024 15:06:50 GMT Subject: RFR: 8330105: SharedRuntime::resolve* should respect interpreter-only mode In-Reply-To: References: Message-ID: <_HkhTaQEmzvQztxN07pV4ah0TerRpFEEe4VdAA8PZ7U=.2cdbd08c-6a48-44c6-a177-87701c1fbb91@github.com> On Thu, 11 Apr 2024 13:50:25 GMT, Yudi Zheng wrote: > JavaThread::set_interp_only_mode may be called while a thread is blocked waiting for a JIT compilation to complete. When interpreter-only mode is set, we should dispatch to interpreter instead of the returned compiled code. This looks good to me. Any path that can directly dispatch to compiled code must respect is_interp_only_mode(). 
------------- Marked as reviewed by never (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18741#pullrequestreview-1994946752 From dlong at openjdk.org Fri Apr 12 15:06:50 2024 From: dlong at openjdk.org (Dean Long) Date: Fri, 12 Apr 2024 15:06:50 GMT Subject: RFR: 8330105: SharedRuntime::resolve* should respect interpreter-only mode In-Reply-To: References: Message-ID: On Thu, 11 Apr 2024 13:50:25 GMT, Yudi Zheng wrote: > JavaThread::set_interp_only_mode may be called while a thread is blocked waiting for a JIT compilation to complete. When interpreter-only mode is set, we should dispatch to interpreter instead of the returned compiled code. I agree, this is the right approach. ------------- Marked as reviewed by dlong (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18741#pullrequestreview-1995021670 From yzheng at openjdk.org Fri Apr 12 15:06:51 2024 From: yzheng at openjdk.org (Yudi Zheng) Date: Fri, 12 Apr 2024 15:06:51 GMT Subject: RFR: 8330105: SharedRuntime::resolve* should respect interpreter-only mode In-Reply-To: References: Message-ID: On Thu, 11 Apr 2024 13:50:25 GMT, Yudi Zheng wrote: > JavaThread::set_interp_only_mode may be called while a thread is blocked waiting for a JIT compilation to complete. When interpreter-only mode is set, we should dispatch to interpreter instead of the returned compiled code. @dean-long we are seeing the same issue as JDK-8218403 . Graal thread blocks on Xcomp of an invokespecial to a random huge JDK method while interpreter-only mode is set. Could we simply use c2i entries in such cases? /cc @tkrodriguez Thanks for the review! I ran jvmti and jdi-related tests on C2 and Graal. Only failures are timeouts due to known bugs. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18741#issuecomment-2049765962 PR Comment: https://git.openjdk.org/jdk/pull/18741#issuecomment-2051937479 From aph at openjdk.org Fri Apr 12 15:10:19 2024 From: aph at openjdk.org (Andrew Haley) Date: Fri, 12 Apr 2024 15:10:19 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v14] In-Reply-To: References: Message-ID: <3pJmRUuwQ_8y_uqDiaASd2YbpWOHv1MIWmhjTSL-Oj8=.677e4f4f-a0ea-4e35-aab8-d85ac42aa5ef@github.com> > This PR is a redesign of subtype checking. > > The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. > > So what's changed, so that the old design should be replaced? > > Firstly, the computers of today aren't the computers of twenty years ago. It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. > > The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces). Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. > > Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch. > > However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions. 
> > The solution > ------------ > > We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. > > We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bitmap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. If the bit is set, we have to do a little arithmetic and then consult the hash table. > > It works like this: > > > mov sub_klass, [& sub_klass-...
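The quoted description above (a 64-way table with linear probing, packed so that empty slots are dropped, plus an occupancy bitmap) can be sketched as follows. This is only an illustration under stated assumptions, not the code in the PR: `KlassId`, `home_slot`, `SecondarySupers`, `build` and `contains` are hypothetical names, the multiplicative hash is an arbitrary stand-in, and the real implementation works on `Klass*` values in generated stubs.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a Klass pointer: any nonzero 64-bit id.
using KlassId = std::uint64_t;

constexpr unsigned kSlots = 64;

// Hypothetical hash: the top 6 bits of a multiplicative hash pick the home slot.
inline unsigned home_slot(KlassId k) {
  return static_cast<unsigned>((k * 0x9E3779B97F4A7C15ull) >> 58);
}

// Portable population count (number of set bits).
inline unsigned popcount64(std::uint64_t v) {
  unsigned n = 0;
  for (; v != 0; v &= v - 1) ++n;
  return n;
}

struct SecondarySupers {
  std::uint64_t bitmap = 0;     // bit i set <=> slot i of the 64-way table is occupied
  std::vector<KlassId> packed;  // occupied slots only, in slot order
};

// Hash into a 64-slot table with linear probing, then drop the empty slots.
SecondarySupers build(const std::vector<KlassId>& supers) {
  assert(supers.size() <= kSlots);
  KlassId table[kSlots] = {};   // 0 means "empty"
  for (KlassId k : supers) {
    unsigned s = home_slot(k);
    while (table[s] != 0) s = (s + 1) % kSlots;
    table[s] = k;
  }
  SecondarySupers out;
  for (unsigned s = 0; s < kSlots; ++s) {
    if (table[s] != 0) {
      out.bitmap |= std::uint64_t{1} << s;
      out.packed.push_back(table[s]);
    }
  }
  return out;
}

bool contains(const SecondarySupers& ss, KlassId k) {
  unsigned s = home_slot(k);
  for (unsigned probes = 0; probes < kSlots; ++probes) {
    const std::uint64_t bit = std::uint64_t{1} << s;
    // A clear bit means an empty slot: k cannot be present.
    if ((ss.bitmap & bit) == 0) return false;
    // The "little arithmetic": the packed index is the number of occupied slots below s.
    const unsigned idx = popcount64(ss.bitmap & (bit - 1));
    if (ss.packed[idx] == k) return true;
    s = (s + 1) % kSlots;       // collision: continue the linear probe
  }
  return false;                 // table is full and k is not in it
}
```

In the common negative case a single bit test answers the query, which is the Bloom-filter-like behaviour the description mentions; a code path that only knows how to scan `packed` linearly still sees every entry.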
Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: JDK-8180450: secondary_super_cache does not scale well ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18309/files - new: https://git.openjdk.org/jdk/pull/18309/files/404d99a1..02b1837e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18309&range=13 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18309&range=12-13 Stats: 19 lines in 1 file changed: 10 ins; 9 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18309.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18309/head:pull/18309 PR: https://git.openjdk.org/jdk/pull/18309 From syan at openjdk.org Fri Apr 12 15:27:52 2024 From: syan at openjdk.org (SendaoYan) Date: Fri, 12 Apr 2024 15:27:52 GMT Subject: RFR: 8327946: containers/docker/TestJFREvents.java fails when host kernel config vm.swappiness=0 after JDK-8325139 [v5] In-Reply-To: <5Q0X-rxAg9WKCnK-Qluu5hvyffsGwVgGJGRoA8XlBGs=.923c1bf8-e008-4af9-9929-6e5c1f2d5271@github.com> References: <5Q0X-rxAg9WKCnK-Qluu5hvyffsGwVgGJGRoA8XlBGs=.923c1bf8-e008-4af9-9929-6e5c1f2d5271@github.com> Message-ID: <5CfWlPKTQYn_C-qfExFXR94T9JT3jO78qvXZZm3vFYk=.c9f2152a-a45d-4b20-bb2c-eef314ed53eb@github.com> > Hi, > > According to the [docker document](https://docs.docker.com/config/containers/resource_constraints/#--memory-swappiness-details), the default value of --memory-swappiness is inherited from the host machine. So, when the the kernel config vm.swappiness=0 on the host machine, this testcase will fail, because of docker container can not use swap memory, the deafult value of --memory-swappiness is 0. > > When the host kernel config "vm.swappiness = 0", In order to run this testcase passed , there are three methods: > > 1. 
change `.shouldContain("totalSize = " + expectedTotalValue)` to `.shouldContain("totalSize = "`, which ignored the `expectedTotalValue`, because the `expectedTotalValue` could be 0(swap memroy is disable when --memory-swappiness=0) or could be 104857600(300MB-200MB=100MB), it depends on the host machine config `vm.swappiness` > 2. Change the default `--memory-swappiness` 0 to non-zero, such as 60. > 3. Change the host kernel config `vm.swappiness=0` to `vm.swappiness=60`. I think it's not a good idea. > > Maybe the 2rd method seems more resonable. > > > Thanks, > -sendao SendaoYan has updated the pull request incrementally with one additional commit since the last revision: 1. if (isCgroupV1) only contains opts.addDockerOpts("--memory-swappiness=60"); 2. delete extra space at the beginning of the line in testSwapMemory ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18225/files - new: https://git.openjdk.org/jdk/pull/18225/files/bd144b73..4e5ea663 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18225&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18225&range=03-04 Stats: 40 lines in 1 file changed: 5 ins; 13 del; 22 mod Patch: https://git.openjdk.org/jdk/pull/18225.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18225/head:pull/18225 PR: https://git.openjdk.org/jdk/pull/18225 From aph at openjdk.org Fri Apr 12 15:28:43 2024 From: aph at openjdk.org (Andrew Haley) Date: Fri, 12 Apr 2024 15:28:43 GMT Subject: RFR: 8330171: Lazy W^X switch implementation In-Reply-To: References: <9eymaXovxUNFdkAkzojFQP5trwl_yyY0jE2GzcMEjR4=.02ee2ef9-c476-4c7c-9e4a-e021425c38bc@github.com> Message-ID: <88Jsd_RmZ8QTcODe6MsTx2j54J8Dk6dJX-ZUpVIdxVs=.abd71be6-dba9-4851-9f93-009858d0c175@github.com> On Fri, 12 Apr 2024 14:50:46 GMT, Vladimir Kempik wrote: > Hello Sergey. W^X mode was initially forced by Apple to prevent writing to executable memory, as a security feature. This change just eliminates this security feature at all, doesn't it ? 
Basically: "want to write to Executable memory ? ok, here you go" Yes @VladimirKempik, you are right. No, we should not do this. Instead, when we enter the VM we could track the current state of W^X and whenever we enter a block that needs to write into code space we would set W if needed. When we leave the VM or when we call back into Java we would set X, if needed. The cost of doing this would be small, but we'd have to find all the blocks that need to write into code space. This might be more effort than we want to make, though. So where would be need to make the transitions to W? At a guess, whenever we start assembling something, and in all of the methods in nativeInst_aarch64.?pp, and in class Patcher. And to X, in the call stub and a few other places. That would minimize the transitions exactly to the set of places we actually need. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18762#issuecomment-2051977752 From aph at openjdk.org Fri Apr 12 15:30:46 2024 From: aph at openjdk.org (Andrew Haley) Date: Fri, 12 Apr 2024 15:30:46 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v12] In-Reply-To: References: Message-ID: On Fri, 12 Apr 2024 14:56:28 GMT, Andrew Haley wrote: > But I can move the call to the code that generates the stubs for now, if you like? Done. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18309#issuecomment-2051980850 From vkempik at openjdk.org Fri Apr 12 15:32:41 2024 From: vkempik at openjdk.org (Vladimir Kempik) Date: Fri, 12 Apr 2024 15:32:41 GMT Subject: RFR: 8330171: Lazy W^X switch implementation In-Reply-To: <9eymaXovxUNFdkAkzojFQP5trwl_yyY0jE2GzcMEjR4=.02ee2ef9-c476-4c7c-9e4a-e021425c38bc@github.com> References: <9eymaXovxUNFdkAkzojFQP5trwl_yyY0jE2GzcMEjR4=.02ee2ef9-c476-4c7c-9e4a-e021425c38bc@github.com> Message-ID: On Fri, 12 Apr 2024 14:40:05 GMT, Sergey Nazarkin wrote: > An alternative for preemptively switching the W^X thread mode on macOS with an AArch64 CPU. 
This implementation triggers the switch in response to the SIGBUS signal if the *si_addr* belongs to the CodeCache area. With this approach, it is now feasible to eliminate all WX guards and avoid potentially costly operations. However, no significant improvement or degradation in performance has been observed. Additionally, considering the issue with AsyncGetCallTrace, the patched JVM has been successfully operated with [asgct_bottom](https://github.com/parttimenerd/asgct_bottom) and [async-profiler](https://github.com/async-profiler/async-profiler). > > Additional testing: > - [x] MacOS AArch64 server fastdebug *gtets* > - [ ] MacOS AArch64 server fastdebug *jtreg:hotspot:tier4* > - [ ] Benchmarking > > @apangin and @parttimenerd could you please check the patch on your scenarios?? This can be left as an addition to existing mechanism. Disabled by default and can be enabled with a special (DEVELOP) option. So this can't be enabled on production builds, but can be useful to debug w^x issues on debug builds ------------- PR Comment: https://git.openjdk.org/jdk/pull/18762#issuecomment-2051984322 From syan at openjdk.org Fri Apr 12 15:32:42 2024 From: syan at openjdk.org (SendaoYan) Date: Fri, 12 Apr 2024 15:32:42 GMT Subject: RFR: 8327946: containers/docker/TestJFREvents.java fails when host kernel config vm.swappiness=0 after JDK-8325139 [v4] In-Reply-To: <7UiHS0UfgGBQs0w44kFgJKmGKbslpKyeALQxqN2MH1M=.ac447975-9ffb-4d19-8d02-3db27eb286ec@github.com> References: <5Q0X-rxAg9WKCnK-Qluu5hvyffsGwVgGJGRoA8XlBGs=.923c1bf8-e008-4af9-9929-6e5c1f2d5271@github.com> <7UiHS0UfgGBQs0w44kFgJKmGKbslpKyeALQxqN2MH1M=.ac447975-9ffb-4d19-8d02-3db27eb286ec@github.com> Message-ID: On Fri, 12 Apr 2024 10:14:33 GMT, Severin Gehwolf wrote: > Better. I don't think we need to duplicate that much of the test. 
All we'd need to do is to add the `--memory-swappiness` option only if we are on cg v1: > > ``` > opts.addDockerOpts("--memory=" + memValueToSet) > .addDockerOpts("--memory-swap=" + swapValueToSet) > .addClassOptions("jdk.SwapSpace")); > if (isCgroupV1) { > // With Cgroupv1, The default memory-swappiness vaule is inherited from the host machine, which maybe 0 > opts.addDockerOpts("--memory-swappiness=60"); > } > out = DockerTestUtils.dockerRunJava(opts); > ... > ``` Thank you very much for your very professional opinions. The modification of the review opinions has been completed. In addition, an extra space at the beginning of the line was deleted. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18225#issuecomment-2051985003 From pchilanomate at openjdk.org Fri Apr 12 15:49:44 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Fri, 12 Apr 2024 15:49:44 GMT Subject: RFR: 8329088: Stack chunk thawing races with concurrent GC stack iteration [v2] In-Reply-To: References: Message-ID: On Fri, 5 Apr 2024 09:35:22 GMT, Erik ?sterlund wrote: >> When we thaw the last frame from a stack chunk, we non-atomically set the stack pointer (sp), and set its argsize to 0. Unfortunately, GC threads may iterate over the frames of the stack chunk concurrently. When initializing their stack frame iterator, they read the sp and argsize racingly. Since there is no synchronization between the threads, we may observe inconsistent pairs of sp and argsize, for example the updated sp with a stale argsize, or the updated argsize with a stale sp. >> >> At the core of the problem, the stack chunks define sp and argsize. The argsize is used to calculate where the bottom of the stack chunk is, which is required to determine if it is empty or not. This patch proposes to switch things around and store the bottom directly in the chunk, instead of argsize. Instead, argsize is calculated from the bottom. 
By changing the relationship of which property is stored and which property is calculated, we can simplify this code quite a bit. >> >> In the new model, is_empty() is true iff sp and bottom are exactly the same. Bottom is only set during freezing, never during thawing. The bottom is initialized whenever the bottom frame is frozen, and left untouched during thawing. Unlike thawing, the freeze operation does not race with the GC by design. Hence we have moved one of the racy mutations to the operation that doesn't race with the GC. The GC is now only exposed to changing sp(). It doesn't matter if it observes the old or new sp(), now that we have removed the only source if inconsistency describing said frame (racing argsize). >> >> Testing: tier1-5, manual testing of test/jdk/jdk/internal/vm/Continuation > > Erik ?sterlund has updated the pull request incrementally with one additional commit since the last revision: > > Nits Thanks, looks good to me. Only a few comments. src/hotspot/share/oops/stackChunkOop.inline.hpp line 135: > 133: > 134: inline bool stackChunkOopDesc::is_empty() const { > 135: assert(sp() <= stack_size(), ""); Maybe keep assert(sp() <= bottom(), "")? src/hotspot/share/runtime/continuationFreezeThaw.cpp line 567: > 565: // Consider leaving the chunk's argsize set when emptying it and removing the following branch, > 566: // although that would require changing stackChunkOopDesc::is_empty > 567: if (!chunk->is_empty()) { Seems you have implemented the suggestion in the comment so we can remove this branch and unconditionally decrement total_size_needed. src/hotspot/share/runtime/continuationFreezeThaw.cpp line 631: > 629: chunk->set_max_thawing_size(cont_size()); > 630: chunk->set_bottom(chunk_start_sp - _cont.argsize()); > 631: chunk->set_sp(chunk->bottom()); Do we need to set sp? We didn't do it before. 
src/hotspot/share/runtime/continuationFreezeThaw.cpp line 662: > 660: // They'll then be stored twice: in the chunk and in the parent chunk's top frame > 661: const int chunk_start_sp = cont_size() + frame::metadata_words; > 662: assert(chunk_start_sp == chunk->stack_size(), ""); Isn't this assert still valid? ------------- PR Review: https://git.openjdk.org/jdk/pull/18643#pullrequestreview-1997313321 PR Review Comment: https://git.openjdk.org/jdk/pull/18643#discussion_r1562699073 PR Review Comment: https://git.openjdk.org/jdk/pull/18643#discussion_r1562687298 PR Review Comment: https://git.openjdk.org/jdk/pull/18643#discussion_r1562647851 PR Review Comment: https://git.openjdk.org/jdk/pull/18643#discussion_r1562654716 From shade at openjdk.org Fri Apr 12 16:07:41 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 12 Apr 2024 16:07:41 GMT Subject: RFR: 8328934: Assert that ABS input and output are legal [v3] In-Reply-To: References: Message-ID: <17gipHM6B5g7uDlXUwE1lpgXSPKbkOeZAPd60uiEzgY=.c29f487e-1871-493e-9555-faea3c995068@github.com> On Fri, 12 Apr 2024 14:00:49 GMT, Andrew Haley wrote: > > Caught some failures, which made me think we want richer diagnostics around this. With new version, we print stuff like: > > ``` > > # Internal Error (/Users/shipilev/Work/shipilev-jdk/src/hotspot/share/opto/loopnode.cpp:2965), pid=32195, tid=27139 > > # Error: ABS: argument should not allow overflow > > ``` > > LOL, don't say you weren't warned! ;-) > > ``` > T res = (x < 0 && x != std::numeric_limits<T>::min()) ? -x : x; > ``` I mean, we catch the proper error in some tests: https://bugs.openjdk.org/browse/JDK-8330158 Do we really need to do this `x != std::numeric_limits<T>::min()` dance here?
------------- PR Comment: https://git.openjdk.org/jdk/pull/18751#issuecomment-2052047842 From sgehwolf at openjdk.org Fri Apr 12 16:32:42 2024 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Fri, 12 Apr 2024 16:32:42 GMT Subject: RFR: 8327946: containers/docker/TestJFREvents.java fails when host kernel config vm.swappiness=0 after JDK-8325139 [v5] In-Reply-To: <5CfWlPKTQYn_C-qfExFXR94T9JT3jO78qvXZZm3vFYk=.c9f2152a-a45d-4b20-bb2c-eef314ed53eb@github.com> References: <5Q0X-rxAg9WKCnK-Qluu5hvyffsGwVgGJGRoA8XlBGs=.923c1bf8-e008-4af9-9929-6e5c1f2d5271@github.com> <5CfWlPKTQYn_C-qfExFXR94T9JT3jO78qvXZZm3vFYk=.c9f2152a-a45d-4b20-bb2c-eef314ed53eb@github.com> Message-ID: On Fri, 12 Apr 2024 15:27:52 GMT, SendaoYan wrote: >> Hi, >> >> According to the [docker document](https://docs.docker.com/config/containers/resource_constraints/#--memory-swappiness-details), the default value of --memory-swappiness is inherited from the host machine. So, when the the kernel config vm.swappiness=0 on the host machine, this testcase will fail, because of docker container can not use swap memory, the deafult value of --memory-swappiness is 0. >> >> When the host kernel config "vm.swappiness = 0", In order to run this testcase passed , there are three methods: >> >> 1. change `.shouldContain("totalSize = " + expectedTotalValue)` to `.shouldContain("totalSize = "`, which ignored the `expectedTotalValue`, because the `expectedTotalValue` could be 0(swap memroy is disable when --memory-swappiness=0) or could be 104857600(300MB-200MB=100MB), it depends on the host machine config `vm.swappiness` >> 2. Change the default `--memory-swappiness` 0 to non-zero, such as 60. >> 3. Change the host kernel config `vm.swappiness=0` to `vm.swappiness=60`. I think it's not a good idea. >> >> Maybe the 2rd method seems more resonable. >> >> >> Thanks, >> -sendao > > SendaoYan has updated the pull request incrementally with one additional commit since the last revision: > > 1. 
if (isCgroupV1) only contains opts.addDockerOpts("--memory-swappiness=60"); 2. delete extra space at the beginning of the line in testSwapMemory Looks OK to me. If you merge latest master, GHA should be clean too. ------------- Marked as reviewed by sgehwolf (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18225#pullrequestreview-1997885037 PR Comment: https://git.openjdk.org/jdk/pull/18225#issuecomment-2052088389 From pchilanomate at openjdk.org Fri Apr 12 16:37:42 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Fri, 12 Apr 2024 16:37:42 GMT Subject: RFR: 8329674: JvmtiEnvThreadState::reset_current_location function should use JvmtiHandshake [v3] In-Reply-To: References: Message-ID: On Fri, 12 Apr 2024 12:00:55 GMT, Serguei Spitsyn wrote: >> The internal JVM TI JvmtiHandshake and JvmtiUnitedHandshakeClosure classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. This enhancement is to refactor the JVM TI internal functions JvmtiEnvThreadState::reset_current_location on the base of JvmtiHandshake and JvmtiUnitedHandshakeClosure classes. >> >> Testing: >> - Ran mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: remove unneded check for is_vthread_alive; do not call do_thread from do_vthread Thanks Serguei, looks good to me. ------------- Marked as reviewed by pchilanomate (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18630#pullrequestreview-1997892115 From pchilanomate at openjdk.org Fri Apr 12 16:37:42 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Fri, 12 Apr 2024 16:37:42 GMT Subject: RFR: 8329674: JvmtiEnvThreadState::reset_current_location function should use JvmtiHandshake [v2] In-Reply-To: <6hM-85WUal6tHr5lcHP021vbHFdWG-RAeJrRv-dxBw0=.914dc93d-bea2-40f0-b853-b5b8c8009c41@github.com> References: <10EU-jvOhZaur5uqCtnBJVodhqV8MKLzfI7IGBfo0cg=.348e71d1-0394-41fe-b511-3f3d7a35713c@github.com> <6hM-85WUal6tHr5lcHP021vbHFdWG-RAeJrRv-dxBw0=.914dc93d-bea2-40f0-b853-b5b8c8009c41@github.com> Message-ID: On Fri, 12 Apr 2024 01:22:04 GMT, Serguei Spitsyn wrote: >> Good question, thanks. >> The `JvmtiVTMSTransitionDisabler` is supposed to be installed in the caller's context if needed. >> However, it is not easy to make sure it is always the case. >> At least, I see a couple of contexts when the `JvmtiVTMSTransitionDisabler` is not being installed. >> But it is not clear if it is really needed there. Let me do some extra analysis there. > > Okay. The class `GetCurrentLocationClosure` is used by the `reset_current_location` only. It is called for the SINGLE_STEP and REAKPOINT event types as the following assert is placed at the function start: > > void JvmtiEnvThreadState::reset_current_location(jvmtiEvent event_type, bool enabled) { > assert(event_type == JVMTI_EVENT_SINGLE_STEP || event_type == JVMTI_EVENT_BREAKPOINT, > "must be single-step or breakpoint event"); > . . . > > Also, this is the only two places where this function is called: > > JvmtiEventControllerPrivate::recompute_env_thread_enabled(JvmtiEnvThreadState* ets, JvmtiThreadState* state) { > . . . 
> if (changed & SINGLE_STEP_BIT) { > ets->reset_current_location(JVMTI_EVENT_SINGLE_STEP, (now_enabled & SINGLE_STEP_BIT) != 0); > } > if (changed & BREAKPOINT_BIT) { > ets->reset_current_location(JVMTI_EVENT_BREAKPOINT, (now_enabled & BREAKPOINT_BIT) != 0); > } > > The `reset_current_location` is called called in the context of the `SetEventNotificationMode` where a JvmtiVTMSTransitionDisabler is present. > > Theoretically, it can be also triggered by the `SetEventCallbacks` (if callbacks are for SINGLE_STEP or REAKPOINT event type). But it also has a J`vmtiVTMSTransitionDisabler` in place: > > JvmtiEnv::SetEventCallbacks(const jvmtiEventCallbacks* callbacks, jint size_of_callbacks) { > JvmtiVTMSTransitionDisabler disabler; > JvmtiEventController::set_event_callbacks(this, callbacks, size_of_callbacks); > return JVMTI_ERROR_NONE; > } /* end SetEventCallbacks */ Thanks for the investigation! Maybe we should have an assert that current->is_VTMS_transition_disabler() here or even in the JvmtiHandshake::execute() that expects we have one in scope? I see that we have some conditions where JvmtiVTMSTransitionDisabler is a no-op though so we would have to include does as well. Or maybe set the boolean even when it is a no-op. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18630#discussion_r1562813538 From sgibbons at openjdk.org Fri Apr 12 16:47:58 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Fri, 12 Apr 2024 16:47:58 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v14] In-Reply-To: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: > This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. 
> > Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. > > Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). > > [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Even more review comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18555/files - new: https://git.openjdk.org/jdk/pull/18555/files/970c5751..6e731c86 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=13 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=12-13 Stats: 35 lines in 4 files changed: 14 ins; 13 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/18555.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18555/head:pull/18555 PR: https://git.openjdk.org/jdk/pull/18555 From sspitsyn at openjdk.org Fri Apr 12 17:07:43 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 12 Apr 2024 17:07:43 GMT Subject: RFR: 8329674: JvmtiEnvThreadState::reset_current_location function should use JvmtiHandshake [v3] In-Reply-To: References: Message-ID: On Fri, 12 Apr 2024 12:00:55 GMT, Serguei Spitsyn wrote: >> The internal JVM TI JvmtiHandshake and JvmtiUnitedHandshakeClosure classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. This enhancement is to refactor the JVM TI internal functions JvmtiEnvThreadState::reset_current_location on the base of JvmtiHandshake and JvmtiUnitedHandshakeClosure classes. >> >> Testing: >> - Ran mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: remove unneded check for is_vthread_alive; do not call do_thread from do_vthread > Thanks for the investigation! 
Maybe we should have an assert that current->is_VTMS_transition_disabler() here or even in the JvmtiHandshake::execute() that expects we have one in scope? I see that we have some conditions where JvmtiVTMSTransitionDisabler is a no-op though so we would have to include those as well. Or maybe set the boolean even when it is a no-op. I'm thinking about the same. It should be relatively easy to implement this check. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18630#issuecomment-2052144005 From duke at openjdk.org Fri Apr 12 17:12:52 2024 From: duke at openjdk.org (duke) Date: Fri, 12 Apr 2024 17:12:52 GMT Subject: Withdrawn: 8320750: Allow a testcase to run with multiple -Xlog In-Reply-To: References: Message-ID: On Mon, 27 Nov 2023 13:32:52 GMT, Leo Korinth wrote: > Running a testcase with multiple -Xlog crashes JTREG test cases. This is because `Collector.toMap` is not given a merge strategy. > > When the same argument is passed multiple times, I have added a merge strategy to use the latter value. This is similar to how it is implemented for `vm.opt.*` in JTREG. > > If the flag tested is `-Xlog`, replace the value part with a dummy value "NONEMPTY_TEST_SENTINEL". This is because in the case of multiple `-Xlog` all values are used, and JTREG does not give a satisfactory way to represent them. This dummy value should make it hard to try to `@require` on specific values by mistake. > > Tested with: > > @requires vm.opt.x.Xlog == "NONEMPTY_TEST_SENTINEL" > @requires vm.opt.x.Xlog == "NONEMPTY_TEST_SENTINELXXX" > @requires vm.opt.x.Xms == "3g" > > and > > JAVA_OPTIONS=-Xms3g -Xms4g > JAVA_OPTIONS=-Xms4g -Xms3g > JAVA_OPTIONS=-Xlog:gc* -Xlog:gc* > ``` > > Running tier1 This pull request has been closed without being integrated.
------------- PR: https://git.openjdk.org/jdk/pull/16824 From snazarki at openjdk.org Fri Apr 12 17:18:41 2024 From: snazarki at openjdk.org (Sergey Nazarkin) Date: Fri, 12 Apr 2024 17:18:41 GMT Subject: RFR: 8330171: Lazy W^X switch implementation In-Reply-To: <88Jsd_RmZ8QTcODe6MsTx2j54J8Dk6dJX-ZUpVIdxVs=.abd71be6-dba9-4851-9f93-009858d0c175@github.com> References: <9eymaXovxUNFdkAkzojFQP5trwl_yyY0jE2GzcMEjR4=.02ee2ef9-c476-4c7c-9e4a-e021425c38bc@github.com> <88Jsd_RmZ8QTcODe6MsTx2j54J8Dk6dJX-ZUpVIdxVs=.abd71be6-dba9-4851-9f93-009858d0c175@github.com> Message-ID: On Fri, 12 Apr 2024 15:26:03 GMT, Andrew Haley wrote: >> Hello Sergey. >> W^X mode was initially forced by Apple to prevent writing to executable memory, as a security feature. >> This change just eliminates this security feature at all, doesn't it ? >> Basically: "want to write to Executable memory ? ok, here you go" > >> Hello Sergey. W^X mode was initially forced by Apple to prevent writing to executable memory, as a security feature. This change just eliminates this security feature at all, doesn't it ? Basically: "want to write to Executable memory ? ok, here you go" > > Yes @VladimirKempik, you are right. No, we should not do this. > > Instead, when we enter the VM we could track the current state of W^X and whenever we enter a block that needs to write into code space we would set W if needed. When we leave the VM or when we call back into Java we would set X, if needed. The cost of doing this would be small, but we'd have to find all the blocks that need to write into code space. This might be more effort than we want to make, though. > > So where would be need to make the transitions to W? At a guess, whenever we start assembling something, and in all of the methods in nativeInst_aarch64.?pp, and in class Patcher. And to X, in the call stub and a few other places. > > That would minimize the transitions exactly to the set of places we actually need. 
Thanks @theRealAph, @VladimirKempik > Instead, when we enter the VM we could track the current state of W^X and whenever we enter a block that needs to write into code space we would set W if needed. When we leave the VM or when we call back into Java we would set X, if needed. The cost of doing this would be small, but we'd have to find all the blocks that need to write into code space. This might be more effort than we want to make, though. It is the way in which it is implemented in the current code. Unfortunately, it is a hardly maintainable solution that suffers from issues like [1-5]. As I understand it, your concern is that the implementation doesn't prevent rogue from writing to the code cache with some unsafe api? 1. https://bugs.openjdk.org/browse/JDK-8302736 2. https://bugs.openjdk.org/browse/JDK-8327990 3. https://bugs.openjdk.org/browse/JDK-8327036 4. https://bugs.openjdk.org/browse/JDK-8304725 5. https://bugs.openjdk.org/browse/JDK-8307549 ------------- PR Comment: https://git.openjdk.org/jdk/pull/18762#issuecomment-2052164890 From vlivanov at openjdk.org Fri Apr 12 17:36:46 2024 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Fri, 12 Apr 2024 17:36:46 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v14] In-Reply-To: <3pJmRUuwQ_8y_uqDiaASd2YbpWOHv1MIWmhjTSL-Oj8=.677e4f4f-a0ea-4e35-aab8-d85ac42aa5ef@github.com> References: <3pJmRUuwQ_8y_uqDiaASd2YbpWOHv1MIWmhjTSL-Oj8=.677e4f4f-a0ea-4e35-aab8-d85ac42aa5ef@github.com> Message-ID: <9wD-hw0Hg5at4WcTn4wPYChMdWe1FQhsaf9X2cIN5TQ=.12eb31c3-f35b-4c81-b911-d328c1899093@github.com> On Fri, 12 Apr 2024 15:10:19 GMT, Andrew Haley wrote: >> This PR is a redesign of subtype checking. >> >> The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. >> >> So what's changed, so that the old design should be replaced?
>> >> Firstly, the computers of today aren't the computers of twenty years ago. It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. >> >> The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces). Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. >> >> Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch. >> >> However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions. >> >> The solution >> ------------ >> >> We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. >> >> We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bimap is used as a simple kind of Bloom Filter. 
To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. If the bit is set, we have to do a little arithmetic and then consult the hash table. >> >> It works like th... > > Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: > > JDK-8180450: secondary_super_cache does not scale well >> At some point in the hopefully-not-very-distant future I'd like to kill secondary_super_cache altogether, and then everyone will use the stubs. But I can move the call to the code that generates the stubs for now, if you like? > Done. Thank you. As the code looks now, I have some doubts the stubs can be reused outside C2. First of all, they are targeted at constant superclass case and don't yet support non-constant/reflective case. Moreover, from previous experiments I learned that it's far from trivial to rely on stubs across the whole VM. Subtype checks are performed by template interpreter and other stubs (e.g., arraycopy), so the stubs they rely on should be ready by the time they are generated. And there's not much benefit from not inlining the whole slow path code for secondary super lookup there. C1 may benefit from the stubs, but slow path there is already represented as a stub. So, I don't expect much benefit from reusing C2 stubs there as well. Overall, as of now, it looks premature to optimize for future use cases. As part of work supporting remaining cases (template interpreter, stubs, C1, non-constant case in C2) it'll become clear what's the best representation. So, focusing on C2 use case and optimizing for it looks reasonable. 
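The lookup described in the quoted PR text (a 64-slot hash table with linear probing, stored compressed, plus an occupancy bitmap used as a cheap pre-filter) can be sketched as follows. This is a minimal illustration under stated assumptions, not the HotSpot code: `SecondarySupers`, `home_slot`, `build` and `contains` are hypothetical names, the toy hash is simply the low six bits of an integer id, and the actual patch may differ in hashing, probing and layout.

```cpp
#include <array>
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a class' secondary supers (illustrative names only).
struct SecondarySupers {
  uint64_t bitmap = 0;           // bit i set <=> slot i of the 64-slot table is occupied
  std::vector<uint32_t> packed;  // occupied slots only, stored in slot order
};

// Toy hash, an assumption of this sketch: low 6 bits of the id.
inline unsigned home_slot(uint32_t id) { return id & 63u; }

// Build the compressed table: insert with linear probing, then drop empty slots.
SecondarySupers build(const std::vector<uint32_t>& ids) {
  assert(ids.size() <= 64 && "sketch handles at most 64 distinct ids");
  std::array<uint32_t, 64> table{};
  SecondarySupers s;
  for (uint32_t id : ids) {
    unsigned slot = home_slot(id);
    while (s.bitmap & (uint64_t{1} << slot)) {
      slot = (slot + 1) & 63;  // linear probing with wrap-around
    }
    table[slot] = id;
    s.bitmap |= uint64_t{1} << slot;
  }
  for (unsigned slot = 0; slot < 64; ++slot) {
    if (s.bitmap & (uint64_t{1} << slot)) {
      s.packed.push_back(table[slot]);
    }
  }
  return s;
}

// Lookup: a clear bit proves absence; a set bit requires consulting the table.
bool contains(const SecondarySupers& s, uint32_t id) {
  unsigned slot = home_slot(id);
  for (int probes = 0; probes < 64; ++probes) {
    const uint64_t bit = uint64_t{1} << slot;
    if ((s.bitmap & bit) == 0) {
      return false;  // empty slot reached: the id cannot be present
    }
    // The "little arithmetic": index in the compressed array is the number
    // of occupied slots below this one.
    const std::size_t index = std::bitset<64>(s.bitmap & (bit - 1)).count();
    if (s.packed[index] == id) {
      return true;
    }
    slot = (slot + 1) & 63;  // collision: try the next slot
  }
  return false;  // table completely full and id not found
}
```

In this sketch a miss usually costs one bit test, and the compressed array can still be scanned linearly by code that knows nothing about the bitmap, which is the compatibility property the PR description points out.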
------------- PR Comment: https://git.openjdk.org/jdk/pull/18309#issuecomment-2052189683 From sspitsyn at openjdk.org Fri Apr 12 17:49:41 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 12 Apr 2024 17:49:41 GMT Subject: RFR: 8329674: JvmtiEnvThreadState::reset_current_location function should use JvmtiHandshake [v3] In-Reply-To: References: Message-ID: <0hvY7WAjlccqf083w6WmAG4IU8giI0kSQJmRGvg7QjY=.70098480-43cf-45cf-9eee-a4a33776edeb@github.com> On Fri, 12 Apr 2024 12:00:55 GMT, Serguei Spitsyn wrote: >> The internal JVM TI JvmtiHandshake and JvmtiUnitedHandshakeClosure classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. This enhancement is to refactor the JVM TI internal functions JvmtiEnvThreadState::reset_current_location on the base of JvmtiHandshake and JvmtiUnitedHandshakeClosure classes. >> >> Testing: >> - Ran mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: remove unneded check for is_vthread_alive; do not call do_thread from do_vthread Thank you for review, Patricio! ------------- PR Comment: https://git.openjdk.org/jdk/pull/18630#issuecomment-2052207437 From kvn at openjdk.org Fri Apr 12 18:04:49 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 12 Apr 2024 18:04:49 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v14] In-Reply-To: <3pJmRUuwQ_8y_uqDiaASd2YbpWOHv1MIWmhjTSL-Oj8=.677e4f4f-a0ea-4e35-aab8-d85ac42aa5ef@github.com> References: <3pJmRUuwQ_8y_uqDiaASd2YbpWOHv1MIWmhjTSL-Oj8=.677e4f4f-a0ea-4e35-aab8-d85ac42aa5ef@github.com> Message-ID: On Fri, 12 Apr 2024 15:10:19 GMT, Andrew Haley wrote: >> This PR is a redesign of subtype checking. >> >> The implementation of subtype checking in the HotSpot JVM is now twenty years old. 
There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. >> >> So what's changed, so that the old design should be replaced? >> >> Firstly, the computers of today aren't the computers of twenty years ago. It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. >> >> The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces). Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. >> >> Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch. >> >> However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions. >> >> The solution >> ------------ >> >> We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. >> >> We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. 
As well as allowing the hash table to be decompressed, this bimap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. If the bit is set, we have to do a little arithmetic and then consult the hash table. >> >> It works like th... > > Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: > > JDK-8180450: secondary_super_cache does not scale well Few comments. src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 4432: > 4430: } > 4431: } > 4432: Can you move it up after `StubRoutines::_montgomerySquare` initialization? src/hotspot/cpu/x86/stubRoutines_x86.hpp line 41: > 39: // Windows have more code to save/restore registers > 40: _compiler_stubs_code_size = 20000 LP64_ONLY(+39000) WINDOWS_ONLY(+2000), > 41: _final_stubs_code_size = 10000 LP64_ONLY(+20000) WINDOWS_ONLY(+2000) ZGC_ONLY(+24000) Do we still need it after you moved code to compiler stubs section? src/hotspot/cpu/x86/vm_version_x86.cpp line 1786: > 1784: } > 1785: FLAG_SET_DEFAULT(UseSecondarySupersTable, false); > 1786: } No need this with other changes I suggest. src/hotspot/cpu/x86/vm_version_x86.hpp line 793: > 791: // x86_64 supports secondary supers table > 792: constexpr static bool supports_secondary_supers_table() { > 793: return LP64_ONLY(true) NOT_LP64(false); // not implemented on x86_32 I think it should be: return LP64_ONLY(supports_popcnt()) NOT_LP64(false); // not implemented on x86_32 and no need changes in `vm_version_x86.cpp`. 
The main check will be done in `arguments.cpp` src/hotspot/share/cds/filemap.hpp line 274: > 272: bool compressed_oops() const { return _compressed_oops; } > 273: bool compressed_class_pointers() const { return _compressed_class_ptrs; } > 274: bool use_secondary_supers_table() const { return _use_secondary_supers_table; } Do we really need this accessor which is used only in one place? ------------- PR Review: https://git.openjdk.org/jdk/pull/18309#pullrequestreview-1998112521 PR Review Comment: https://git.openjdk.org/jdk/pull/18309#discussion_r1562961409 PR Review Comment: https://git.openjdk.org/jdk/pull/18309#discussion_r1562966870 PR Review Comment: https://git.openjdk.org/jdk/pull/18309#discussion_r1562945973 PR Review Comment: https://git.openjdk.org/jdk/pull/18309#discussion_r1562945087 PR Review Comment: https://git.openjdk.org/jdk/pull/18309#discussion_r1562954617 From azafari at openjdk.org Fri Apr 12 19:10:09 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Fri, 12 Apr 2024 19:10:09 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v5] In-Reply-To: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: > `MEMFLAGS flag` is used to hold/show the type of the memory regions in NMT. Each call of NMT API requires a search through the list of memory regions. > The Hotspot code reserves/commits/uncommits memory regions and later calls explicitly NMT API with a specific memory type (e.g., `mtGC`, `mtJavaHeap`) for that region. Therefore, there are two search in the list of regions per reserve/commit/uncommit operations, one for the operation and another for setting the type of the region. 
> When the memory type is passed in during reserve/commit/uncommit operations, NMT can use it and avoid the extra search for setting the memory type. > > Tests: tiers1-5 passed on linux-x64, macosx-aarch64 and windows-x64 for debug and non-debug builds. Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: mtCode and mtMetaspace were missed from System Dump map ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18745/files - new: https://git.openjdk.org/jdk/pull/18745/files/9d66735f..a3940639 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18745&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18745&range=03-04 Stats: 11 lines in 3 files changed: 2 ins; 0 del; 9 mod Patch: https://git.openjdk.org/jdk/pull/18745.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18745/head:pull/18745 PR: https://git.openjdk.org/jdk/pull/18745 From dlong at openjdk.org Fri Apr 12 19:45:43 2024 From: dlong at openjdk.org (Dean Long) Date: Fri, 12 Apr 2024 19:45:43 GMT Subject: RFR: 8328934: Assert that ABS input and output are legal [v4] In-Reply-To: References: Message-ID: On Fri, 12 Apr 2024 10:05:55 GMT, Aleksey Shipilev wrote: >> This should protect us from future accidents around `abs` misuse. We have fixed a few separately. I plan to use this as the litmus test in update releases to detect missing backports for actual fixes. I am running more tests to see if we have any other sightings in current codebase, but this can be reviewed for sanity meanwhile. 
>> >> Additional testing: >> - [x] MacOS AArch64 server fastdebug build passes >> - [ ] Linux x86_64 server fastdebug, `all` >> - [ ] Linux x86_64 server fastdebug, 100K Fuzzer tests >> - [ ] Linux x86_64 server fastdebug, Maven CTW >> - [ ] Linux AArch64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request incrementally with two additional commits since the last revision: > > - More straightforward > - Richer error reporting src/hotspot/share/utilities/globalDefinitions.hpp line 1119: > 1117: T res = (x > 0) ? x : -x; > 1118: #ifdef ASSERT > 1119: if (res < 0) { I don't see how we could ever hit this. Checking input seems sufficient. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18751#discussion_r1563102819 From dcubed at openjdk.org Fri Apr 12 20:18:50 2024 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Fri, 12 Apr 2024 20:18:50 GMT Subject: RFR: 8329757: Crash with fatal error: DEBUG MESSAGE: Fast Unlock lock on stack [v4] In-Reply-To: References: Message-ID: On Thu, 11 Apr 2024 18:22:08 GMT, Axel Boldt-Christmas wrote: >> `Deoptimization::relock_objects` may reorder locks within in the `LockStack` which are added inside the same vframe. This can be handled by the interpreter but if OSR has occurred C2 may observe this invalid order in the `LockStack`, which breaks its assumption leading to incorrect behaviour. >> >> This patch functionally makes sure that the LockStack is always consistent by always inflating eliminated locks when `Deoptimization::relock_objects` is called. >> >> It also adds verification code which checks that the LockStack is consistent with the lock order observed inside the deoptimized vframes. >> >> Note: for leaf deoptimizations we have enough information to recreate a correct top of the LockStack with minimal inflations, however that should be a separate RFE. This only inflates eliminated locks so the worth of solving that may be minimal or even detrimental. >> >> Tests still running. 
Tier 1-7 done. > > Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: > > Change to ASSERT I'm sorry that I wasn't able to finish this review before your integration. I started, paused and resumed a couple of times trying to get my head wrapped around the deopt code. If the review comments seem disjointed, I apologize. I only have nits here. I've read the test program a couple of more times and I still don't grok how the submitter got to that test program's logic flow. src/hotspot/share/runtime/deoptimization.cpp line 1658: > 1656: } > 1657: } > 1658: if (LockingMode == LM_LIGHTWEIGHT) { I like that the core of the fix is just this line. src/hotspot/share/runtime/lockStack.cpp line 107: > 105: > 106: #ifdef ASSERT > 107: void LockStack::verify_consistent_lock_order(GrowableArray& lock_order, bool leaf_frame) const { Thanks for adding the verification code. I spun my head around a bit with `exec_mode != Deoptimization::Unpack_none` being passed as `leaf_frame`, but I think I got myself reoriented on the correct meaning of `if (!leaf_frame) {`. src/hotspot/share/runtime/lockStack.cpp line 127: > 125: > 126: if (VM_Version::supports_recursive_lightweight_locking()) { > 127: // With recursive looks there may be more of the same object nit typo: s/looks/locks/ test/hotspot/jtreg/compiler/escapeAnalysis/Test8329757.java line 28: > 26: * @bug 8329757 > 27: * @summary Deoptimization with nested eliminated and not eliminated locks > 28: * caused reordered lock stacks. This can be handled by the interpreter nit typo: s/caused/causing/ test/hotspot/jtreg/compiler/escapeAnalysis/Test8329757.java line 42: > 40: > 41: int a = 400; > 42: Double ddd; This variable isn't used, but would the bug repro if it isn't there? 
test/hotspot/jtreg/compiler/escapeAnalysis/Test8329757.java line 60: > 58: } while (--e > 0); > 59: } > 60: } Indents are bit off here: - L47 -> L60 should be indented by an additional four spaces to get an indent for the first `synchronized` - L50 -> L58 should be indented by an additional four spaces to get an indent for the for-loop I've never seen a "do switch" formatted like this before and the switch statement's case code needs some indenting and a `// fall-thru` comment after L55. The do-while is missing '{' and '}' and there should be some formatting of that switch statement. test/hotspot/jtreg/compiler/escapeAnalysis/Test8329757.java line 65: > 63: > 64: void n() { > 65: for (int j = 6; j < 274; ++j) q(); The `q();` should be on a line by itself and there should be '{' and '}' for the for-loop. test/hotspot/jtreg/compiler/escapeAnalysis/Test8329757.java line 70: > 68: public static void main(String[] args) { > 69: Test8329757 r = new Test8329757(); > 70: for (int i = 0; i < 1000; i++) r.n(); The `r.n();` should be on a line by itself and there should be '{' and '}' for the for-loop. 
------------- PR Review: https://git.openjdk.org/jdk/pull/18715#pullrequestreview-1998306513 PR Review Comment: https://git.openjdk.org/jdk/pull/18715#discussion_r1563123416 PR Review Comment: https://git.openjdk.org/jdk/pull/18715#discussion_r1563136180 PR Review Comment: https://git.openjdk.org/jdk/pull/18715#discussion_r1563098171 PR Review Comment: https://git.openjdk.org/jdk/pull/18715#discussion_r1563137396 PR Review Comment: https://git.openjdk.org/jdk/pull/18715#discussion_r1563121460 PR Review Comment: https://git.openjdk.org/jdk/pull/18715#discussion_r1563117744 PR Review Comment: https://git.openjdk.org/jdk/pull/18715#discussion_r1563119677 PR Review Comment: https://git.openjdk.org/jdk/pull/18715#discussion_r1563119187 From aph at openjdk.org Fri Apr 12 22:11:46 2024 From: aph at openjdk.org (Andrew Haley) Date: Fri, 12 Apr 2024 22:11:46 GMT Subject: RFR: 8328934: Assert that ABS input and output are legal [v3] In-Reply-To: <17gipHM6B5g7uDlXUwE1lpgXSPKbkOeZAPd60uiEzgY=.c29f487e-1871-493e-9555-faea3c995068@github.com> References: <17gipHM6B5g7uDlXUwE1lpgXSPKbkOeZAPd60uiEzgY=.c29f487e-1871-493e-9555-faea3c995068@github.com> Message-ID: On Fri, 12 Apr 2024 16:05:10 GMT, Aleksey Shipilev wrote: > > ``` > > T res = (x < 0 && x != std::numeric_limits<T>::min()) ? -x : x; > > ``` > > I mean, we catch the proper error in some tests: https://bugs.openjdk.org/browse/JDK-8330158 Do we really need to do this `x != std::numeric_limits<T>::min()` dance here? I think so. Several of us have worked on eliminating undefined behaviour in HotSpot and we've made good progress. I think it would be sad for new UB to be pushed now, especially in a case like this when it wouldn't be accidental. UB is just something we have to deal with, because C++.
:-( ------------- PR Comment: https://git.openjdk.org/jdk/pull/18751#issuecomment-2052623059 From dlong at openjdk.org Fri Apr 12 22:29:50 2024 From: dlong at openjdk.org (Dean Long) Date: Fri, 12 Apr 2024 22:29:50 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v14] In-Reply-To: <3pJmRUuwQ_8y_uqDiaASd2YbpWOHv1MIWmhjTSL-Oj8=.677e4f4f-a0ea-4e35-aab8-d85ac42aa5ef@github.com> References: <3pJmRUuwQ_8y_uqDiaASd2YbpWOHv1MIWmhjTSL-Oj8=.677e4f4f-a0ea-4e35-aab8-d85ac42aa5ef@github.com> Message-ID: On Fri, 12 Apr 2024 15:10:19 GMT, Andrew Haley wrote: >> This PR is a redesign of subtype checking. >> >> The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. >> >> So what's changed, so that the old design should be replaced? >> >> Firstly, the computers of today aren't the computers of twenty years ago. It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. >> >> The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces). Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. >> >> Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch. >> >> However, the computers of today can help us. 
The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions. >> >> The solution >> ------------ >> >> We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. >> >> We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bimap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. If the bit is set, we have to do a little arithmetic and then consult the hash table. >> >> It works like th... > > Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: > > JDK-8180450: secondary_super_cache does not scale well src/hotspot/cpu/x86/macroAssembler_x86.cpp line 4789: > 4787: // Get the first array index that can contain super_klass into r_array_index. > 4788: if (bit != 0) { > 4789: popcntq(r_array_index, r_array_index); What about hardware with supports_popcnt() == false? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18309#discussion_r1563286777 From vlivanov at openjdk.org Fri Apr 12 22:33:49 2024 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Fri, 12 Apr 2024 22:33:49 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v14] In-Reply-To: References: <3pJmRUuwQ_8y_uqDiaASd2YbpWOHv1MIWmhjTSL-Oj8=.677e4f4f-a0ea-4e35-aab8-d85ac42aa5ef@github.com> Message-ID: On Fri, 12 Apr 2024 22:27:17 GMT, Dean Long wrote: >> Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: >> >> JDK-8180450: secondary_super_cache does not scale well > > src/hotspot/cpu/x86/macroAssembler_x86.cpp line 4789: > >> 4787: // Get the first array index that can contain super_klass into r_array_index. >> 4788: if (bit != 0) { >> 4789: popcntq(r_array_index, r_array_index); > > What about hardware with supports_popcnt() == false? `VM_Version::supports_popcnt()` is a prerequisite for `UseSecondarySupersTable`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18309#discussion_r1563290039 From kvn at openjdk.org Fri Apr 12 22:48:01 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 12 Apr 2024 22:48:01 GMT Subject: RFR: 8329433: Reduce nmethod header size Message-ID: This is part of changes which try to reduce size of `nmethod` and `codeblob` data vs code in CodeCache. These changes reduced size of `nmethod` header from 288 to 240 bytes. From 304 to 256 in optimized VM: Statistics for 1282 bytecoded nmethods for C2: total in heap = 5560352 (100%) header = 389728 (7.009053%) vs Statistics for 1298 bytecoded nmethods for C2: total in heap = 5766040 (100%) header = 332288 (5.762846%) Several unneeded fields in `nmethod` and `CodeBlob` were removed. Some fields were changed from `int` to `int16_t` with added corresponding asserts to make sure their values are fit into 16 bits. I did additional cleanup after recent `CompiledMethod` removal. 
Tested tier1-7,stress,xcomp and performance testing. ------------- Commit messages: - 8329433: Reduce nmethod header size Changes: https://git.openjdk.org/jdk/pull/18768/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18768&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8329433 Stats: 386 lines in 15 files changed: 92 ins; 115 del; 179 mod Patch: https://git.openjdk.org/jdk/pull/18768.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18768/head:pull/18768 PR: https://git.openjdk.org/jdk/pull/18768 From kvn at openjdk.org Fri Apr 12 23:19:47 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 12 Apr 2024 23:19:47 GMT Subject: RFR: 8329433: Reduce nmethod header size In-Reply-To: References: Message-ID: On Fri, 12 Apr 2024 22:43:15 GMT, Vladimir Kozlov wrote: > This is part of changes which try to reduce size of `nmethod` and `codeblob` data vs code in CodeCache. > These changes reduced size of `nmethod` header from 288 to 240 bytes. From 304 to 256 in optimized VM: > > Statistics for 1282 bytecoded nmethods for C2: > total in heap = 5560352 (100%) > header = 389728 (7.009053%) > > vs > > Statistics for 1298 bytecoded nmethods for C2: > total in heap = 5766040 (100%) > header = 332288 (5.762846%) > > > Several unneeded fields in `nmethod` and `CodeBlob` were removed. Some fields were changed from `int` to `int16_t` with added corresponding asserts to make sure their values are fit into 16 bits. > > I did additional cleanup after recent `CompiledMethod` removal. > > Tested tier1-7,stress,xcomp and performance testing. src/hotspot/share/code/codeBlob.cpp line 78: > 76: #ifdef ASSERT > 77: void CodeBlob::verify_parameters() { > 78: assert(is_aligned(_size, oopSize), "unaligned size"); Asserts moved to only caller. 
src/hotspot/share/code/codeBlob.cpp line 92: > 90: CodeBlob::CodeBlob(const char* name, CodeBlobKind kind, int size, int header_size, int relocation_size, > 91: int content_offset, int code_offset, int frame_complete_offset, int data_offset, > 92: int frame_size, ImmutableOopMapSet* oop_maps, bool caller_must_gc_arguments) : This was used only by `CompiledMethod` class. src/hotspot/share/code/codeBlob.cpp line 129: > 127: } > 128: > 129: void CodeBlob::purge() { Arguments are not used. src/hotspot/share/code/codeBlob.hpp line 228: > 226: const ImmutableOopMap* oop_map_for_slot(int slot, address return_address) const; > 227: const ImmutableOopMap* oop_map_for_return_address(address return_address) const; > 228: virtual void preserve_callee_argument_oops(frame fr, const RegisterMap* reg_map, OopClosure* f) = 0; This method is not empty only in `nmethod`. Converted to normal method there and added check` cb->is_nmethod()` in call sites. src/hotspot/share/code/debugInfoRec.cpp line 251: > 249: void print() { > 250: tty->print_cr("Debug Data Chunks: %d, shared %d, non-SP's elided %d", > 251: chunks_queried, chunks_shared, chunks_elided); `chunks_reshared` is not used. 
src/hotspot/share/code/dependencies.cpp line 391: > 389: address end = nm->dependencies_end(); > 390: guarantee(end - beg >= (ptrdiff_t) size_in_bytes(), "bad sizing"); > 391: (void)memcpy(beg, content_bytes(), size_in_bytes()); To avoid false error reported by `GCC` only during product VM build for `linux-x64`: inlined from 'void Dependencies::copy_to(nmethod*)' at src/hotspot/share/code/dependencies.cpp:391:23: src/hotspot/cpu/x86/copy_x86.hpp:110:18: error: writing 8 bytes into a region of size 0 [-Werror=stringop-overflow=] 110 | case 8: to[7] = from[7]; | ~~~~~~^~~~~~~~~ src/hotspot/share/code/nmethod.cpp line 142: > 140: uint size_gt_32k; > 141: int size_max; > 142: Diagnostic code leftover from my work on [#18554](https://github.com/openjdk/jdk/pull/18554) src/hotspot/share/code/nmethod.cpp line 1486: > 1484: } > 1485: #endif > 1486: dependencies->copy_to(this); Missed to revert back this change. src/hotspot/share/code/nmethod.hpp line 208: > 206: address _osr_entry_point; // entry point for on stack replacement > 207: uint16_t _entry_offset; // entry point with class check > 208: uint16_t _verified_entry_offset; // entry point without class check Changed direct entry pointers (8 bytes each in 64-bit VM) to offsets (2 bytes) to `code_begin()`. src/hotspot/share/code/nmethod.hpp line 234: > 232: int _scopes_data_offset; > 233: int _handler_table_offset; > 234: int _nul_chk_table_offset; Changed data sections offsets base from `header_begin()` to `data_begin()`. Note, `ScopesPcs` and `ScopesData` data could be > 64Kb. src/hotspot/share/code/nmethod.hpp line 237: > 235: #if INCLUDE_JVMCI > 236: int _speculations_offset; > 237: int _jvmci_data_offset; "Wasted space" when Graal is not used. May be address in future changes. src/hotspot/share/code/nmethod.hpp line 274: > 272: // used by jvmti to track if an event has been posted for this nmethod. > 273: bool _load_reported; > 274: Converted to bit mask. 
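Editorial illustration (not part of the thread): the `nmethod.hpp` comment above describes replacing two full entry pointers with 16-bit offsets from `code_begin()`, and the PR description mentions asserts that check the narrowed values still fit. A minimal sketch of that pattern follows. `BlobSketch` and `checked_u16` are hypothetical names invented for this example; they are not HotSpot classes or functions, and the real patch may perform the check differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Narrow an offset to 16 bits, asserting in debug builds that it fits.
// (Hypothetical helper, not a HotSpot function.)
inline uint16_t checked_u16(std::ptrdiff_t offset) {
    assert(offset >= 0 && offset <= std::numeric_limits<uint16_t>::max());
    return static_cast<uint16_t>(offset);
}

// Hypothetical stand-in for a code blob header: rather than storing two
// full pointers, store two 16-bit offsets relative to code_begin.
struct BlobSketch {
    const unsigned char* code_begin;
    uint16_t entry_offset;           // entry point with class check
    uint16_t verified_entry_offset;  // entry point without class check

    BlobSketch(const unsigned char* begin,
               const unsigned char* entry,
               const unsigned char* verified)
        : code_begin(begin),
          entry_offset(checked_u16(entry - begin)),
          verified_entry_offset(checked_u16(verified - begin)) {}

    // The full pointers are recomputed on demand.
    const unsigned char* entry_point() const {
        return code_begin + entry_offset;
    }
    const unsigned char* verified_entry_point() const {
        return code_begin + verified_entry_offset;
    }
};
```

On a 64-bit build the two offsets occupy 4 bytes where two pointers would occupy 16, at the cost of one addition per lookup. The sketch only works if both entry points lie within 64 KB of the start of the code, which is what the assert checks.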
src/hotspot/share/code/nmethod.hpp line 724: > 722: ExceptionCache* exception_cache() const { return _exception_cache; } > 723: ExceptionCache* exception_cache_acquire() const; > 724: void set_exception_cache(ExceptionCache *ec) { _exception_cache = ec; } Not used. src/hotspot/share/code/nmethod.hpp line 800: > 798: > 799: // Deallocate this nmethod - called by the GC > 800: void purge(bool unregister_nmethod); Only `unregister_nmethod` is used by code. src/hotspot/share/memory/heap.hpp line 42: > 40: struct Header { > 41: uint32_t _length; // the length in segments > 42: bool _used; // Used bit The `Header` size is aligned up 8 bytes. With `size_t length;` the aligned size will be 16 bytes. `HeapBlock/Header` is used by CodeCache only which has 2Gb limit. The `length` counts number of segments which smallest size (`CodeCacheSegmentSize`) is 64 bytes. 32 bits is enough to cover it. src/hotspot/share/runtime/frame.cpp line 979: > 977: if (reg_map->include_argument_oops() && _cb->is_nmethod()) { > 978: // Only nmethod preserves outgoing arguments at call. > 979: _cb->as_nmethod()->preserve_callee_argument_oops(*this, reg_map, f); Only `nmethod` preserves arguments oops. 
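Editorial illustration (not part of the thread): the sizing argument in the `heap.hpp` comment above can be checked with a small sketch. `NarrowHeader` and `WideHeader` are hypothetical stand-ins, not the HotSpot `HeapBlock::Header`. The 8-byte alignment is imposed here with `alignas(8)` as an assumption, and `uint64_t` stands in for `size_t` on a 64-bit platform.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical header with a 32-bit segment count, padded to 8 bytes.
struct alignas(8) NarrowHeader {
    uint32_t length;  // length in segments
    bool     used;    // used bit
};

// The same header with a 64-bit length: the bool no longer fits in the
// first 8 bytes, so the aligned size doubles.
struct alignas(8) WideHeader {
    uint64_t length;
    bool     used;
};

static_assert(sizeof(NarrowHeader) == 8,  "4 + 1 bytes, padded to 8");
static_assert(sizeof(WideHeader)   == 16, "8 + 1 bytes, padded to 16");

// Largest byte range that a 32-bit segment count can describe for a
// given segment size.
inline uint64_t max_covered_bytes(uint64_t segment_size) {
    assert(segment_size > 0);
    return static_cast<uint64_t>(UINT32_MAX) * segment_size;
}
```

With the 64-byte minimum segment size mentioned in the comment, `max_covered_bytes(64)` comes to roughly 256 GB, well above the 2 GB CodeCache limit that the comment cites.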
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1563303217 PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1563302110 PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1563303989 PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1563307999 PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1563310748 PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1563311696 PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1563313116 PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1563334063 PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1563317344 PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1563320172 PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1563321008 PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1563321437 PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1563321900 PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1563322669 PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1563331968 PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1563332602 From dlong at openjdk.org Fri Apr 12 23:32:44 2024 From: dlong at openjdk.org (Dean Long) Date: Fri, 12 Apr 2024 23:32:44 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v14] In-Reply-To: <3pJmRUuwQ_8y_uqDiaASd2YbpWOHv1MIWmhjTSL-Oj8=.677e4f4f-a0ea-4e35-aab8-d85ac42aa5ef@github.com> References: <3pJmRUuwQ_8y_uqDiaASd2YbpWOHv1MIWmhjTSL-Oj8=.677e4f4f-a0ea-4e35-aab8-d85ac42aa5ef@github.com> Message-ID: On Fri, 12 Apr 2024 15:10:19 GMT, Andrew Haley wrote: >> This PR is a redesign of subtype checking. >> >> The implementation of subtype checking in the HotSpot JVM is now twenty years old. 
There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. >> >> So what's changed, so that the old design should be replaced? >> >> Firstly, the computers of today aren't the computers of twenty years ago. It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. >> >> The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces). Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. >> >> Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch. >> >> However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions. >> >> The solution >> ------------ >> >> We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. >> >> We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. 
As well as allowing the hash table to be decompressed, this bimap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. If the bit is set, we have to do a little arithmetic and then consult the hash table. >> >> It works like th... > > Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: > > JDK-8180450: secondary_super_cache does not scale well src/hotspot/share/oops/klass.cpp line 459: > 457: } > 458: > 459: uint8_t Klass::compute_home_slot(Klass* k, uintx bitmap) { This could use some comments, because it's not doing what I would expect, if this should match the asm code. Here, we never call population_count() on the full bitmap (shift 0), unlike the asm code. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18309#discussion_r1563347745 From dlong at openjdk.org Fri Apr 12 23:47:44 2024 From: dlong at openjdk.org (Dean Long) Date: Fri, 12 Apr 2024 23:47:44 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v14] In-Reply-To: <3pJmRUuwQ_8y_uqDiaASd2YbpWOHv1MIWmhjTSL-Oj8=.677e4f4f-a0ea-4e35-aab8-d85ac42aa5ef@github.com> References: <3pJmRUuwQ_8y_uqDiaASd2YbpWOHv1MIWmhjTSL-Oj8=.677e4f4f-a0ea-4e35-aab8-d85ac42aa5ef@github.com> Message-ID: <0HtjTUCSuwUp9JkbaZkuPkO0J42aNmgjRUbPdE2q8g0=.461fc855-3a11-4f2e-ad95-ef879d14d6ab@github.com> On Fri, 12 Apr 2024 15:10:19 GMT, Andrew Haley wrote: >> This PR is a redesign of subtype checking. >> >> The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. >> >> So what's changed, so that the old design should be replaced? >> >> Firstly, the computers of today aren't the computers of twenty years ago. 
It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. >> >> The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces). Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. >> >> Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch. >> >> However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions. >> >> The solution >> ------------ >> >> We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. >> >> We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bimap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. 
If the bit is set, we have to do a little arithmetic and then consult the hash table. >> >> It works like th... > > Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: > > JDK-8180450: secondary_super_cache does not scale well src/hotspot/share/opto/matcher.cpp line 2503: > 2501: break; > 2502: } > 2503: case Op_PartialSubtypeCheck: { This could use a comment. Changing the shape with BinaryNode is standard, but duplicating inputs seems to be something new. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18309#discussion_r1563364143 From vlivanov at openjdk.org Fri Apr 12 23:52:45 2024 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Fri, 12 Apr 2024 23:52:45 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v14] In-Reply-To: References: <3pJmRUuwQ_8y_uqDiaASd2YbpWOHv1MIWmhjTSL-Oj8=.677e4f4f-a0ea-4e35-aab8-d85ac42aa5ef@github.com> Message-ID: On Fri, 12 Apr 2024 23:30:29 GMT, Dean Long wrote: >> Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: >> >> JDK-8180450: secondary_super_cache does not scale well > > src/hotspot/share/oops/klass.cpp line 459: > >> 457: } >> 458: >> 459: uint8_t Klass::compute_home_slot(Klass* k, uintx bitmap) { > > This could use some comments, because it's not doing what I would expect, if this should match the asm code. Here, we never call population_count() on the full bitmap (shift 0), unlike the asm code. Neither do we do that in assembly code [1] [2]. `Klass::compute_home_slot` is equivalent to what happens in MacroAssembler (modulo some micro-optimizations). [1] macroAssembler_x86.cpp: ``` ... int shift_count = Klass::SECONDARY_SUPERS_TABLE_MASK - bit; if (shift_count != 0) { salq(r_array_index, shift_count); } else { testq(r_array_index, r_array_index); } ... 
if (bit != 0) { popcntq(r_array_index, r_array_index); [2] macroAssembler_aarch64.cpp: if (bit != 0) { shld(vtemp, vtemp, Klass::SECONDARY_SUPERS_TABLE_MASK - bit); cnt(vtemp, T8B, vtemp); addv(vtemp, T8B, vtemp); fmovd(r_array_index, vtemp); ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18309#discussion_r1563371575 From vlivanov at openjdk.org Fri Apr 12 23:52:45 2024 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Fri, 12 Apr 2024 23:52:45 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v14] In-Reply-To: References: <3pJmRUuwQ_8y_uqDiaASd2YbpWOHv1MIWmhjTSL-Oj8=.677e4f4f-a0ea-4e35-aab8-d85ac42aa5ef@github.com> Message-ID: <7VH3UYl5JtIZInIXIPFrcrTkkKR0igTgelOoBndGgio=.da58ef3e-5494-45c3-aa2a-856d00a705d9@github.com> On Fri, 12 Apr 2024 23:49:16 GMT, Vladimir Ivanov wrote: >> src/hotspot/share/oops/klass.cpp line 459: >> >>> 457: } >>> 458: >>> 459: uint8_t Klass::compute_home_slot(Klass* k, uintx bitmap) { >> >> This could use some comments, because it's not doing what I would expect, if this should match the asm code. Here, we never call population_count() on the full bitmap (shift 0), unlike the asm code. > > Neither do we do that in assembly code [1] [2]. `Klass::compute_home_slot` is equivalent to what happens in MacroAssembler (modulo some micro-optimizations). > > > [1] macroAssembler_x86.cpp: > ``` > ... > int shift_count = Klass::SECONDARY_SUPERS_TABLE_MASK - bit; > if (shift_count != 0) { > salq(r_array_index, shift_count); > } else { > testq(r_array_index, r_array_index); > } > ... > if (bit != 0) { > popcntq(r_array_index, r_array_index); > > > [2] macroAssembler_aarch64.cpp: > > if (bit != 0) { > shld(vtemp, vtemp, Klass::SECONDARY_SUPERS_TABLE_MASK - bit); > cnt(vtemp, T8B, vtemp); > addv(vtemp, T8B, vtemp); > fmovd(r_array_index, vtemp); Are you asking about `if (hash > 0) {` check? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18309#discussion_r1563372942 From vlivanov at openjdk.org Sat Apr 13 00:01:44 2024 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Sat, 13 Apr 2024 00:01:44 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v14] In-Reply-To: <0HtjTUCSuwUp9JkbaZkuPkO0J42aNmgjRUbPdE2q8g0=.461fc855-3a11-4f2e-ad95-ef879d14d6ab@github.com> References: <3pJmRUuwQ_8y_uqDiaASd2YbpWOHv1MIWmhjTSL-Oj8=.677e4f4f-a0ea-4e35-aab8-d85ac42aa5ef@github.com> <0HtjTUCSuwUp9JkbaZkuPkO0J42aNmgjRUbPdE2q8g0=.461fc855-3a11-4f2e-ad95-ef879d14d6ab@github.com> Message-ID: On Fri, 12 Apr 2024 23:44:38 GMT, Dean Long wrote: >> Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: >> >> JDK-8180450: secondary_super_cache does not scale well > > src/hotspot/share/opto/matcher.cpp line 2503: > >> 2501: break; >> 2502: } >> 2503: case Op_PartialSubtypeCheck: { > > This could use a comment. Changing the shape with BinaryNode is standard, but duplicating inputs seems to be something new. Good point. Does it look better? if (UseSecondarySupersTable && n->in(2)->is_Con()) { // PartialSubtypeCheck uses both constant and register operands for superclass input. 
n->set_req(2, new BinaryNode(n->in(2), n->in(2))); break; ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18309#discussion_r1563383350 From dlong at openjdk.org Sat Apr 13 00:19:43 2024 From: dlong at openjdk.org (Dean Long) Date: Sat, 13 Apr 2024 00:19:43 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v14] In-Reply-To: <7VH3UYl5JtIZInIXIPFrcrTkkKR0igTgelOoBndGgio=.da58ef3e-5494-45c3-aa2a-856d00a705d9@github.com> References: <3pJmRUuwQ_8y_uqDiaASd2YbpWOHv1MIWmhjTSL-Oj8=.677e4f4f-a0ea-4e35-aab8-d85ac42aa5ef@github.com> <7VH3UYl5JtIZInIXIPFrcrTkkKR0igTgelOoBndGgio=.da58ef3e-5494-45c3-aa2a-856d00a705d9@github.com> Message-ID: On Fri, 12 Apr 2024 23:50:26 GMT, Vladimir Ivanov wrote: >> Neither do we do that in assembly code [1] [2]. `Klass::compute_home_slot` is equivalent to what happens in MacroAssembler (modulo some micro-optimizations). >> >> >> [1] macroAssembler_x86.cpp: >> ``` >> ... >> int shift_count = Klass::SECONDARY_SUPERS_TABLE_MASK - bit; >> if (shift_count != 0) { >> salq(r_array_index, shift_count); >> } else { >> testq(r_array_index, r_array_index); >> } >> ... >> if (bit != 0) { >> popcntq(r_array_index, r_array_index); >> >> >> [2] macroAssembler_aarch64.cpp: >> >> if (bit != 0) { >> shld(vtemp, vtemp, Klass::SECONDARY_SUPERS_TABLE_MASK - bit); >> cnt(vtemp, T8B, vtemp); >> addv(vtemp, T8B, vtemp); >> fmovd(r_array_index, vtemp); > > Are you asking about `if (hash > 0) {` check? No, in the asm code, when bit == SECONDARY_SUPERS_TABLE_MASK (MSB), we do popcnt on the full bitmap. 
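Editorial illustration (not part of the thread): to make the two counts under discussion concrete, here is a minimal, self-contained sketch. It is not the HotSpot code. It assumes a 64-bit occupancy bitmap with the highest slot at bit index 63, and the names `popcount64`, `home_slot`, and `asm_style_count` are hypothetical. `home_slot` counts the occupied slots strictly below `bit`. `asm_style_count` mirrors the shift-then-popcount shape of the quoted x86 excerpt, so it also counts `bit` itself, and for `bit == 63` the shift is zero and the whole bitmap is counted.

```cpp
#include <cassert>
#include <cstdint>

// Portable population count (Kernighan's method); stands in for popcntq.
inline unsigned popcount64(uint64_t v) {
    unsigned n = 0;
    while (v != 0) {
        v &= v - 1;  // clear the lowest set bit
        ++n;
    }
    return n;
}

// Assumed for this sketch: 64 hash slots, so the highest bit index is 63.
constexpr unsigned kTableMask = 63;

// Index in the packed array of the entry whose hash slot is `bit`:
// the number of occupied slots strictly below `bit`.
inline unsigned home_slot(uint64_t bitmap, unsigned bit) {
    assert(bit <= kTableMask);
    if (bit == 0) {
        return 0;  // nothing below slot 0; also avoids a shift by 64
    }
    return popcount64(bitmap << (64 - bit));  // keeps bits [0, bit) only
}

// Shift `bit` up to the MSB and count everything that remains.
// This keeps bits [0, bit], so `bit` itself is included in the count.
inline unsigned asm_style_count(uint64_t bitmap, unsigned bit) {
    assert(bit <= kTableMask);
    return popcount64(bitmap << (kTableMask - bit));
}
```

Whenever the occupancy bit for `bit` is set, `asm_style_count(bitmap, bit)` equals `home_slot(bitmap, bit) + 1`. The two forms therefore differ by a constant rather than by a special case for the most significant bit.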
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18309#discussion_r1563410918 From vlivanov at openjdk.org Sat Apr 13 00:53:43 2024 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Sat, 13 Apr 2024 00:53:43 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v14] In-Reply-To: References: <3pJmRUuwQ_8y_uqDiaASd2YbpWOHv1MIWmhjTSL-Oj8=.677e4f4f-a0ea-4e35-aab8-d85ac42aa5ef@github.com> <7VH3UYl5JtIZInIXIPFrcrTkkKR0igTgelOoBndGgio=.da58ef3e-5494-45c3-aa2a-856d00a705d9@github.com> Message-ID: On Sat, 13 Apr 2024 00:17:13 GMT, Dean Long wrote: >> Are you asking about `if (hash > 0) {` check? > > No, in the asm code, when bit == SECONDARY_SUPERS_TABLE_MASK (MSB), we do popcnt on the full bitmap. That's one of the micro-optimizations performed in the assembly code. What's actually being computed is `home_index + 1`, but `r_array_base` is one word (-1) off, so the first probe lands at `secondary_supers[home_index]`. It saves an instruction, because you don't need to increment `r_array_index`. The `r_array_base` adjustment [2] effectively increments the index as well. There are some comments on the assembly side to make it clear [1]. I was initially confused by all those small tweaks happening in the assembly as well, so I added more comments. Maybe we need more. `Klass::compute_home_slot()` is the canonical implementation for computing the home slot. Comments there about what happens on the assembly side would look irrelevant to me. [1] // NB! r_array_index is off by 1. It is compensated by keeping r_array_base off by 1 word. [2] // And adjust the array base to point to the data. // NB! Effectively increments current slot index by 1.
assert(Array::base_offset_in_bytes() == wordSize, ""); add(r_array_base, r_array_base, Array::base_offset_in_bytes()); ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18309#discussion_r1563441962 From syan at openjdk.org Sat Apr 13 03:01:46 2024 From: syan at openjdk.org (SendaoYan) Date: Sat, 13 Apr 2024 03:01:46 GMT Subject: RFR: 8327946: containers/docker/TestJFREvents.java fails when host kernel config vm.swappiness=0 after JDK-8325139 [v5] In-Reply-To: References: <5Q0X-rxAg9WKCnK-Qluu5hvyffsGwVgGJGRoA8XlBGs=.923c1bf8-e008-4af9-9929-6e5c1f2d5271@github.com> <5CfWlPKTQYn_C-qfExFXR94T9JT3jO78qvXZZm3vFYk=.c9f2152a-a45d-4b20-bb2c-eef314ed53eb@github.com> Message-ID: On Fri, 12 Apr 2024 16:30:31 GMT, Severin Gehwolf wrote: > If you merge latest master, GHA should be clean too. The GHA shows java/lang/String/StringRepeat.java#id1 fails on linux x86. It's a testcase bug which has been fixed in [8328524](https://github.com/openjdk/jdk/pull/18380). This failure is unrelated to this PR. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18225#issuecomment-2053097931 From stuefe at openjdk.org Sat Apr 13 05:41:42 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Sat, 13 Apr 2024 05:41:42 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v5] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: On Fri, 12 Apr 2024 19:10:09 GMT, Afshin Zafari wrote: >> `MEMFLAGS flag` is used to hold/show the type of the memory regions in NMT. Each call of NMT API requires a search through the list of memory regions. >> The Hotspot code reserves/commits/uncommits memory regions and later calls explicitly NMT API with a specific memory type (e.g., `mtGC`, `mtJavaHeap`) for that region. 
Therefore, there are two search in the list of regions per reserve/commit/uncommit operations, one for the operation and another for setting the type of the region. >> When the memory type is passed in during reserve/commit/uncommit operations, NMT can use it and avoid the extra search for setting the memory type. >> >> Tests: tiers1-5 passed on linux-x64, macosx-aarch64 and windows-x64 for debug and non-debug builds. > > Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: > > mtCode and mtMetaspace were missed from System Dump map Just a thought: one (manual) test I would do would be that several JVMs run with the same conditions (I would do at least one with -Xmx=Xms and AlwaysPreTouch) accumulate the same NMT numbers, current, and peak. Just to make sure we use the same flags before and after. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18745#issuecomment-2053511560 From sspitsyn at openjdk.org Sat Apr 13 06:57:05 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Sat, 13 Apr 2024 06:57:05 GMT Subject: RFR: 8329674: JvmtiEnvThreadState::reset_current_location function should use JvmtiHandshake [v4] In-Reply-To: References: Message-ID: > The internal JVM TI JvmtiHandshake and JvmtiUnitedHandshakeClosure classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. This enhancement is to refactor the JVM TI internal functions JvmtiEnvThreadState::reset_current_location on the base of JvmtiHandshake and JvmtiUnitedHandshakeClosure classes. 
> > Testing: > - Ran mach5 tiers 1-6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: add an assert for missing JvmtiVTMSTransitionDisabler ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18630/files - new: https://git.openjdk.org/jdk/pull/18630/files/86775376..68687dca Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18630&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18630&range=02-03 Stats: 4 lines in 1 file changed: 3 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18630.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18630/head:pull/18630 PR: https://git.openjdk.org/jdk/pull/18630 From dlong at openjdk.org Sat Apr 13 09:14:46 2024 From: dlong at openjdk.org (Dean Long) Date: Sat, 13 Apr 2024 09:14:46 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v14] In-Reply-To: References: <3pJmRUuwQ_8y_uqDiaASd2YbpWOHv1MIWmhjTSL-Oj8=.677e4f4f-a0ea-4e35-aab8-d85ac42aa5ef@github.com> <7VH3UYl5JtIZInIXIPFrcrTkkKR0igTgelOoBndGgio=.da58ef3e-5494-45c3-aa2a-856d00a705d9@github.com> Message-ID: On Sat, 13 Apr 2024 00:50:27 GMT, Vladimir Ivanov wrote: >> No, in the asm code, when bit == SECONDARY_SUPERS_TABLE_MASK (MSB), we do popcnt on the full bitmap. > > That's one of micro-optimizations performed in assembly code. > What's actually being computed is `home_index + 1`, but `r_array_base` is one word (-1) off, so first probe lands at `secondary_supers[home_index]`. It saves an instruction, because you don't need to increment `r_array_index`. `r_array_base` adjustment [2] effectively increments the index as well. > > There are some comments on assembly side to make it clear [1]. I was initially confused by all those small tweaks happening in assembly as well, so added more comments. Maybe we need more. > > `Klass::compute_home_slot()` is a canonical implementation to compute home slot. 
Comments there about what happens on assembly side would look to me irrelevant. > > [1] > > // NB! r_array_index is off by 1. It is compensated by keeping r_array_base off by 1 word. > > > [2] > > // And adjust the array base to point to the data. > // NB! Effectively increments current slot index by 1. > assert(Array::base_offset_in_bytes() == wordSize, ""); > add(r_array_base, r_array_base, Array::base_offset_in_bytes()); OK, I can see now that they both compute the same thing because the "home" bit must be 1. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18309#discussion_r1563889570 From sspitsyn at openjdk.org Sat Apr 13 09:24:45 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Sat, 13 Apr 2024 09:24:45 GMT Subject: Integrated: 8329674: JvmtiEnvThreadState::reset_current_location function should use JvmtiHandshake In-Reply-To: References: Message-ID: On Thu, 4 Apr 2024 15:28:41 GMT, Serguei Spitsyn wrote: > The internal JVM TI JvmtiHandshake and JvmtiUnitedHandshakeClosure classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. This enhancement is to refactor the JVM TI internal functions JvmtiEnvThreadState::reset_current_location on the base of JvmtiHandshake and JvmtiUnitedHandshakeClosure classes. > > Testing: > - Ran mach5 tiers 1-6 This pull request has now been integrated. 
Changeset: c1c99a66 Author: Serguei Spitsyn URL: https://git.openjdk.org/jdk/commit/c1c99a669bb7f9928086db6a4ecfc90c410ffbb0 Stats: 101 lines in 3 files changed: 28 ins; 67 del; 6 mod 8329674: JvmtiEnvThreadState::reset_current_location function should use JvmtiHandshake Reviewed-by: lmesnik, pchilanomate ------------- PR: https://git.openjdk.org/jdk/pull/18630 From aph-open at littlepinkcloud.com Sat Apr 13 12:35:44 2024 From: aph-open at littlepinkcloud.com (Andrew Haley) Date: Sat, 13 Apr 2024 13:35:44 +0100 Subject: RFR: 8330171: Lazy W^X switch implementation In-Reply-To: References: <9eymaXovxUNFdkAkzojFQP5trwl_yyY0jE2GzcMEjR4=.02ee2ef9-c476-4c7c-9e4a-e021425c38bc@github.com> <88Jsd_RmZ8QTcODe6MsTx2j54J8Dk6dJX-ZUpVIdxVs=.abd71be6-dba9-4851-9f93-009858d0c175@github.com> Message-ID: On 4/12/24 18:18, Sergey Nazarkin wrote: > It is the way in which it is implemented in the current code. No, it's not. That's not what we do at all. We don't set W^X when we need it: instead, we set it at certain times in the hope that it'll be needed. I'm suggesting we should set W^X *exactly* where we need it, such as at patching methods. Not at VM entry. Get rid of all the assert_wx_state. If deoptimization needs WXWrite, then it should set it, not hope for someone else to have done it. From stuefe at openjdk.org Sat Apr 13 18:18:41 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Sat, 13 Apr 2024 18:18:41 GMT Subject: RFR: 8330171: Lazy W^X switch implementation In-Reply-To: <9eymaXovxUNFdkAkzojFQP5trwl_yyY0jE2GzcMEjR4=.02ee2ef9-c476-4c7c-9e4a-e021425c38bc@github.com> References: <9eymaXovxUNFdkAkzojFQP5trwl_yyY0jE2GzcMEjR4=.02ee2ef9-c476-4c7c-9e4a-e021425c38bc@github.com> Message-ID: On Fri, 12 Apr 2024 14:40:05 GMT, Sergey Nazarkin wrote: > An alternative for preemptively switching the W^X thread mode on macOS with an AArch64 CPU. This implementation triggers the switch in response to the SIGBUS signal if the *si_addr* belongs to the CodeCache area.
With this approach, it is now feasible to eliminate all WX guards and avoid potentially costly operations. However, no significant improvement or degradation in performance has been observed. Additionally, considering the issue with AsyncGetCallTrace, the patched JVM has been successfully operated with [asgct_bottom](https://github.com/parttimenerd/asgct_bottom) and [async-profiler](https://github.com/async-profiler/async-profiler). > > Additional testing: > - [x] MacOS AArch64 server fastdebug *gtest* > - [ ] MacOS AArch64 server fastdebug *jtreg:hotspot:tier4* > - [ ] Benchmarking > > @apangin and @parttimenerd could you please check the patch on your scenarios? I have one question, and I'm sorry if it has been answered before. How expensive is changing the mode? Is it just a status variable in user-space pthread lib? Or does it need a system call? In other words, how fine-grained can we get without incurring too high a cost? ------------- PR Comment: https://git.openjdk.org/jdk/pull/18762#issuecomment-2053721713 From aph at openjdk.org Sat Apr 13 19:09:46 2024 From: aph at openjdk.org (Andrew Haley) Date: Sat, 13 Apr 2024 19:09:46 GMT Subject: RFR: 8330171: Lazy W^X switch implementation In-Reply-To: References: <9eymaXovxUNFdkAkzojFQP5trwl_yyY0jE2GzcMEjR4=.02ee2ef9-c476-4c7c-9e4a-e021425c38bc@github.com> Message-ID: <64tQxEFk-VGE424wdgtQgzEQnj9R5POxrLCKyGEpEGw=.7f40b04f-fdb8-4485-b530-1841384dbc8a@github.com> On Sat, 13 Apr 2024 18:16:21 GMT, Thomas Stuefe wrote: > I have one question, and I'm sorry if it has been answered before. How expensive is changing the mode? Is it just a status variable in user-space pthread lib? Or does it need a system call? > > In other words, how fine-grained can we get without incurring too high a cost? It's expensive. We've seen it cause significant slowdowns in Java->VM transitions.
------------- PR Comment: https://git.openjdk.org/jdk/pull/18762#issuecomment-2053734174 From kim.barrett at oracle.com Sat Apr 13 19:28:19 2024 From: kim.barrett at oracle.com (Kim Barrett) Date: Sat, 13 Apr 2024 19:28:19 +0000 Subject: RFR: 8330171: Lazy W^X switch implementation In-Reply-To: References: <9eymaXovxUNFdkAkzojFQP5trwl_yyY0jE2GzcMEjR4=.02ee2ef9-c476-4c7c-9e4a-e021425c38bc@github.com> <88Jsd_RmZ8QTcODe6MsTx2j54J8Dk6dJX-ZUpVIdxVs=.abd71be6-dba9-4851-9f93-009858d0c175@github.com> Message-ID: <43C19AD1-AB0C-46C4-898F-1AEEA2A50A91@oracle.com> > On Apr 13, 2024, at 8:35 AM, Andrew Haley wrote: > > On 4/12/24 18:18, Sergey Nazarkin wrote: >> It is the way in which it is implemented in the current code. > > No, it's not. That's not what we do at all. > > We don't set W^X when we need it: instead, we set it at certain times in the hope > that it'll be needed. I'm suggesting we should set W^X *exactly* where we need it, > such as at patching methods. Not at VM entry. > > Get rid of all the assert_wx_state. If deoptimization needs WXWrite, then it should > set it, not hope for someone else to have done it. There was a recent internal-to-Oracle discussion about W^X, which led to something that I think is along the lines of what @aph is suggesting, including a prototype. I will poke some of the folks who were more deeply involved in that discussion (I was just casually following along, and am not knowledgeable in this area), but it's a weekend now, so it might take them some time to respond.
From dnsimon at openjdk.org Sat Apr 13 21:40:43 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Sat, 13 Apr 2024 21:40:43 GMT Subject: RFR: 8317368: [JVMCI] SIGSEGV in JVMCIEnv::initialize_installed_code on libgraal [v3] In-Reply-To: References: Message-ID: On Tue, 6 Feb 2024 16:25:04 GMT, Tom Rodriguez wrote: >> This fixes some lurking issues with JVMCI and nmethod related to both BarrierSetNMethod and the garbage collection of nmethods. In particular the stack walking in c2v_iterateFrames visits many frames and needs a KeepStackGCProcessedMark for safety. Additionally, JVMCI interacts with nmethods in complex ways and needs some sort of strong root during these interactions. A new JavaThread field has been added that mirrors the way JVMTI keeps nmethods alive. > > Tom Rodriguez has updated the pull request incrementally with one additional commit since the last revision: > > Comment updates Can this be merged soon Tom? ------------- PR Comment: https://git.openjdk.org/jdk/pull/17714#issuecomment-2053767374 From john.r.rose at oracle.com Sun Apr 14 07:34:39 2024 From: john.r.rose at oracle.com (John Rose) Date: Sun, 14 Apr 2024 00:34:39 -0700 Subject: RFR: 8329728: Read arbitrarily long lines in ClassListParser In-Reply-To: References: Message-ID: <189466C2-6E4E-49F1-8B6A-914F13554A51@oracle.com> I have been developing something on the back burner for robust text input. I got tired of looking at fixed-sized buffers for the compiler oracle and CDS, and now that CDS is scaling up in its demands for complex configuration files, I agree we need something better. Here's my draft work: https://github.com/openjdk/jdk/pull/18773 Please consider adopting it as a foundation for better CDS configuration reading. It's old work, based to the repo as of about a year ago. But it should rebase easily. Thanks, - John P.S.
As a next layer I'd like to make something like sscanf, except that it operates on the inputStream, like print_cr does formatted output on outputStream, but in the reverse direction. But for starters we need a solid foundation for flexible line-oriented input. On 7 Apr 2024, at 21:56, Ioi Lam wrote: > Today the `ClassListParser` has a hard-coded limit of 4096 chars for each line in the CDS class list file. However, it's possible for a line to be much longer than that (64KB for the class name, plus extra information that can include path names, IDs, etc). > > I wrote a utility class `LineReader` that automatically allocates a buffer before calling `fgets()`. Hopefully this can be useful for other cases where we call `fgets()` with a fixed buffer size. > > ------------- > > Commit messages: > - 8329728: Read arbitrarily long lines in ClassListParser > > Changes: https://git.openjdk.org/jdk/pull/18669/files > Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18669&range=00 > Issue: https://bugs.openjdk.org/browse/JDK-8329728 > Stats: 235 lines in 5 files changed: 193 ins; 18 del; 24 mod > Patch: https://git.openjdk.org/jdk/pull/18669.diff > Fetch: git fetch https://git.openjdk.org/jdk.git pull/18669/head:pull/18669 > > PR: https://git.openjdk.org/jdk/pull/18669 From kvn at openjdk.org Mon Apr 15 00:49:11 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 15 Apr 2024 00:49:11 GMT Subject: RFR: 8329433: Reduce nmethod header size [v2] In-Reply-To: References: Message-ID: <5nXtplnwZ4JwdZyDeHo429kdcWRP3IfzZcxBrQEwFuU=.56eabc9c-c60f-4453-bb33-2bbb85b4fc7f@github.com> > This is part of changes which try to reduce size of `nmethod` and `codeblob` data vs code in CodeCache. > These changes reduced size of `nmethod` header from 288 to 240 bytes.
From 304 to 256 in optimized VM: > > Statistics for 1282 bytecoded nmethods for C2: > total in heap = 5560352 (100%) > header = 389728 (7.009053%) > > vs > > Statistics for 1298 bytecoded nmethods for C2: > total in heap = 5766040 (100%) > header = 332288 (5.762846%) > > > Several unneeded fields in `nmethod` and `CodeBlob` were removed. Some fields were changed from `int` to `int16_t` with added corresponding asserts to make sure their values are fit into 16 bits. > > I did additional cleanup after recent `CompiledMethod` removal. > > Tested tier1-7,stress,xcomp and performance testing. Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: Moved some fields initialization into init_defaults() ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18768/files - new: https://git.openjdk.org/jdk/pull/18768/files/a0c46b86..488f1b92 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18768&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18768&range=00-01 Stats: 31 lines in 1 file changed: 10 ins; 17 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/18768.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18768/head:pull/18768 PR: https://git.openjdk.org/jdk/pull/18768 From fyang at openjdk.org Mon Apr 15 01:23:49 2024 From: fyang at openjdk.org (Fei Yang) Date: Mon, 15 Apr 2024 01:23:49 GMT Subject: RFR: 8330161: RISC-V: Don't use C for Labels jumps In-Reply-To: References: Message-ID: On Fri, 12 Apr 2024 13:03:27 GMT, Robbin Ehn wrote: > Hi please consider! > > jal do not have C switch, we always use the full length instructions. > But jalr have, in case of an unbound Label which is to far for jal we can emit c_jalr. > When we bind the Label we can't patch the c_jalr. > > Sanity tested. src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 926: > 924: assert_different_registers(Rd, temp); \ > 925: /* We can't patch C, i.e. 
if Label wasn't bound we need to patch this jump.*/ \ > 926: IncompressibleRegion ir(this); \ Seems that we have rare uses of `jal` with unbound label. Here is what I see it. For unbound label, we set `dest` to pc() at [1]. So the `distance` calculated at [2] will always be zero for this case. After that we emit a simple `Assembler::jal(Rd, 0);` at [3] which is there for patching. So no `jalr` involved. I think this means that possible future callers of `jal` with unbound label have to ensure that the target is not far. [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp#L947 [2] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp#L913 [3] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp#L915 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18761#discussion_r1565034635 From ioi.lam at oracle.com Mon Apr 15 03:19:06 2024 From: ioi.lam at oracle.com (ioi.lam at oracle.com) Date: Sun, 14 Apr 2024 20:19:06 -0700 Subject: RFR: 8329728: Read arbitrarily long lines in ClassListParser In-Reply-To: <189466C2-6E4E-49F1-8B6A-914F13554A51@oracle.com> References: <189466C2-6E4E-49F1-8B6A-914F13554A51@oracle.com> Message-ID: <5da8299e-5031-46f8-9fea-99b4609409ac@oracle.com> Hi John, Thanks for posting the code. Let me try to rebase your code onto mainline, and then apply my ClassListParser changes on top of that. I will probably open a new PR that combines this PR (#18669) and yours. Let see how that looks and we can decide how to proceed. Thanks - Ioi On 4/14/24 12:34 AM, John Rose wrote: > I have been developing something on the back burner for robust > text input. I got tired of looking at fixed-sized buffers for > the compiler oracle and CDS, and now that CDS is scaling up > in its demands for complex configuration files, I agree we > need something better. 
> > Here's my draft work: > https://github.com/openjdk/jdk/pull/18773 > > Please consider adopting it as a foundation for better CDS > configuration reading. It's old work, based to the repo > as of about a year ago. But it should rebase easily. > > Thanks, > - John > > P.S. As a next layer I'd like to make something like sscanf, > except that it operates on the inputStream, like print_cr > does formatted output on outputStream, but in the reverse > direction. But for starters we need a solid foundation > for flexible line-oriented input. > > > On 7 Apr 2024, at 21:56, Ioi Lam wrote: > >> Today the `ClassListParser` has a hard-coded limit of 4096 chars for each line in the CDS class list file. However, it's possible for a line to be much longer than that (64KB for the class name, plus extra information that can include path names, IDs, etc). >> >> I wrote a utility class `LineReader` that automatically allocates a buffer before calling `fgets()`. Hopefully this can be useful for other cases where we call `fgets()` with a fixed buffer size. >> >> ------------- >> >> Commit messages: >> - 8329728: Read arbitrarily long lines in ClassListParser >> >> Changes: https://git.openjdk.org/jdk/pull/18669/files >> Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18669&range=00 >> Issue: https://bugs.openjdk.org/browse/JDK-8329728 >> Stats: 235 lines in 5 files changed: 193 ins; 18 del; 24 mod >> Patch: https://git.openjdk.org/jdk/pull/18669.diff >> Fetch: git fetch https://git.openjdk.org/jdk.git pull/18669/head:pull/18669 >> >> PR: https://git.openjdk.org/jdk/pull/18669 From kvn at openjdk.org Mon Apr 15 03:24:07 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 15 Apr 2024 03:24:07 GMT Subject: RFR: 8329433: Reduce nmethod header size [v3] In-Reply-To: References: Message-ID: > This is part of changes which try to reduce size of `nmethod` and `codeblob` data vs code in CodeCache. > These changes reduced size of `nmethod` header from 288 to 240 bytes.
From 304 to 256 in optimized VM: > > Statistics for 1282 bytecoded nmethods for C2: > total in heap = 5560352 (100%) > header = 389728 (7.009053%) > > vs > > Statistics for 1298 bytecoded nmethods for C2: > total in heap = 5766040 (100%) > header = 332288 (5.762846%) > > > Several unneeded fields in `nmethod` and `CodeBlob` were removed. Some fields were changed from `int` to `int16_t` with added corresponding asserts to make sure their values are fit into 16 bits. > > I did additional cleanup after recent `CompiledMethod` removal. > > Tested tier1-7,stress,xcomp and performance testing. Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: Union fields which usages do not overlap ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18768/files - new: https://git.openjdk.org/jdk/pull/18768/files/488f1b92..13744e78 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18768&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18768&range=01-02 Stats: 31 lines in 2 files changed: 15 ins; 13 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/18768.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18768/head:pull/18768 PR: https://git.openjdk.org/jdk/pull/18768 From jzhu at openjdk.org Mon Apr 15 03:32:50 2024 From: jzhu at openjdk.org (Joshua Zhu) Date: Mon, 15 Apr 2024 03:32:50 GMT Subject: RFR: 8326541: [AArch64] ZGC C2 load barrier stub should consider the length of live registers when spilling registers [v4] In-Reply-To: References: Message-ID: <6iifh18HJLCDWzb32dhVfpTcMjzXoXtSlmtY8ZoYzHc=.5b4491d7-3836-4312-bff1-49ef68ec15fa@github.com> On Wed, 20 Mar 2024 03:55:33 GMT, Joshua Zhu wrote: >> Currently ZGC C2 load barrier stub saves the whole live register regardless of what size of register is live on aarch64. 
>> Considering the size of SVE register is an implementation-defined multiple of 128 bits, up to 2048 bits, >> even the use of a floating point may cause the maximum 2048 bits stack occupied. >> Hence I would like to introduce this change on aarch64: take the length of live registers into consideration in ZGC C2 load barrier stub. >> >> In a floating point case on 2048 bits SVE machine, the following ZLoadBarrierStubC2 >> >> >> ...... >> 0x0000ffff684cfad8: stp x15, x18, [sp, #80] >> 0x0000ffff684cfadc: sub sp, sp, #0x100 >> 0x0000ffff684cfae0: str z16, [sp] >> 0x0000ffff684cfae4: add x1, x13, #0x10 >> 0x0000ffff684cfae8: mov x0, x16 >> ;; 0xFFFF803F5414 >> 0x0000ffff684cfaec: mov x8, #0x5414 // #21524 >> 0x0000ffff684cfaf0: movk x8, #0x803f, lsl #16 >> 0x0000ffff684cfaf4: movk x8, #0xffff, lsl #32 >> 0x0000ffff684cfaf8: blr x8 >> 0x0000ffff684cfafc: mov x16, x0 >> 0x0000ffff684cfb00: ldr z16, [sp] >> 0x0000ffff684cfb04: add sp, sp, #0x100 >> 0x0000ffff684cfb08: ptrue p7.b >> 0x0000ffff684cfb0c: ldp x4, x5, [sp, #16] >> ...... >> >> >> could be optimized into: >> >> >> ...... >> 0x0000ffff684cfa50: stp x15, x18, [sp, #80] >> 0x0000ffff684cfa54: str d16, [sp, #-16]! // extra 8 bytes to align 16 bytes in push_fp() >> 0x0000ffff684cfa58: add x1, x13, #0x10 >> 0x0000ffff684cfa5c: mov x0, x16 >> ;; 0xFFFF7FA942A8 >> 0x0000ffff684cfa60: mov x8, #0x42a8 // #17064 >> 0x0000ffff684cfa64: movk x8, #0x7fa9, lsl #16 >> 0x0000ffff684cfa68: movk x8, #0xffff, lsl #32 >> 0x0000ffff684cfa6c: blr x8 >> 0x0000ffff684cfa70: mov x16, x0 >> 0x0000ffff684cfa74: ldr d16, [sp], #16 >> 0x0000ffff684cfa78: ptrue p7.b >> 0x0000ffff684cfa7c: ldp x4, x5, [sp, #16] >> ...... >> >> >> Besides the above benefit, when we know what size of register is live, >> we could remove the unnecessary caller save in ZGC C2 load barrier stub when we meet C-ABI SOE fp registers. >> >> Passed jtreg with option "-XX:+UseZGC -XX:+ZGenerational" with no failures introduced. 
> > Joshua Zhu has updated the pull request incrementally with one additional commit since the last revision: > > Add more output for easy debugging once the jtreg test case fails Waiting for another review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17977#issuecomment-2054722588 From fyang at openjdk.org Mon Apr 15 04:11:41 2024 From: fyang at openjdk.org (Fei Yang) Date: Mon, 15 Apr 2024 04:11:41 GMT Subject: RFR: 8330094: RISC-V: Save and restore FCSR in the call stub In-Reply-To: References: Message-ID: On Fri, 12 Apr 2024 12:16:44 GMT, Hamlin Li wrote: > Hi, > Can you help to review this patch? > As discussed at https://github.com/openjdk/jdk/pull/17745#discussion_r1558783467, we should do the similar thing as [JDK-8319973](https://bugs.openjdk.org/browse/JDK-8319973) on aarch64. > Thanks! > > Tests running ... Looks good. Thanks. BTW: There is another path where FP control register could got clobbered is through JNI. See: https://bugs.openjdk.org/browse/JDK-8320892. I think we might want to fix it for riscv too. ------------- Marked as reviewed by fyang (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18758#pullrequestreview-1999982333 From rehn at openjdk.org Mon Apr 15 06:40:41 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Mon, 15 Apr 2024 06:40:41 GMT Subject: RFR: 8330161: RISC-V: Don't use C for Labels jumps In-Reply-To: References: Message-ID: On Mon, 15 Apr 2024 01:19:31 GMT, Fei Yang wrote: >> Hi please consider! >> >> jal do not have C switch, we always use the full length instructions. >> But jalr have, in case of an unbound Label which is to far for jal we can emit c_jalr. >> When we bind the Label we can't patch the c_jalr. >> >> Sanity tested. > > src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 926: > >> 924: assert_different_registers(Rd, temp); \ >> 925: /* We can't patch C, i.e. 
if Label wasn't bound we need to patch this jump.*/ \ >> 926: IncompressibleRegion ir(this); \ > > Seems that we rarely uses `jal` with unbound label. Here is what I see it. > For unbound label, we set `dest` to pc() at [1]. So the `distance` calculated at [2] will always be zero for this case. > After that we emit a simple `Assembler::jal(Rd, 0);` at [3] which is there for patching. So no `jalr` involved. > This means that possible future callers of `jal` with unbound label have to ensure that the target is not far. > > [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp#L947 > [2] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp#L913 > [3] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp#L915 Ah, now I understand. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18761#discussion_r1565244525 From rehn at openjdk.org Mon Apr 15 06:40:41 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Mon, 15 Apr 2024 06:40:41 GMT Subject: RFR: 8330161: RISC-V: Don't use C for Labels jumps In-Reply-To: References: Message-ID: On Mon, 15 Apr 2024 06:37:27 GMT, Robbin Ehn wrote: >> src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 926: >> >>> 924: assert_different_registers(Rd, temp); \ >>> 925: /* We can't patch C, i.e. if Label wasn't bound we need to patch this jump.*/ \ >>> 926: IncompressibleRegion ir(this); \ >> >> Seems that we rarely uses `jal` with unbound label. Here is what I see it. >> For unbound label, we set `dest` to pc() at [1]. So the `distance` calculated at [2] will always be zero for this case. >> After that we emit a simple `Assembler::jal(Rd, 0);` at [3] which is there for patching. So no `jalr` involved. >> This means that possible future callers of `jal` with unbound label have to ensure that the target is not far. 
>> >> [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp#L947 >> [2] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp#L913 >> [3] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp#L915 > > Ah, now I understand. Then I suggest we remove jalr from the method. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18761#discussion_r1565245272 From rehn at openjdk.org Mon Apr 15 06:48:44 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Mon, 15 Apr 2024 06:48:44 GMT Subject: RFR: 8330161: RISC-V: Don't use C for Labels jumps In-Reply-To: References: Message-ID: <-YUtECqZ2EGaVkUlk0YRychmJKqT5mxYXSjlnsBZVh8=.49b29fb3-b3a9-4ed0-b0da-195fda3c7469@github.com> On Mon, 15 Apr 2024 06:38:13 GMT, Robbin Ehn wrote: >> Ah, now I understand. > > Then I suggest we remove jalr from the method. Note, I noticed this when I enabled c_jal; if we ever want c_jal we need to either enable patching of C or turn off C when we need patching. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18761#discussion_r1565252949 From matthias.baesken at sap.com Mon Apr 15 06:58:09 2024 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Mon, 15 Apr 2024 06:58:09 +0000 Subject: RFO: a tool to analyze HotSpot fatal error logs In-Reply-To: References: Message-ID: Btw. Another question - does the tool/plugin only work with some special IDE (like IntelliJ) or is there a standalone mode? Best regards, Matthias From: Maxim Kartashev Sent: Friday, 12 April 2024 14:46 To: Baesken, Matthias Cc: discuss at openjdk.org; hotspot-dev at openjdk.org; Doerr, Martin ; Langer, Christoph ; Schmidt, Lutz Subject: Re: RFO: a tool to analyze HotSpot fatal error logs > A colleague just today (in another context) pointed out the idea to have an option to select all the hserr event log sections into a single > Log with chronological order.
That would probably also be something this tool could do (or is it already implemented). Not implemented as such, but certainly possible with some effort. If the tool is open-sourced such customization will be a lot easier on everybody. > You see them in the assertion failures or guarantees. But also in the native stacks (even with line numbers on some platforms). Right. Having that in stacks is a relatively recent development, so we simply haven't caught up yet. From fyang at openjdk.org Mon Apr 15 07:17:41 2024 From: fyang at openjdk.org (Fei Yang) Date: Mon, 15 Apr 2024 07:17:41 GMT Subject: RFR: 8330161: RISC-V: Don't use C for Labels jumps In-Reply-To: <-YUtECqZ2EGaVkUlk0YRychmJKqT5mxYXSjlnsBZVh8=.49b29fb3-b3a9-4ed0-b0da-195fda3c7469@github.com> References: <-YUtECqZ2EGaVkUlk0YRychmJKqT5mxYXSjlnsBZVh8=.49b29fb3-b3a9-4ed0-b0da-195fda3c7469@github.com> Message-ID: On Mon, 15 Apr 2024 06:46:19 GMT, Robbin Ehn wrote: >> Then I suggest we remove jalr from the method. > > Note, I noticed this when I enabled c_jal; if we ever want c_jal we need to either enable patching of C or turn off C when we need patching. > Then I suggest we remove jalr from the method. I think we still need the `jalr` which is emitted by the else block of the method. The reason is that we have calls to `void jal(const address dest, Register temp = t0);` where we need to check whether `dest` is far or not [1][2].
[1] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/riscv/templateInterpreterGenerator_riscv.cpp#L1815 [2] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp#L3532 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18761#discussion_r1565280510 From maxim.kartashev at jetbrains.com Mon Apr 15 07:19:38 2024 From: maxim.kartashev at jetbrains.com (Maxim Kartashev) Date: Mon, 15 Apr 2024 11:19:38 +0400 Subject: RFO: a tool to analyze HotSpot fatal error logs In-Reply-To: References: Message-ID: On Mon, Apr 15, 2024 at 10:58 AM Baesken, Matthias wrote: > Btw. Another question - does the tool/plugin only work with some special > IDE (like IntelliJ) or is there a standalone mode? > > > It is an IntelliJ plugin. From fyang at openjdk.org Mon Apr 15 07:28:45 2024 From: fyang at openjdk.org (Fei Yang) Date: Mon, 15 Apr 2024 07:28:45 GMT Subject: RFR: 8330161: RISC-V: Don't use C for Labels jumps In-Reply-To: References: <-YUtECqZ2EGaVkUlk0YRychmJKqT5mxYXSjlnsBZVh8=.49b29fb3-b3a9-4ed0-b0da-195fda3c7469@github.com> Message-ID: On Mon, 15 Apr 2024 07:13:11 GMT, Fei Yang wrote: >> Note, I noticed this when I enabled c_jal; if we ever want c_jal we need to either enable patching of C or turn off C when we need patching. > >> Then I suggest we remove jalr from the method. > > I think we still need the `jalr` which is emitted by the else block of the method. The reason is that we have calls to `void jal(const address dest, Register temp = t0);` where we need to check whether `dest` is far or not [1][2].
> > [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/riscv/templateInterpreterGenerator_riscv.cpp#L1815 > [2] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp#L3532 > Note, I notice this when I enable c_jal, if we ever want c_jal we need to either eanble patching of C or turn off C when we need patching. The current approach is to turn off C on sites where patching could happen. There are two `relocate` assembler functions for that purpose [1]. They will turn off C on those patching sites. We don't think it worth the complexity to enable patching of C. [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/riscv/assembler_riscv.hpp#L2083-L2093 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18761#discussion_r1565292244 From aph-open at littlepinkcloud.com Mon Apr 15 07:35:56 2024 From: aph-open at littlepinkcloud.com (Andrew Haley) Date: Mon, 15 Apr 2024 08:35:56 +0100 Subject: AArch64: Math.pow() optimization In-Reply-To: References: Message-ID: <52786be7-ce69-4455-bd8a-952a064a10b2@littlepinkcloud.com> On 4/15/24 02:51, Jin Guojie wrote: > When we tested the mathematical operation performance of Java on the > Aarch64 platform, we found a bottleneck that was significantly different > from X86. Please have a close look at the specification for Math.pow(), and the requirements for monotonicity. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From rehn at openjdk.org Mon Apr 15 07:39:44 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Mon, 15 Apr 2024 07:39:44 GMT Subject: RFR: 8330161: RISC-V: Don't use C for Labels jumps In-Reply-To: References: <-YUtECqZ2EGaVkUlk0YRychmJKqT5mxYXSjlnsBZVh8=.49b29fb3-b3a9-4ed0-b0da-195fda3c7469@github.com> Message-ID: On Mon, 15 Apr 2024 07:24:03 GMT, Fei Yang wrote: >>> Then I suggest we remove jalr from the method. 
>> >> I think we still need the `jalr` which is emitted by the else block of the method. The reason is that we have calls to `void jal(const address dest, Register temp = t0);` where we need to check whether `dest` is far or not [1][2]. >> >> [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/riscv/templateInterpreterGenerator_riscv.cpp#L1815 >> [2] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp#L3532 > >> Note, I notice this when I enable c_jal, if we ever want c_jal we need to either eanble patching of C or turn off C when we need patching. > > The current approach is to turn off C on sites where patching could happen. There are two `relocate` assembler functions for that purpose [1]. They will turn off C on those patching sites. We don't think it worth the complexity to enable patching of C. > > [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/riscv/assembler_riscv.hpp#L2083-L2093 Those two plain jal don't use Labels, so they are unaffected. In this case, instead of turn off C we have no C switch for jal. So we don't have one approach. Turn off C would be inline with what you are saying. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18761#discussion_r1565308263 From rehn at openjdk.org Mon Apr 15 07:47:05 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Mon, 15 Apr 2024 07:47:05 GMT Subject: RFR: 8330156: RISC-V: Range check auipc + signed 12 imm instruction [v2] In-Reply-To: References: Message-ID: <-K7ohGdj-8MeL6JpQOvxEOlq1hxefmEOYh2Yt3eDTkg=.9ebea1ea-0026-4067-9b10-e651b3d0ec48@github.com> > Hi please consider! > > Today we check if the distance is a signed 32. > As the second instruction have sign bit + 11 bits the, max of such pair is shorter. 
> > Sanity tested Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: Correct calc ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18755/files - new: https://git.openjdk.org/jdk/pull/18755/files/5308f81c..c65c2dbd Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18755&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18755&range=00-01 Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/18755.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18755/head:pull/18755 PR: https://git.openjdk.org/jdk/pull/18755 From rehn at openjdk.org Mon Apr 15 07:47:05 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Mon, 15 Apr 2024 07:47:05 GMT Subject: RFR: 8330156: RISC-V: Range check auipc + signed 12 imm instruction In-Reply-To: References: Message-ID: On Fri, 12 Apr 2024 10:41:39 GMT, Robbin Ehn wrote: > Hi please consider! > > Today we check if the distance is a signed 32. > As the second instruction have sign bit + 11 bits the, max of such pair is shorter. > > Sanity tested Updated, thanks! ------------- PR Comment: https://git.openjdk.org/jdk/pull/18755#issuecomment-2055937353 From fyang at openjdk.org Mon Apr 15 07:57:43 2024 From: fyang at openjdk.org (Fei Yang) Date: Mon, 15 Apr 2024 07:57:43 GMT Subject: RFR: 8330161: RISC-V: Don't use C for Labels jumps In-Reply-To: References: <-YUtECqZ2EGaVkUlk0YRychmJKqT5mxYXSjlnsBZVh8=.49b29fb3-b3a9-4ed0-b0da-195fda3c7469@github.com> Message-ID: On Mon, 15 Apr 2024 07:37:32 GMT, Robbin Ehn wrote: >>> Note, I notice this when I enable c_jal, if we ever want c_jal we need to either eanble patching of C or turn off C when we need patching. >> >> The current approach is to turn off C on sites where patching could happen. There are two `relocate` assembler functions for that purpose [1]. They will turn off C on those patching sites. We don't think it worth the complexity to enable patching of C. 
>> >> [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/riscv/assembler_riscv.hpp#L2083-L2093 > > Those two plain jal don't use Labels, so they are unaffected. > > In this case, instead of turn off C we have no C switch for jal. > So we don't have one approach. Turn off C would be inline with what you are saying. BTW: Is `c_jal` usable for us? The RISC-V spec say: " C.JAL is an RV32C-only instruction " ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18761#discussion_r1565330078 From shade at openjdk.org Mon Apr 15 07:59:07 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 15 Apr 2024 07:59:07 GMT Subject: RFR: 8328934: Assert that ABS input and output are legal [v5] In-Reply-To: References: Message-ID: <3KD8xa04p5mqN6w_xDjSUFGGrl5ybGIt0w9xOjQo5oo=.cf77f2a5-023d-428b-9846-04ac54cd03df@github.com> > This should protect us from future accidents around `abs` misuse. We have fixed a few separately. I plan to use this as the litmus test in update releases to detect missing backports for actual fixes. I am running more tests to see if we have any other sightings in current codebase, but this can be reviewed for sanity meanwhile. 
> > Additional testing: > - [x] MacOS AArch64 server fastdebug build passes > - [ ] Linux x86_64 server fastdebug, `all` > - [ ] Linux x86_64 server fastdebug, 100K Fuzzer tests > - [ ] Linux x86_64 server fastdebug, Maven CTW > - [ ] Linux AArch64 server fastdebug, `all` Aleksey Shipilev has updated the pull request incrementally with two additional commits since the last revision: - Also tests - Drop the other check; dodge UB ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18751/files - new: https://git.openjdk.org/jdk/pull/18751/files/f3d75b39..3f6d76f3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18751&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18751&range=03-04 Stats: 124 lines in 2 files changed: 116 ins; 5 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/18751.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18751/head:pull/18751 PR: https://git.openjdk.org/jdk/pull/18751 From shade at openjdk.org Mon Apr 15 07:59:07 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 15 Apr 2024 07:59:07 GMT Subject: RFR: 8328934: Assert that ABS input and output are legal [v3] In-Reply-To: References: <17gipHM6B5g7uDlXUwE1lpgXSPKbkOeZAPd60uiEzgY=.c29f487e-1871-493e-9555-faea3c995068@github.com> Message-ID: On Fri, 12 Apr 2024 22:08:42 GMT, Andrew Haley wrote: > I think so. Several of us have worked on eliminating undefined behaviour in HotSpot and we've made good progress. I think it would be sad for new UB to be pushed now, especially in a case like this when it wouldn't be accidental. UB is just something we have to deal with, because C++. :-( All right then, see new commit. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18751#issuecomment-2055998776 From shade at openjdk.org Mon Apr 15 07:59:08 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 15 Apr 2024 07:59:08 GMT Subject: RFR: 8328934: Assert that ABS input and output are legal [v4] In-Reply-To: References: Message-ID: On Fri, 12 Apr 2024 10:05:55 GMT, Aleksey Shipilev wrote: >> This should protect us from future accidents around `abs` misuse. We have fixed a few separately. I plan to use this as the litmus test in update releases to detect missing backports for actual fixes. I am running more tests to see if we have any other sightings in current codebase, but this can be reviewed for sanity meanwhile. >> >> Additional testing: >> - [x] MacOS AArch64 server fastdebug build passes >> - [ ] Linux x86_64 server fastdebug, `all` >> - [ ] Linux x86_64 server fastdebug, 100K Fuzzer tests >> - [ ] Linux x86_64 server fastdebug, Maven CTW >> - [ ] Linux AArch64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request incrementally with two additional commits since the last revision: > > - More straightforward > - Richer error reporting Testing update: I ran Maven CTW tests, Fuzzer tests, the entirety of JDK codebase, and I believe the only thing we are missing is [JDK-8330158](https://bugs.openjdk.org/browse/JDK-8330158). After that fix lands, we can integrate this patch, which would hopefully seal against introducing new bugs. I also added a direct gtest for ABS. 
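
For readers following the UB discussion: negating the minimum value of a signed integer type overflows, so the check has to happen on the input, before the negation. Below is a minimal stand-alone sketch of that idea. The names `abs_is_representable` and `checked_abs` are hypothetical and chosen for illustration only; this is not the code from the patch, which uses HotSpot's own assert machinery rather than exceptions.

```cpp
#include <limits>
#include <stdexcept>
#include <type_traits>

// Hypothetical helper: true when |x| fits in T. For signed integers the only
// problematic input is the minimum value; floating-point inputs are always fine.
template <typename T>
constexpr bool abs_is_representable(T x) {
  static_assert(std::is_signed<T>::value, "abs is only meaningful for signed types");
  return !std::is_integral<T>::value || x != std::numeric_limits<T>::min();
}

// Hypothetical checked abs: validates the input *before* negating, so the
// negation itself can never overflow (and therefore never hits UB).
template <typename T>
T checked_abs(T x) {
  if (!abs_is_representable(x)) {
    throw std::domain_error("abs: minimum value has no representable magnitude");
  }
  return (x > 0) ? x : -x;
}
```

Checking the result afterwards (`res < 0`) would come too late, since the overflow has already happened by then; that seems to be the reason the thread settled on validating the input only.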
------------- PR Comment: https://git.openjdk.org/jdk/pull/18751#issuecomment-2056006999 From shade at openjdk.org Mon Apr 15 07:59:08 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 15 Apr 2024 07:59:08 GMT Subject: RFR: 8328934: Assert that ABS input and output are legal [v4] In-Reply-To: References: Message-ID: On Fri, 12 Apr 2024 19:43:20 GMT, Dean Long wrote: >> Aleksey Shipilev has updated the pull request incrementally with two additional commits since the last revision: >> >> - More straightforward >> - Richer error reporting > > src/hotspot/share/utilities/globalDefinitions.hpp line 1119: > >> 1117: T res = (x > 0) ? x : -x; >> 1118: #ifdef ASSERT >> 1119: if (res < 0) { > > I don't see how we could ever hit this. Checking input seems sufficient. I thought we would like to check for other (unknown) cases for ABS failures, since we would otherwise presume only the `ABS(min_int)` is problematic. But I can remove it, and we would add more cases later if we find them. See new commit. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18751#discussion_r1565329121 From aph-open at littlepinkcloud.com Mon Apr 15 08:08:56 2024 From: aph-open at littlepinkcloud.com (Andrew Haley) Date: Mon, 15 Apr 2024 09:08:56 +0100 Subject: =?UTF-8?B?UmU6IOWbnuWkje+8mk1hdGg6IG9wdGltYXRpb24gZm9yIGRvaW5nIHJl?= =?UTF-8?Q?mainder_on_AArch64?= In-Reply-To: <71976e47-0831-41f3-a804-a876c7b85fd9.jinguojie.jgj@alibaba-inc.com> References: <258412a7-ab7d-4c90-80ad-715b23b32c30@littlepinkcloud.com> <71976e47-0831-41f3-a804-a876c7b85fd9.jinguojie.jgj@alibaba-inc.com> Message-ID: <22042763-dd67-41a5-a0b5-927afc50bf64@littlepinkcloud.com> On 4/15/24 03:12, Jin Guojie wrote: > The reason for this optimization is that we found that there are a large > number of remainder operations in the jdk class library, and the > frequency of these remainder operations is very high, which has a great > impact on performance. 
These remainder operations are frequently used in > fundamental libraries such as java.lang.String and java.lang.HashMap. [ Note: always use AArch64 in the title for AArch64-specific discussion.] I couldn't find any uses of remainder in either of those classes, but I didn't look for very long. I think that it probably does make sense to handle MADD specially for Neoverse, and it would also probably be good to get Arm to fix it. I think what we probably need is a suitable definition of MADD in MacroAssembler. Any heavy use of integer modulo should probably be referred back to the application developers. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From dnsimon at openjdk.org Mon Apr 15 08:10:45 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Mon, 15 Apr 2024 08:10:45 GMT Subject: RFR: 8330105: SharedRuntime::resolve* should respect interpreter-only mode In-Reply-To: References: Message-ID: <_V38K8_1dehCvKMkxGyWe5Izu8y_COKJV94kubHdQUw=.7b6e29a8-e479-41ae-9df1-ac38def2b599@github.com> On Thu, 11 Apr 2024 13:50:25 GMT, Yudi Zheng wrote: > JavaThread::set_interp_only_mode may be called while a thread is blocked waiting for a JIT compilation to complete. When interpreter-only mode is set, we should dispatch to interpreter instead of the returned compiled code. Marked as reviewed by dnsimon (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18741#pullrequestreview-2000287053 From yzheng at openjdk.org Mon Apr 15 08:10:45 2024 From: yzheng at openjdk.org (Yudi Zheng) Date: Mon, 15 Apr 2024 08:10:45 GMT Subject: Integrated: 8330105: SharedRuntime::resolve* should respect interpreter-only mode In-Reply-To: References: Message-ID: On Thu, 11 Apr 2024 13:50:25 GMT, Yudi Zheng wrote: > JavaThread::set_interp_only_mode may be called while a thread is blocked waiting for a JIT compilation to complete. 
When interpreter-only mode is set, we should dispatch to interpreter instead of the returned compiled code. This pull request has now been integrated. Changeset: 5404b4ea Author: Yudi Zheng Committer: Doug Simon URL: https://git.openjdk.org/jdk/commit/5404b4eafc2eb3291cecf99f98728946388f5d16 Stats: 47 lines in 2 files changed: 12 ins; 30 del; 5 mod 8330105: SharedRuntime::resolve* should respect interpreter-only mode Reviewed-by: never, dlong, dnsimon ------------- PR: https://git.openjdk.org/jdk/pull/18741 From fyang at openjdk.org Mon Apr 15 08:26:43 2024 From: fyang at openjdk.org (Fei Yang) Date: Mon, 15 Apr 2024 08:26:43 GMT Subject: RFR: 8330156: RISC-V: Range check auipc + signed 12 imm instruction [v2] In-Reply-To: <-K7ohGdj-8MeL6JpQOvxEOlq1hxefmEOYh2Yt3eDTkg=.9ebea1ea-0026-4067-9b10-e651b3d0ec48@github.com> References: <-K7ohGdj-8MeL6JpQOvxEOlq1hxefmEOYh2Yt3eDTkg=.9ebea1ea-0026-4067-9b10-e651b3d0ec48@github.com> Message-ID: On Mon, 15 Apr 2024 07:47:05 GMT, Robbin Ehn wrote: >> Hi please consider! >> >> Today we check if the distance is a signed 32. >> As the second instruction have sign bit + 11 bits the, max of such pair is shorter. >> >> Sanity tested > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > Correct calc src/hotspot/cpu/riscv/macroAssembler_riscv.hpp line 684: > 682: constexpr int64_t twoG = (2 * G); > 683: constexpr int64_t twoK = (2 * K); > 684: return x <= (twoG - twoK) && x >= (-twoG - twoK); Hmm.. I think the range should be `x < (twoG - twoK) && x >= (-twoG - twoK)` which maps to `[-(2G + 2K), 2G - 2K)`. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18755#discussion_r1565363679 From burban at openjdk.org Mon Apr 15 08:35:42 2024 From: burban at openjdk.org (Bernhard Urban-Forster) Date: Mon, 15 Apr 2024 08:35:42 GMT Subject: RFR: 8330171: Lazy W^X switch implementation In-Reply-To: <9eymaXovxUNFdkAkzojFQP5trwl_yyY0jE2GzcMEjR4=.02ee2ef9-c476-4c7c-9e4a-e021425c38bc@github.com> References: <9eymaXovxUNFdkAkzojFQP5trwl_yyY0jE2GzcMEjR4=.02ee2ef9-c476-4c7c-9e4a-e021425c38bc@github.com> Message-ID: On Fri, 12 Apr 2024 14:40:05 GMT, Sergey Nazarkin wrote: > An alternative for preemptively switching the W^X thread mode on macOS with an AArch64 CPU. This implementation triggers the switch in response to the SIGBUS signal if the *si_addr* belongs to the CodeCache area. With this approach, it is now feasible to eliminate all WX guards and avoid potentially costly operations. However, no significant improvement or degradation in performance has been observed. Additionally, considering the issue with AsyncGetCallTrace, the patched JVM has been successfully operated with [asgct_bottom](https://github.com/parttimenerd/asgct_bottom) and [async-profiler](https://github.com/async-profiler/async-profiler). > > Additional testing: > - [x] MacOS AArch64 server fastdebug *gtets* > - [ ] MacOS AArch64 server fastdebug *jtreg:hotspot:tier4* > - [ ] Benchmarking > > @apangin and @parttimenerd could you please check the patch on your scenarios?? I agree that this PR effectively removes the whole idea behind JIT_MAP and thus is a bad idea security wise. Still it has some value. @snazarkin do you have numbers on how many transitions are done with your PR vs. the current state when running the same program? That would give us a lower bound on the amount of transitions needed and define a goal for future improvements in that area. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18762#issuecomment-2056182560 From aph at openjdk.org Mon Apr 15 08:47:44 2024 From: aph at openjdk.org (Andrew Haley) Date: Mon, 15 Apr 2024 08:47:44 GMT Subject: RFR: 8328934: Assert that ABS input and output are legal [v5] In-Reply-To: <3KD8xa04p5mqN6w_xDjSUFGGrl5ybGIt0w9xOjQo5oo=.cf77f2a5-023d-428b-9846-04ac54cd03df@github.com> References: <3KD8xa04p5mqN6w_xDjSUFGGrl5ybGIt0w9xOjQo5oo=.cf77f2a5-023d-428b-9846-04ac54cd03df@github.com> Message-ID: <3gLQfbL3QE4bIYlDcqMmkOq9pQ5-SVhwvbu7qilyQZ8=.bacb88fd-30cf-4653-8f72-2f48e8083ac5@github.com> On Mon, 15 Apr 2024 07:59:07 GMT, Aleksey Shipilev wrote: >> This should protect us from future accidents around `abs` misuse. We have fixed a few separately. I plan to use this as the litmus test in update releases to detect missing backports for actual fixes. I am running more tests to see if we have any other sightings in current codebase, but this can be reviewed for sanity meanwhile. >> >> Additional testing: >> - [x] MacOS AArch64 server fastdebug build passes >> - [ ] Linux x86_64 server fastdebug, `all` >> - [ ] Linux x86_64 server fastdebug, 100K Fuzzer tests >> - [ ] Linux x86_64 server fastdebug, Maven CTW >> - [ ] Linux AArch64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request incrementally with two additional commits since the last revision: > > - Also tests > - Drop the other check; dodge UB Thank you. ------------- Marked as reviewed by aph (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18751#pullrequestreview-2000369676 From jkern at openjdk.org Mon Apr 15 08:49:53 2024 From: jkern at openjdk.org (Joachim Kern) Date: Mon, 15 Apr 2024 08:49:53 GMT Subject: Integrated: 8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc In-Reply-To: References: Message-ID: On Thu, 28 Mar 2024 16:50:20 GMT, Joachim Kern wrote: > As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. Thus the old xlc toolchain was removed by [JDK-8327701](https://bugs.openjdk.org/browse/JDK-8327701). > Now we also switch the HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc, removing the last xlc rudiment. > This means merging the AIX specific content of utilities/globalDefinitions_xlc.hpp and utilities/compilerWarnings_xlc.hpp into the corresponding gcc files on the on side and removing the defined(TARGET_COMPILER_xlc) blocks in the code, because the defined(TARGET_COMPILER_gcc) blocks work out of the box for the new AIX compiler. > The rest of the changes are needed because of using utilities/compilerWarnings_gcc.hpp the compiler is much more nagging about ill formatted printf This pull request has now been integrated. 
Changeset: 3f1d9c44 Author: Joachim Kern Committer: Martin Doerr URL: https://git.openjdk.org/jdk/commit/3f1d9c441ea98910d9483e133bccfac784db393d Stats: 256 lines in 15 files changed: 8 ins; 212 del; 36 mod 8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc Reviewed-by: jwaters, stuefe, kbarrett, mdoerr ------------- PR: https://git.openjdk.org/jdk/pull/18536 From luhenry at openjdk.org Mon Apr 15 08:51:44 2024 From: luhenry at openjdk.org (Ludovic Henry) Date: Mon, 15 Apr 2024 08:51:44 GMT Subject: RFR: 8330094: RISC-V: Save and restore FCSR in the call stub In-Reply-To: References: Message-ID: On Fri, 12 Apr 2024 12:16:44 GMT, Hamlin Li wrote: > Hi, > Can you help to review this patch? > As discussed at https://github.com/openjdk/jdk/pull/17745#discussion_r1558783467, we should do the similar thing as [JDK-8319973](https://bugs.openjdk.org/browse/JDK-8319973) on aarch64. > Thanks! > > Tests running ... Changes requested by luhenry (Committer). src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 429: > 427: // restore fcsr > 428: __ ld(t1, fcsr_save); > 429: __ csrw(CSR_FCSR, t1); It would be better to avoid the `csrw` in case the CSR is already set to `RoundingMode::rne` on some hardware. You can have avoid it with a simple check on the current value of FCSR. ------------- PR Review: https://git.openjdk.org/jdk/pull/18758#pullrequestreview-2000378872 PR Review Comment: https://git.openjdk.org/jdk/pull/18758#discussion_r1565402389 From aboldtch at openjdk.org Mon Apr 15 09:37:52 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Mon, 15 Apr 2024 09:37:52 GMT Subject: RFR: 8330253: Skip verify_consistent_lock_order when deoptimizing from monitorenter bytecode. Message-ID: The verification added in [JDK-8329757](https://bugs.openjdk.org/browse/JDK-8329757) will not work then deoptimization occurs on a monitorenter bytecode. The locking may be in a transitional state. This patch will skip the verification when this occurs. 
Currently have only seen this reproduce with JVMTI when deoptimization occurs while a java thread is waiting on a contended monitor. However this could potentially be triggered from a VM entry slow path, so simply checking `current_pending_monitor` could be flaky as well. So instead simply avoid verification. Running JVMTI reproducer. Starting full testing soon. ------------- Commit messages: - Skip verify when deoptimizing from monitorenter bytecode. Changes: https://git.openjdk.org/jdk/pull/18782/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18782&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8330253 Stats: 9 lines in 1 file changed: 7 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/18782.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18782/head:pull/18782 PR: https://git.openjdk.org/jdk/pull/18782 From rehn at openjdk.org Mon Apr 15 10:05:41 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Mon, 15 Apr 2024 10:05:41 GMT Subject: RFR: 8330156: RISC-V: Range check auipc + signed 12 imm instruction [v2] In-Reply-To: References: <-K7ohGdj-8MeL6JpQOvxEOlq1hxefmEOYh2Yt3eDTkg=.9ebea1ea-0026-4067-9b10-e651b3d0ec48@github.com> Message-ID: On Mon, 15 Apr 2024 08:21:11 GMT, Fei Yang wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Correct calc > > src/hotspot/cpu/riscv/macroAssembler_riscv.hpp line 684: > >> 682: constexpr int64_t twoG = (2 * G); >> 683: constexpr int64_t twoK = (2 * K); >> 684: return x <= (twoG - twoK) && x >= (-twoG - twoK); > > Hmm.. I think the range should be `x < (twoG - twoK) && x >= (-twoG - twoK)` which maps to `[-(2G + 2K), 2G - 2K)`. Sorry, yea, the max range is: 2147481599/0x7FFFF7FF and (twoG - twoK) is 2147481600. -2147485696/0xFFFFFFFF7FFFF800 (0xFFFFFFFF80000000 - 0xFFF) and (-twoG - twoK) is -2147485696. 
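
The arithmetic behind these bounds can be sketched as follows. `auipc` contributes a sign-extended 20-bit immediate shifted left by 12 bits, and the second instruction adds a signed 12-bit immediate, so the reachable offsets form the half-open interval [-2G - 2K, 2G - 2K). The function name `is_valid_auipc_offset` and the constants below are assumptions made for this illustration; they are not copied from macroAssembler_riscv.hpp.

```cpp
#include <cstdint>

// Hypothetical stand-alone version of the range check discussed above.
constexpr int64_t K = 1024;
constexpr int64_t G = K * K * K;

constexpr bool is_valid_auipc_offset(int64_t x) {
  // High part: (simm20 << 12) spans [-2G, 2G - 4K].
  // Low part:  simm12 spans [-2K, 2K - 1].
  // Their sum therefore spans [-2G - 2K, 2G - 2K - 1].
  return x >= (-2 * G - 2 * K) && x < (2 * G - 2 * K);
}

// Boundary values quoted in the thread.
static_assert(is_valid_auipc_offset(2147481599), "largest reachable offset");
static_assert(!is_valid_auipc_offset(2147481600), "one past the positive end");
static_assert(is_valid_auipc_offset(-2147485696), "smallest reachable offset");
static_assert(!is_valid_auipc_offset(-2147485697), "one below the negative end");
```

The asymmetry comes from the low immediate: its most negative value (-2K) has a larger magnitude than its most positive one (2K - 1), which shifts the whole window down by 2K relative to a plain signed 32-bit range.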
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18755#discussion_r1565515932 From rehn at openjdk.org Mon Apr 15 10:08:59 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Mon, 15 Apr 2024 10:08:59 GMT Subject: RFR: 8330156: RISC-V: Range check auipc + signed 12 imm instruction [v3] In-Reply-To: References: Message-ID: > Hi please consider! > > Today we check if the distance is a signed 32. > As the second instruction have sign bit + 11 bits the, max of such pair is shorter. > > Sanity tested Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: Non inclusive positive side ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18755/files - new: https://git.openjdk.org/jdk/pull/18755/files/c65c2dbd..80d9088d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18755&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18755&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18755.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18755/head:pull/18755 PR: https://git.openjdk.org/jdk/pull/18755 From rehn at openjdk.org Mon Apr 15 10:13:40 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Mon, 15 Apr 2024 10:13:40 GMT Subject: RFR: 8330161: RISC-V: Don't use C for Labels jumps In-Reply-To: References: <-YUtECqZ2EGaVkUlk0YRychmJKqT5mxYXSjlnsBZVh8=.49b29fb3-b3a9-4ed0-b0da-195fda3c7469@github.com> Message-ID: On Mon, 15 Apr 2024 07:54:46 GMT, Fei Yang wrote: >> Those two plain jal don't use Labels, so they are unaffected. >> >> In this case, instead of turn off C we have no C switch for jal. >> So we don't have one approach. Turn off C would be inline with what you are saying. > > BTW: Is `c_jal` usable for us? The RISC-V spec say: " C.JAL is an RV32C-only instruction " Sorry I was talking about C.J. As that is the one which maps to JAL uncond jumps. (x0) It's not used at all as it is now. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18761#discussion_r1565529275 From fyang at openjdk.org Mon Apr 15 10:21:41 2024 From: fyang at openjdk.org (Fei Yang) Date: Mon, 15 Apr 2024 10:21:41 GMT Subject: RFR: 8330156: RISC-V: Range check auipc + signed 12 imm instruction [v3] In-Reply-To: References: Message-ID: On Mon, 15 Apr 2024 10:08:59 GMT, Robbin Ehn wrote: >> Hi please consider! >> >> Today we check if the distance is a signed 32. >> As the second instruction have sign bit + 11 bits the, max of such pair is shorter. >> >> Sanity tested > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > Non inclusive positive side Update change looks good. Thank you! ------------- Marked as reviewed by fyang (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18755#pullrequestreview-2000639016 From fyang at openjdk.org Mon Apr 15 10:45:42 2024 From: fyang at openjdk.org (Fei Yang) Date: Mon, 15 Apr 2024 10:45:42 GMT Subject: RFR: 8330094: RISC-V: Save and restore FCSR in the call stub In-Reply-To: References: Message-ID: On Mon, 15 Apr 2024 08:49:25 GMT, Ludovic Henry wrote: >> Hi, >> Can you help to review this patch? >> As discussed at https://github.com/openjdk/jdk/pull/17745#discussion_r1558783467, we should do the similar thing as [JDK-8319973](https://bugs.openjdk.org/browse/JDK-8319973) on aarch64. >> Thanks! >> >> Tests running ... > > src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 429: > >> 427: // restore fcsr >> 428: __ ld(t1, fcsr_save); >> 429: __ csrw(CSR_FCSR, t1); > > It would be better to avoid the `csrw` in case the CSR is already set to `RoundingMode::rne` on some hardware. You can have avoid it with a simple check on the current value of FCSR. 
> > Something along the line of: > > __ ld(t1, fcsr_save); > __ csrr(t0, CSR_FCSR); > __ beq(t1, t0, skip_csrw); > __ csrw(CSR_FCSR, t1); > __ bind(skip_csrw); > > > Some doc about it: https://riscv-optimization-guide.riseproject.dev/#_controlling_rounding_behavior_scalar Interesting. So is it safe to claim that `csrr` is faster than `csrw`? I didn't measure the difference. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18758#discussion_r1565575491 From luhenry at openjdk.org Mon Apr 15 10:52:40 2024 From: luhenry at openjdk.org (Ludovic Henry) Date: Mon, 15 Apr 2024 10:52:40 GMT Subject: RFR: 8330094: RISC-V: Save and restore FCSR in the call stub In-Reply-To: References: Message-ID: On Mon, 15 Apr 2024 10:43:34 GMT, Fei Yang wrote: >> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 429: >> >>> 427: // restore fcsr >>> 428: __ ld(t1, fcsr_save); >>> 429: __ csrw(CSR_FCSR, t1); >> >> It would be better to avoid the `csrw` in case the CSR is already set to `RoundingMode::rne` on some hardware. You can have avoid it with a simple check on the current value of FCSR. >> >> Something along the line of: >> >> __ ld(t1, fcsr_save); >> __ csrr(t0, CSR_FCSR); >> __ beq(t1, t0, skip_csrw); >> __ csrw(CSR_FCSR, t1); >> __ bind(skip_csrw); >> >> >> Some doc about it: https://riscv-optimization-guide.riseproject.dev/#_controlling_rounding_behavior_scalar > > Interesting. So is it safe to claim that `csrr` is faster than `csrw`? I didn't measure the difference. It's fair to claim that `csrw` has side effects which may be more detrimental to performance, side effects that `csrr` that doesn't have (and can't as it's only reading). > I didn't measure the difference. I don't know how that affects current hardware, but remember that current hardware are in-order CPUs which may not be impacted by many of these performance problems. 
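
The "read first, write only if the value differs" pattern suggested above can be illustrated in portable C++ with `<cfenv>`. This is a loose analogy and not the RISC-V stub code: it only touches the rounding mode, and `restore_rounding_mode` is a hypothetical name. Whether skipping the write pays off depends on the hardware, which the thread notes had not been measured.

```cpp
#include <cfenv>

// Hypothetical helper: restores a previously saved rounding mode, but only
// performs the (potentially expensive) write when the current mode differs.
// Returns true if a write was actually issued.
inline bool restore_rounding_mode(int saved_mode) {
  if (std::fegetround() == saved_mode) {
    return false;  // already in the desired state, skip the write
  }
  std::fesetround(saved_mode);
  return true;
}
```

The extra read and branch are the price of avoiding the write, so the pattern is only worthwhile when the common case is that the state is unchanged.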
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/18758#discussion_r1565583331

From jsjolen at openjdk.org Mon Apr 15 11:27:12 2024
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Mon, 15 Apr 2024 11:27:12 GMT
Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v31]
In-Reply-To:
References:
Message-ID: <3TXgAeyGvA6EQAibDhgBPWtksJ-0f_BFIIFLLIg2Yd0=.39603f50-dccb-4a75-a21c-419ce0bc7f91@github.com>

> Hi,
>
> This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix.
>
> ## `MemoryFileTracker`
>
> The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is:
>
> ```c++
> static MemoryFile* make_device(const char* descriptive_name);
> static void free_device(MemoryFile* device);
>
> static void allocate_memory(MemoryFile* device, size_t offset, size_t size,
>                             MEMFLAGS flag, const NativeCallStack& stack);
> static void free_memory(MemoryFile* device, size_t offset, size_t size);
> ```
>
> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does:
>
> ```c++
> void ZNMT::reserve(zaddress_unsafe start, size_t size) {
>   MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap);
> }
> void ZNMT::commit(zoffset offset, size_t size) {
>   MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC);
> }
> void ZNMT::uncommit(zoffset offset, size_t size) {
>   MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size);
> }
>
> void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) {
>   // NMT doesn't track mappings at the moment.
> }
> void ZNMT::unmap(zaddress_unsafe addr, size_t size) {
>   // NMT doesn't track mappings at the moment.
> }
> ```
>
> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory.
>
> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are:
>
> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this...

Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision:

 - Change tests and remove merging for committed memory in aniticipation of Afshin's commit
 - Change name of accessor from mdata to metadata

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/18289/files
  - new: https://git.openjdk.org/jdk/pull/18289/files/3793046e..b97d3282

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=30
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=29-30

  Stats: 46 lines in 3 files changed: 4 ins; 25 del; 17 mod
  Patch: https://git.openjdk.org/jdk/pull/18289.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289

PR: https://git.openjdk.org/jdk/pull/18289

From ihse at openjdk.org Mon Apr 15 11:48:50 2024
From: ihse at openjdk.org (Magnus Ihse Bursie)
Date: Mon, 15 Apr 2024 11:48:50 GMT
Subject: RFR: 8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3]
In-Reply-To:
References:
<-XeYeJ0OEmauTYsEoSXxzRmQXSKMOLw87GSpqDnEmug=.5cb7e71f-fea6-4a84-8260-5f515d3d3810@github.com> Message-ID: On Wed, 10 Apr 2024 22:14:33 GMT, Kim Barrett wrote: >> That build failure in shared code does not happen with Xcode clang, gcc, or >> Visual Studio, even though none of them appear to have a relevant define or >> include. So the clang variant being used for AIX is different from the Xcode >> clang variant (and maybe others) in its treatment of alloca. Weird! >> >> I can also live with either the macro or the includes where needed. I dislike >> conditionally adding the include in globalDefinitions_gcc.hpp. > > Should also remove the `#pragma alloca` in os_aix.cpp. It was too bad that I did not see and review this change in the makefiles. :-( While you guys could have gone either way, I strongly dislike the choice to include a redefinition in the makefiles. If this really should be done, we should introduce a new variable to carry such changes, instead of piggybacking it with the OS defines. :-( But, I don't think it should be done at all. There are several reasons why this is an inferior solution: 1) It does not follow prior examples. We have tried hard before not to do things like this, but rather pass flags as defines (e.g. `-DREDEFINE_ALLOCA` would have been better) 2) It does not scale. If we start in effect allowing code in the command line, there is no clear limit anymore what should be placed in the source code files and what should be placed on the command line. 3) It messes up command lines. Keeping command lines as short as reasonably possible is a goal we try to strive for. In this case, there is also the `'` inside them (which I don't understand why), which is just begging for quoting/escaping problems, making command lines hard to copy/paste, send to different systems (like logging) etc. I'd really like to see a follow-up PR that moves this away from the command line define and into a source code file instead.
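To illustrate the alternative suggested in the message above (a plain flag on the command line, with the redefinition itself kept in a source file), here is a minimal sketch. It assumes the redefinition maps `alloca` to the compiler builtin `__builtin_alloca`, which GCC and clang provide; the thread does not show the exact text of the makefile define. `REDEFINE_ALLOCA` is only the example name given in the message, and `copy_len_via_stack` is a hypothetical helper, not HotSpot code.

```c++
#include <cassert>
#include <cstddef>
#include <cstring>

// The build passes only -DREDEFINE_ALLOCA; the actual macro text lives here,
// so the command line carries no quotes or parentheses.
#ifdef REDEFINE_ALLOCA
#define alloca(size) __builtin_alloca(size)
#else
#include <alloca.h>  // assumption: a platform that ships <alloca.h>
#endif

// Hypothetical helper: copies a string into stack memory obtained from
// alloca and returns the length of the copy.
static size_t copy_len_via_stack(const char* s) {
  assert(s != nullptr);
  size_t n = strlen(s) + 1;
  char* buf = static_cast<char*>(alloca(n));  // released when the function returns
  memcpy(buf, s, n);
  return strlen(buf);
}
```

Either branch of the `#ifdef` gives callers the same `alloca(n)` spelling, so the choice stays visible in one source file rather than in the compiler invocation.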
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1565646578 From fyang at openjdk.org Mon Apr 15 11:56:41 2024 From: fyang at openjdk.org (Fei Yang) Date: Mon, 15 Apr 2024 11:56:41 GMT Subject: RFR: 8330094: RISC-V: Save and restore FCSR in the call stub In-Reply-To: References: Message-ID: On Mon, 15 Apr 2024 10:49:14 GMT, Ludovic Henry wrote: >> Interesting. So is it safe to claim that `csrr` is faster than `csrw`? I didn't measure the difference. > > It's fair to claim that `csrw` has side effects which may be more detrimental to performance, side effects that `csrr` that doesn't have (and can't as it's only reading). > >> I didn't measure the difference. > > I don't know how that affects current hardware, but remember that current hardware are in-order CPUs which may not be impacted by many of these performance problems. Another thing that is worth considering is that the `fflags` (exception flags) contained in `fcsr` is very likely to change after the Java call which does floating point calculations. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18758#discussion_r1565655591 From aph at openjdk.org Mon Apr 15 12:33:05 2024 From: aph at openjdk.org (Andrew Haley) Date: Mon, 15 Apr 2024 12:33:05 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v15] In-Reply-To: References: Message-ID: > This PR is a redesign of subtype checking. > > The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. > > So what's changed, so that the old design should be replaced? > > Firstly, the computers of today aren't the computers of twenty years ago. It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. 
Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. > > The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces). Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. > > Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch. > > However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions. > > The solution > ------------ > > We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. > > We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bimap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. If the bit is set, we have to do a little arithmetic and then consult the hash table. > > It works like this: > > > mov sub_klass, [& sub_klass-... 
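The lookup described above (occupancy bitmap, compressed table, linear probing) can be sketched as follows. This is a simplified illustration under stated assumptions, not the HotSpot implementation: classes are stood in for by integer ids, `slot_of` is a hypothetical hash, and the packed index is recomputed with a software population count on every probe.

```c++
#include <cassert>
#include <cstdint>
#include <vector>

// Portable population count, for CPUs without a POPCNT instruction.
static int soft_popcount(uint64_t x) {
  x = x - ((x >> 1) & 0x5555555555555555ULL);
  x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
  x = (x + (x >> 4)) & 0x0f0f0f0f0f0f0f0fULL;
  return static_cast<int>((x * 0x0101010101010101ULL) >> 56);
}

// Hypothetical hash: maps an id to one of 64 home slots.
static unsigned slot_of(uint32_t id) {
  return static_cast<unsigned>((id * 0x9E3779B97F4A7C15ULL) >> 58);
}

struct PackedTable {
  uint64_t bitmap = 0;           // bit s set <=> slot s of the 64-way table is occupied
  std::vector<uint32_t> packed;  // occupied slots only, in slot order
};

// Builds the 64-way table with linear probing, then drops the empty slots.
static PackedTable build(const std::vector<uint32_t>& ids) {
  assert(ids.size() <= 64);
  uint32_t slots[64] = {0};
  PackedTable t;
  for (uint32_t id : ids) {
    unsigned s = slot_of(id);
    while (t.bitmap & (1ULL << s)) s = (s + 1) & 63;  // probe for a free slot
    slots[s] = id;
    t.bitmap |= 1ULL << s;
  }
  for (unsigned s = 0; s < 64; s++) {
    if (t.bitmap & (1ULL << s)) t.packed.push_back(slots[s]);
  }
  return t;
}

static bool contains(const PackedTable& t, uint32_t id) {
  unsigned s = slot_of(id);
  for (int probes = 0; probes < 64; probes++) {
    uint64_t bit = 1ULL << s;
    if ((t.bitmap & bit) == 0) return false;  // clear bit: definitely absent
    // Number of occupied slots below s is the index into the packed array.
    int index = soft_popcount(t.bitmap & (bit - 1));
    if (t.packed[index] == id) return true;
    s = (s + 1) & 63;                         // collision: try the next slot
  }
  return false;
}
```

In this sketch a negative lookup usually ends at the first bitmap test, which is the Bloom-filter-like behaviour the description mentions; only a set bit costs the population count and an array load.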
Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: Working secondary supers table with -XX:-UsePopCountInstruction ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18309/files - new: https://git.openjdk.org/jdk/pull/18309/files/02b1837e..486336bd Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18309&range=14 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18309&range=13-14 Stats: 47 lines in 3 files changed: 35 ins; 8 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/18309.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18309/head:pull/18309 PR: https://git.openjdk.org/jdk/pull/18309 From aph at openjdk.org Mon Apr 15 12:39:47 2024 From: aph at openjdk.org (Andrew Haley) Date: Mon, 15 Apr 2024 12:39:47 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v15] In-Reply-To: References: Message-ID: On Mon, 15 Apr 2024 12:33:05 GMT, Andrew Haley wrote: >> This PR is a redesign of subtype checking. >> >> The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. >> >> So what's changed, so that the old design should be replaced? >> >> Firstly, the computers of today aren't the computers of twenty years ago. It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. >> >> The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces). Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. 
>> >> Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch. >> >> However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions. >> >> The solution >> ------------ >> >> We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. >> >> We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bitmap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. If the bit is set, we have to do a little arithmetic and then consult the hash table. >> >> It works like th... > > Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: > > Working secondary supers table with -XX:-UsePopCountInstruction So it turns out that even without a POPCNT instruction, this algorithm is still faster than the current linear search in all reasonable cases. I've pushed a change that uses a hand-coded population count.

                           soft popcount      current linear search
Lookup.testPositive01    1.537 ± 0.151     11.955 ±  0.011  ns/op
Lookup.testPositive02    1.535 ± 0.251     12.499 ±  0.029  ns/op
Lookup.testPositive03    2.279 ± 0.141     13.041 ±  0.010  ns/op
Lookup.testPositive04    1.819 ± 0.036     13.579 ±  0.051  ns/op
Lookup.testPositive05    1.537 ± 0.187     15.755 ±  0.079  ns/op
Lookup.testPositive06    3.121 ± 0.015     14.672 ±  0.016  ns/op
Lookup.testPositive07    2.270 ± 0.004     15.215 ±  0.006  ns/op
Lookup.testPositive08    4.445 ± 0.225     15.849 ±  2.889  ns/op
Lookup.testPositive09    3.122 ± 0.001     16.286 ±  8.472  ns/op
Lookup.testPositive10    1.820 ± 0.048     17.030 ± 14.481  ns/op
Lookup.testPositive16    7.228 ± 0.176     19.606 ±  1.130  ns/op
Lookup.testPositive20    3.503 ± 0.016     22.186 ±  4.257  ns/op
Lookup.testPositive30    6.672 ± 0.175     27.960 ±  0.381  ns/op
Lookup.testPositive32   12.031 ± 0.619     28.794 ±  0.107  ns/op
Lookup.testPositive40   12.502 ± 0.242     33.415 ±  0.024  ns/op
Lookup.testPositive50   24.355 ± 2.444     39.612 ± 15.678  ns/op
Lookup.testPositive60   41.386 ± 1.143     44.529 ±  1.298  ns/op
Lookup.testPositive63   68.823 ± 0.992     46.187 ±  0.042  ns/op
Lookup.testPositive64   48.325 ± 2.477     46.721 ±  0.198  ns/op

------------- PR Comment: https://git.openjdk.org/jdk/pull/18309#issuecomment-2056747930 From aph at openjdk.org Mon Apr 15 12:39:48 2024 From: aph at openjdk.org (Andrew Haley) Date: Mon, 15 Apr 2024 12:39:48 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v14] In-Reply-To: References: <3pJmRUuwQ_8y_uqDiaASd2YbpWOHv1MIWmhjTSL-Oj8=.677e4f4f-a0ea-4e35-aab8-d85ac42aa5ef@github.com> Message-ID: On Fri, 12 Apr 2024 18:01:22 GMT, Vladimir Kozlov wrote: >> Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: >> >> JDK-8180450: secondary_super_cache does not scale well > > src/hotspot/cpu/x86/stubRoutines_x86.hpp line 41: > >> 39: // Windows have more code to save/restore registers >> 40: _compiler_stubs_code_size = 20000 LP64_ONLY(+39000) WINDOWS_ONLY(+2000), >> 41: _final_stubs_code_size = 10000 LP64_ONLY(+20000) WINDOWS_ONLY(+2000) ZGC_ONLY(+24000) > > Do we still need it after you moved code to compiler stubs section? It's a bug in that the 20000 byte figure is an underestimate for what zgc stubs need. I could take it out for this patch, I guess, but it'd still be a bug. > src/hotspot/cpu/x86/vm_version_x86.cpp line 1786: > >> 1784: } >> 1785: FLAG_SET_DEFAULT(UseSecondarySupersTable, false); >> 1786: } > > No need this with other changes I suggest. Never mind, it's gone.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18309#discussion_r1565723607 PR Review Comment: https://git.openjdk.org/jdk/pull/18309#discussion_r1565724331 From mli at openjdk.org Mon Apr 15 13:00:44 2024 From: mli at openjdk.org (Hamlin Li) Date: Mon, 15 Apr 2024 13:00:44 GMT Subject: RFR: 8330094: RISC-V: Save and restore FCSR in the call stub In-Reply-To: References: Message-ID: On Mon, 15 Apr 2024 11:54:04 GMT, Fei Yang wrote: >> It's fair to claim that `csrw` has side effects which may be more detrimental to performance, side effects that `csrr` that doesn't have (and can't as it's only reading). >> >>> I didn't measure the difference. >> >> I don't know how that affects current hardware, but remember that current hardware are in-order CPUs which may not be impacted by many of these performance problems. > > Another thing that is worth considering is that the `fflags` (exception flags) contained in `fcsr` is very likely to change after the Java call which does floating point calculations. Should we replace read/write/comp of fcsr with frm only? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18758#discussion_r1565753690 From mdoerr at openjdk.org Mon Apr 15 13:12:50 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 15 Apr 2024 13:12:50 GMT Subject: RFR: 8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: References: <-XeYeJ0OEmauTYsEoSXxzRmQXSKMOLw87GSpqDnEmug=.5cb7e71f-fea6-4a84-8260-5f515d3d3810@github.com> Message-ID: On Mon, 15 Apr 2024 11:46:25 GMT, Magnus Ihse Bursie wrote: >> Should also remove the `#pragma alloca` in os_aix.cpp. > > It was too bad that I did not see and review this change in the makefiles. :-( > > While you guys could have gone either way, I strongly dislike the choice to include a redefinition in the makefiles. If this really should be done, we should introduced a new variable to carry such changes, instead of piggybacking it with the OS defines. 
:-( But, I don't think it should be done at all. > > There are several reasons why this is a inferior solution: > > 1) It does not follow prior examples. We have tried hard before not do things like this, but rather pass flags as defines (e.g. `-DREDEFINE_ALLOCA` had been better) > 2) It does not scale. If we start in effect allowing code in the command line, there is no clear limit anymore what should be placed in the source code files and what should be placed on the command line. > 3) It messes up command lines. Keeping command lines as short as reasonable possible is a goal we try to strive for. In this case, there is also the `'` inside them (which I don't understand why), which is just begging for quoting/escaping problems, making command lines hard to copy/paste, send to different systems (like logging) etc. > > I'd really like to see a follow-up PR that moves this away from the command line define and into a source code file instead. Can we unconditionally `#include ` in all files which use `alloca`? Or does that disturb any platform? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1565770190 From fyang at openjdk.org Mon Apr 15 13:16:42 2024 From: fyang at openjdk.org (Fei Yang) Date: Mon, 15 Apr 2024 13:16:42 GMT Subject: RFR: 8330094: RISC-V: Save and restore FCSR in the call stub In-Reply-To: References: Message-ID: <9z88boDGs1jKol0DwpV5kZDDmV_CbApvTRkL3zrq6aQ=.437bc93b-c8f5-4740-a99e-addc8e8346bc@github.com> On Mon, 15 Apr 2024 12:58:23 GMT, Hamlin Li wrote: >> Another thing that is worth considering is that the `fflags` (exception flags) contained in `fcsr` is very likely to change after the Java call which does floating point calculations. > > Should we replace read/write/comp of fcsr with frm only? Yeah, that make sense to me. I think `fflags` is safe to ignore here. 
(And you will also need to update this code comment in file: src/hotspot/cpu/riscv/stubGenerator_riscv.cpp if you do that: ` // -34 [ saved Floating-point Control and Status Register ] <--- sp_after_call`) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18758#discussion_r1565772009 From mli at openjdk.org Mon Apr 15 13:23:10 2024 From: mli at openjdk.org (Hamlin Li) Date: Mon, 15 Apr 2024 13:23:10 GMT Subject: RFR: 8330094: RISC-V: Save and restore FCSR in the call stub [v2] In-Reply-To: <9z88boDGs1jKol0DwpV5kZDDmV_CbApvTRkL3zrq6aQ=.437bc93b-c8f5-4740-a99e-addc8e8346bc@github.com> References: <9z88boDGs1jKol0DwpV5kZDDmV_CbApvTRkL3zrq6aQ=.437bc93b-c8f5-4740-a99e-addc8e8346bc@github.com> Message-ID: On Mon, 15 Apr 2024 13:11:40 GMT, Fei Yang wrote: >> Should we replace read/write/comp of fcsr with frm only? > > Yeah, that make sense to me. I think `fflags` is safe to ignore here. > (And you will also need to update this code comment in file: src/hotspot/cpu/riscv/stubGenerator_riscv.cpp if you do that: ` // -34 [ saved Floating-point Control and Status Register ] <--- sp_after_call`) Thanks for discussion, all updated. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18758#discussion_r1565784889 From mli at openjdk.org Mon Apr 15 13:23:10 2024 From: mli at openjdk.org (Hamlin Li) Date: Mon, 15 Apr 2024 13:23:10 GMT Subject: RFR: 8330094: RISC-V: Save and restore FCSR in the call stub [v2] In-Reply-To: References: Message-ID: > Hi, > Can you help to review this patch? > As discussed at https://github.com/openjdk/jdk/pull/17745#discussion_r1558783467, we should do the similar thing as [JDK-8319973](https://bugs.openjdk.org/browse/JDK-8319973) on aarch64. > Thanks! > > Tests running ... 
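For readers following the `fcsr` versus `frm` discussion in this thread, the register layout it relies on can be sketched as below. The field positions and rounding-mode encodings are those of the RISC-V floating-point extension; the helper names are hypothetical and this is not the stub generator's code.

```c++
#include <cassert>
#include <cstdint>

// Layout of the RISC-V fcsr register:
//   bits 4..0  fflags  accrued exception flags (NV, DZ, OF, UF, NX)
//   bits 7..5  frm     dynamic rounding mode
constexpr uint32_t FFLAGS_MASK = 0x1f;
constexpr uint32_t FRM_SHIFT   = 5;
constexpr uint32_t FRM_MASK    = 0x7;

// Rounding-mode encodings defined by the ISA.
enum RoundingMode : uint32_t { RNE = 0, RTZ = 1, RDN = 2, RUP = 3, RMM = 4 };

constexpr uint32_t fflags_of(uint32_t fcsr) { return fcsr & FFLAGS_MASK; }
constexpr uint32_t frm_of(uint32_t fcsr)    { return (fcsr >> FRM_SHIFT) & FRM_MASK; }

// Hypothetical helper: returns fcsr with only the frm field replaced,
// leaving the accrued exception flags untouched.
inline uint32_t with_frm(uint32_t fcsr, uint32_t mode) {
  assert(mode <= FRM_MASK);
  return (fcsr & ~(FRM_MASK << FRM_SHIFT)) | (mode << FRM_SHIFT);
}

// Hypothetical helper: a write is only needed when the mode actually differs.
constexpr bool needs_frm_write(uint32_t current_frm, uint32_t wanted_frm) {
  return current_frm != wanted_frm;
}
```

Because `fflags` accumulates as floating-point code runs, comparing or restoring the whole `fcsr` would almost always see a difference; working on `frm` alone avoids that.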
Hamlin Li has updated the pull request incrementally with two additional commits since the last revision: - update comment accordingly - skip setting frm when not needed ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18758/files - new: https://git.openjdk.org/jdk/pull/18758/files/75d27fae..0e7b2414 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18758&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18758&range=00-01 Stats: 16 lines in 1 file changed: 8 ins; 0 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/18758.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18758/head:pull/18758 PR: https://git.openjdk.org/jdk/pull/18758 From aph at openjdk.org Mon Apr 15 13:51:47 2024 From: aph at openjdk.org (Andrew Haley) Date: Mon, 15 Apr 2024 13:51:47 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v14] In-Reply-To: References: <3pJmRUuwQ_8y_uqDiaASd2YbpWOHv1MIWmhjTSL-Oj8=.677e4f4f-a0ea-4e35-aab8-d85ac42aa5ef@github.com> Message-ID: On Fri, 12 Apr 2024 17:52:10 GMT, Vladimir Kozlov wrote: >> Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: >> >> JDK-8180450: secondary_super_cache does not scale well > > src/hotspot/share/cds/filemap.hpp line 274: > >> 272: bool compressed_oops() const { return _compressed_oops; } >> 273: bool compressed_class_pointers() const { return _compressed_class_ptrs; } >> 274: bool use_secondary_supers_table() const { return _use_secondary_supers_table; } > > Do we really need this accessor which is used only in one place? @iwanowww , this one is yours. May I nuke this method? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18309#discussion_r1565833170 From pchilanomate at openjdk.org Mon Apr 15 14:27:01 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Mon, 15 Apr 2024 14:27:01 GMT Subject: RFR: 8330105: SharedRuntime::resolve* should respect interpreter-only mode In-Reply-To: References: Message-ID: <6rPq8e5Vt4lURibkFA-2gw_0L3rUsU-ds-iXF3zTGXE=.0823f0fe-2689-4d42-968d-f460b5bb941f@github.com> On Thu, 11 Apr 2024 13:50:25 GMT, Yudi Zheng wrote: > JavaThread::set_interp_only_mode may be called while a thread is blocked waiting for a JIT compilation to complete. When interpreter-only mode is set, we should dispatch to interpreter instead of the returned compiled code. This is the same initial fix I proposed for JDK-8302351 but which I later changed when stumbling upon some exception cases where we cannot just return the c2i adapter entry: method handle intrinsics and enterSpecial/doYield methods. For method handle intrinsics, _linkToNative doesn't have an interpreter version so the c2i will lead to a i2c and we will crash because we cannot cascade those. For the other method handle intrinsics, although there is an interpreter version, I found another issue where generate_method_handle_interpreter_entry() can throw an exception before we create the interpreter frame, which will lead to crashes when walking the stack. Regarding enterSpecial/doYield, those also lack an interpreter version as _linkToNative(although enterSpecial has a hack here), but they are not really an issue today because we cannot switch to interpreter only mode while resolving those methods. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18741#issuecomment-2056989452 From sgibbons at openjdk.org Mon Apr 15 14:30:15 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Mon, 15 Apr 2024 14:30:15 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v15] In-Reply-To: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: > This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. > > Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. > > Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). > > [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Change fill routines * Even more review comments * Re-write of atomic copy loops ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18555/files - new: https://git.openjdk.org/jdk/pull/18555/files/6e731c86..405e4e05 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=14 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=13-14 Stats: 251 lines in 1 file changed: 103 ins; 117 del; 31 mod Patch: https://git.openjdk.org/jdk/pull/18555.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18555/head:pull/18555 PR: https://git.openjdk.org/jdk/pull/18555 From pchilanomate at openjdk.org Mon Apr 15 14:53:12 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Mon, 15 Apr 2024 14:53:12 GMT Subject: RFR: 8329665: fatal error: memory leak: allocating without ResourceMark In-Reply-To: References: Message-ID: On 
Thu, 4 Apr 2024 16:23:50 GMT, Patricio Chilano Mateo wrote: > There are two places in Loom code that call f.oops_interpreted_do() to process oops in the stackChunk. Although not obvious this method seem to require to have a ResourceMark on scope and there are several contexts where these two are call where we don't have one. The reason why a ResourceMark is needed is because OopMapCache::compute_one_oop_map() might allocate from the resource area if _mask_size is > 4 * BitsPerWord, which depends on the amount of locals + expression stack of the corresponding method. But ~InterpreterOopMap already checks if the _bit_mask was allocated in the resource area and in that case it will free it. So the ResourceMark is not strictly needed except that in debug mode we will actually hit the assert if there is not one in scope when trying to allocate the _bit_mask. > > Thanks, > Patricio I updated the PR to move ResourceMark out of debug mode only. I've been running several benchmarks just to double check and as I expected I didn't found any issues. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18632#issuecomment-2057049044 From pchilanomate at openjdk.org Mon Apr 15 14:53:12 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Mon, 15 Apr 2024 14:53:12 GMT Subject: RFR: 8329665: fatal error: memory leak: allocating without ResourceMark [v2] In-Reply-To: References: Message-ID: > There are two places in Loom code that call f.oops_interpreted_do() to process oops in the stackChunk. Although not obvious this method seem to require to have a ResourceMark on scope and there are several contexts where these two are call where we don't have one. The reason why a ResourceMark is needed is because OopMapCache::compute_one_oop_map() might allocate from the resource area if _mask_size is > 4 * BitsPerWord, which depends on the amount of locals + expression stack of the corresponding method. 
But ~InterpreterOopMap already checks if the _bit_mask was allocated in the resource area and in that case it will free it. So the ResourceMark is not strictly needed except that in debug mode we will actually hit the assert if there is not one in scope when trying to allocate the _bit_mask. > > Thanks, > Patricio Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: take ResourceMark out of debug only ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18632/files - new: https://git.openjdk.org/jdk/pull/18632/files/33354c7c..1636b16c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18632&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18632&range=00-01 Stats: 15 lines in 3 files changed: 1 ins; 14 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18632.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18632/head:pull/18632 PR: https://git.openjdk.org/jdk/pull/18632 From aph at openjdk.org Mon Apr 15 15:08:31 2024 From: aph at openjdk.org (Andrew Haley) Date: Mon, 15 Apr 2024 15:08:31 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v16] In-Reply-To: References: Message-ID: > This PR is a redesign of subtype checking. > > The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. > > So what's changed, so that the old design should be replaced? > > Firstly, the computers of today aren't the computers of twenty years ago. It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. > > The most severe reported problem is to do with the "secondary supers cache". 
This is a 1-element per-class cache for interfaces (and arrays of interfaces). Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. > > Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch. > > However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions. > > The solution > ------------ > > We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. > > We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bimap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. If the bit is set, we have to do a little arithmetic and then consult the hash table. > > It works like this: > > > mov sub_klass, [& sub_klass-... 
Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: JDK-8180450: secondary_super_cache does not scale well ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18309/files - new: https://git.openjdk.org/jdk/pull/18309/files/486336bd..c43e9c6a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18309&range=15 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18309&range=14-15 Stats: 18 lines in 2 files changed: 9 ins; 9 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18309.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18309/head:pull/18309 PR: https://git.openjdk.org/jdk/pull/18309 From coleenp at openjdk.org Mon Apr 15 15:23:02 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 15 Apr 2024 15:23:02 GMT Subject: RFR: 8329665: fatal error: memory leak: allocating without ResourceMark [v2] In-Reply-To: References: Message-ID: On Mon, 15 Apr 2024 14:53:12 GMT, Patricio Chilano Mateo wrote: >> There are two places in Loom code that call f.oops_interpreted_do() to process oops in the stackChunk. Although not obvious this method seem to require to have a ResourceMark on scope and there are several contexts where these two are call where we don't have one. The reason why a ResourceMark is needed is because OopMapCache::compute_one_oop_map() might allocate from the resource area if _mask_size is > 4 * BitsPerWord, which depends on the amount of locals + expression stack of the corresponding method. But ~InterpreterOopMap already checks if the _bit_mask was allocated in the resource area and in that case it will free it. So the ResourceMark is not strictly needed except that in debug mode we will actually hit the assert if there is not one in scope when trying to allocate the _bit_mask. 
>> >> Thanks, >> Patricio > > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > take ResourceMark out of debug only This looks good. Thank you. ------------- Marked as reviewed by coleenp (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18632#pullrequestreview-2001407605 From azafari at openjdk.org Mon Apr 15 15:35:56 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Mon, 15 Apr 2024 15:35:56 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v6] In-Reply-To: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: > `MEMFLAGS flag` is used to hold/show the type of the memory regions in NMT. Each call of NMT API requires a search through the list of memory regions. > The Hotspot code reserves/commits/uncommits memory regions and later calls explicitly NMT API with a specific memory type (e.g., `mtGC`, `mtJavaHeap`) for that region. Therefore, there are two search in the list of regions per reserve/commit/uncommit operations, one for the operation and another for setting the type of the region. > When the memory type is passed in during reserve/commit/uncommit operations, NMT can use it and avoid the extra search for setting the memory type. > > Tests: tiers1-5 passed on linux-x64, macosx-aarch64 and windows-x64 for debug and non-debug builds. Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: mtNone is not used anymore as default for optional args. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/18745/files - new: https://git.openjdk.org/jdk/pull/18745/files/a3940639..930d2748 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18745&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18745&range=04-05 Stats: 109 lines in 19 files changed: 23 ins; 9 del; 77 mod Patch: https://git.openjdk.org/jdk/pull/18745.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18745/head:pull/18745 PR: https://git.openjdk.org/jdk/pull/18745 From azafari at openjdk.org Mon Apr 15 15:41:06 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Mon, 15 Apr 2024 15:41:06 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v2] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> <3id02uEHGujrMSIg0lKlstLRL0x2yTsn7lPWrmEwGBU=.747fa9c2-e782-462f-95ba-bc567944a502@github.com> Message-ID: On Fri, 12 Apr 2024 08:15:56 GMT, Stefan Karlsson wrote: >> Fixed. > > It still looks wrong. Should be fixed now. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1565994207 From azafari at openjdk.org Mon Apr 15 15:41:06 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Mon, 15 Apr 2024 15:41:06 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v4] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: On Fri, 12 Apr 2024 07:45:43 GMT, Stefan Karlsson wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> review comments applied. 
> > src/hotspot/share/nmt/virtualMemoryTracker.hpp line 307: > >> 305: >> 306: ReservedMemoryRegion(address base, size_t size, MEMFLAGS flag) : >> 307: VirtualMemoryRegion(base, size), _stack(NativeCallStack::empty_stack()), _flag(flag) { } > > The function above uses mtNone. I find that a bit dubious, but I understand that it is done to be able to write code like this: > > ReservedMemoryRegion* rmr = VirtualMemoryTracker::_reserved_regions->find(ReservedMemoryRegion(addr, size)); > > > Unfortunately, it opens up the door for people to accidentally use that version instead of this new version that you have written. Could we get rid of the version using mtNone somehow? > > The same question goes for the version above that, which has a "MEMFLAGS flag = mtNone". (GH doesn't allow me to comment on lines that you haven't changed) `mtNone` as default value is no longer valid. > src/hotspot/share/runtime/os.hpp line 511: > >> 509: // and is added to be used for implementation of -XX:AllocateHeapAt >> 510: static char* map_memory_to_file(size_t size, int fd, MEMFLAGS flag = mtNone); >> 511: static char* map_memory_to_file_aligned(size_t size, size_t alignment, int fd, MEMFLAGS flag); > > There are still a few mtNone usages in this file. `mtNone` as default value for optional arguments is removed from all function definitions. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1565995694 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1565997114 From azafari at openjdk.org Mon Apr 15 15:41:07 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Mon, 15 Apr 2024 15:41:07 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v4] In-Reply-To: <4eN_yJUIi_0MTBROX0yxeIZIYo4W3KNlBGGOSA3glI4=.8e6ec837-1cb3-414f-959c-86fb3e3c9907@github.com> References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> <4eN_yJUIi_0MTBROX0yxeIZIYo4W3KNlBGGOSA3glI4=.8e6ec837-1cb3-414f-959c-86fb3e3c9907@github.com> Message-ID: <8FDhtk1BZwnz60dyaHDmEzcgQUjX2bCGf8nn7nA5WrY=.eaccab67-6ba1-499d-946d-cb07383cf16b@github.com> On Fri, 12 Apr 2024 07:59:06 GMT, Thomas Stuefe wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> review comments applied. > > src/hotspot/share/runtime/os.hpp line 521: > >> 519: bool allow_exec = false, MEMFLAGS flags = mtNone); >> 520: static bool unmap_memory(char *addr, size_t bytes); >> 521: static void free_memory(char *addr, size_t bytes, size_t alignment_hint, MEMFLAGS flag); > > While looking at this, I noticed a couple of odd things about this function. I think it should be revised and I opened https://bugs.openjdk.org/browse/JDK-8330144. The result of that revision will be that we don't need MEMFLAGS, nor do would we need the alignment hint. > > But leave the MEMFLAGS in for now. If I happen to push that change first, you can adapt the change, if you push first I'll manage. OK. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1565997790 From mli at openjdk.org Mon Apr 15 15:41:27 2024 From: mli at openjdk.org (Hamlin Li) Date: Mon, 15 Apr 2024 15:41:27 GMT Subject: RFR: 8330266: RISC-V: Restore frm to RoundingMode::rne after JNI Message-ID: <6e_QPv6LVVN19HkQrQ2DyB_sXxqGqgwnclI2StdkeaY=.0537b78c-c2c1-4341-a914-f40f7117d72e@github.com> Hi, Can you help to review this patch? As discussed at: https://github.com/openjdk/jdk/pull/18758#pullrequestreview-1999982333, we'd better to do it. Similar thing is done on aarch64, https://bugs.openjdk.org/browse/JDK-8320892 Thanks ------------- Commit messages: - fix typo - Initial commit Changes: https://git.openjdk.org/jdk/pull/18785/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18785&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8330266 Stats: 25 lines in 5 files changed: 25 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18785.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18785/head:pull/18785 PR: https://git.openjdk.org/jdk/pull/18785 From aph-open at littlepinkcloud.com Mon Apr 15 15:42:22 2024 From: aph-open at littlepinkcloud.com (Andrew Haley) Date: Mon, 15 Apr 2024 16:42:22 +0100 Subject: Aarch64: optimation for doing remainder on AArch64 In-Reply-To: <067c71d9-b6fd-4684-8f0b-7071898d73b3.jinguojie.jgj@alibaba-inc.com> References: <22042763-dd67-41a5-a0b5-927afc50bf64@littlepinkcloud.com> <067c71d9-b6fd-4684-8f0b-7071898d73b3.jinguojie.jgj@alibaba-inc.com> Message-ID: On 4/15/24 10:45, Jin Guojie wrote: > Thank you if you could kindly review this patch again. If you can get a Github account and an OpenJDK account we can start to do that. The first thing for you to do is clone the OpenJDK repo into your own tree, then create a local branch, then create a PR. See the section https://openjdk.org/guide/#i-have-a-patch-what-do-i-do https://openjdk.org/guide/ -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. 
https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From aph-open at littlepinkcloud.com Mon Apr 15 15:46:22 2024 From: aph-open at littlepinkcloud.com (Andrew Haley) Date: Mon, 15 Apr 2024 16:46:22 +0100 Subject: Aarch64: optimation for doing remainder on AArch64 In-Reply-To: <067c71d9-b6fd-4684-8f0b-7071898d73b3.jinguojie.jgj@alibaba-inc.com> References: <22042763-dd67-41a5-a0b5-927afc50bf64@littlepinkcloud.com> <067c71d9-b6fd-4684-8f0b-7071898d73b3.jinguojie.jgj@alibaba-inc.com> Message-ID: <0716b8c6-91ed-4554-93ad-14b816eaf232@littlepinkcloud.com> On 4/15/24 10:45, Jin Guojie wrote: > In the source code directory of the jdk class library, we found multiple files that use the remainder operation. > > Some of them are listed as follows: Yes, but many of them don't actually do any division. Nonetheless, OK, a 15% speed improvement for integer modulo is worth having. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From azafari at openjdk.org Mon Apr 15 15:50:07 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Mon, 15 Apr 2024 15:50:07 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v4] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: <-5QgNzuuNNMIYopm2_6TQa09bBKILK8RInI-HnMYdJY=.282cd643-3e61-416f-bd16-2db958e5f69c@github.com> On Fri, 12 Apr 2024 07:33:51 GMT, Stefan Karlsson wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> review comments applied. 
> > src/hotspot/share/memory/virtualspace.cpp line 615: > >> 613: >> 614: ReservedHeapSpace::ReservedHeapSpace(size_t size, size_t alignment, size_t page_size, const char* heap_allocation_directory) : ReservedSpace() { >> 615: set_nmt_flag(mtJavaHeap); > > It seems odd that we only initialize the _nmt_flag when `size == 0`. Could this be done after that check? If not, why not? > > There's also a call to record_virtual_memory_type further down in the code. Why is that needed? Why isn't it enough to pass in the correct type to the os::reserve_memory call in the initialize function? Corrected. > src/hotspot/share/memory/virtualspace.cpp line 672: > >> 670: size_t rs_align, >> 671: size_t rs_page_size) : ReservedSpace() { >> 672: set_nmt_flag(mtCode); > > Why isn't this a part of the initialize call? This looks like a bug to me. `initialize` will call clear_members, which will undo this setting. `set_nmt_flag()`, `initialize` and `initialize_members` are changed according to the related comments. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1566006111 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1566009377 From azafari at openjdk.org Mon Apr 15 16:11:13 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Mon, 15 Apr 2024 16:11:13 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v7] In-Reply-To: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: <7TW9a7Vmnz0nIKq83rYx_VN13PXM9_9nD5iSMzGDfNw=.127fd0ff-ee60-40cf-9994-9a1e81bb5b27@github.com> > `MEMFLAGS flag` is used to hold/show the type of the memory regions in NMT. Each call of NMT API requires a search through the list of memory regions.
> The Hotspot code reserves/commits/uncommits memory regions and later calls explicitly NMT API with a specific memory type (e.g., `mtGC`, `mtJavaHeap`) for that region. Therefore, there are two search in the list of regions per reserve/commit/uncommit operations, one for the operation and another for setting the type of the region. > When the memory type is passed in during reserve/commit/uncommit operations, NMT can use it and avoid the extra search for setting the memory type. > > Tests: tiers1-5 passed on linux-x64, macosx-aarch64 and windows-x64 for debug and non-debug builds. Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: alignment in coding style changed. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18745/files - new: https://git.openjdk.org/jdk/pull/18745/files/930d2748..abcfcccd Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18745&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18745&range=05-06 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18745.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18745/head:pull/18745 PR: https://git.openjdk.org/jdk/pull/18745 From azafari at openjdk.org Mon Apr 15 16:11:15 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Mon, 15 Apr 2024 16:11:15 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v4] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: On Fri, 12 Apr 2024 07:06:50 GMT, Stefan Karlsson wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> review comments applied. 
> > src/hotspot/os/windows/os_windows.cpp line 3137: > >> 3135: // If reservation failed, return null >> 3136: if (p_buf == nullptr) return nullptr; >> 3137: MemTracker::record_virtual_memory_reserve((address)p_buf, size_of_reserve, CALLER_PC, mtInternal); > > I think that allocate_pages_individually should take a MEMFLAGS argument instead of using mtInternal here. It takes now. All corresponding functions in call hierarchy are also changed. > src/hotspot/os/windows/os_windows.cpp line 3198: > >> 3196: // the release. >> 3197: MemTracker::record_virtual_memory_reserve((address)p_buf, >> 3198: bytes_to_release, CALLER_PC, mtNone); > > I don't think we should ever use `mtNone` in code outside of the NMT code. If you follow my suggestion above that allocate_pages_individually should take a MEMFLAG arg, then it could be used here. Corrected. > src/hotspot/os/windows/os_windows.cpp line 3218: > >> 3216: MemTracker::record_virtual_memory_reserve_and_commit((address)p_buf, bytes, CALLER_PC); >> 3217: } else { >> 3218: MemTracker::record_virtual_memory_reserve((address)p_buf, bytes, CALLER_PC, mtNone); > > Use the correct MEMFLAG here instead of mtNone. Done. > src/hotspot/os/windows/os_windows.cpp line 3771: > >> 3769: if (!is_committed) { >> 3770: commit_memory_or_exit(addr, bytes, prot == MEM_PROT_RWX, >> 3771: "cannot commit protection page", mtNone); > > This should probably be something else than mtNone. Changed to `mtInternal`. When `protect_memory` is called with uncommitted area (`is_committed == false`), the flag is mtInternal. > src/hotspot/share/jfr/recorder/storage/jfrVirtualMemory.cpp line 107: > >> 105: _rs = ReservedSpace(reservation_size_request_bytes, >> 106: os::vm_allocation_granularity(), >> 107: os::vm_page_size(), mtTracing); > > The mtTracing should probably be on a separate line, so that it follows the style of the surrounding code. Done. 
> src/hotspot/share/memory/virtualspace.cpp line 45: > >> 43: // Dummy constructor >> 44: ReservedSpace::ReservedSpace() : _base(nullptr), _size(0), _noaccess_prefix(0), >> 45: _alignment(0), _special(false), _fd_for_heap(-1), _nmt_flag(mtNone), _executable(false) { > > In almost all code we pass in the executable before the flag, but in ReservedSpace the flag is located before the executable. I think it would be nice to flip the order in this class. I understand that _executable is in the private section, while the other members are protected, but I don't think that it needs to be that way. The _executable could probably just be moved together with the rest of the members. > > OTOH, I think the entire class needs some cleanups. Let's leave this for a separate RFE. Changed. > src/hotspot/share/memory/virtualspace.cpp line 708: > >> 706: assert(max_commit_granularity > 0, "Granularity must be non-zero."); >> 707: >> 708: _nmt_flag = rs.nmt_flag(); > > The code seems to be written with blank lines to separate various members that belong together. Please add a blank line after this line. Fixed and moved. > src/hotspot/share/memory/virtualspace.hpp line 199: > >> 197: size_t _upper_alignment; >> 198: >> 199: MEMFLAGS _nmt_flag; > > The VirtualSpace::initialize functions used to initialize these members in the order that they are specified here. That is now messed up by adding the _nmt_flag at the end here, but in the beginning in the initialize function. I would propose that you move it to after _executable, both here and in the initialize function. Fixed. > test/hotspot/gtest/gc/g1/test_freeRegionList.cpp line 53: > >> 51: size_t bot_size = G1BlockOffsetTable::compute_size(heap.word_size()); >> 52: HeapWord* bot_data = NEW_C_HEAP_ARRAY(HeapWord, bot_size, mtGC); >> 53: ReservedSpace bot_rs(G1BlockOffsetTable::compute_size(heap.word_size()), mtGC); > > mtGC => mtTest? Done. 
> test/hotspot/gtest/gc/z/test_zForwarding.cpp line 103: > >> 101: _reserved = reserved; >> 102: >> 103: os::commit_memory((char*)_reserved, ZGranuleSize, !ExecMem /* executable */, mtGC); > > mtGC => mtTest? Done. > test/hotspot/gtest/gc/z/test_zForwarding.cpp line 114: > >> 112: ZGeneration::_young = _old_young; >> 113: if (_reserved != nullptr) { >> 114: os::uncommit_memory((char*)_reserved, ZGranuleSize, !ExecMem, mtGC); > > mtGC => mtTest? Done. > test/hotspot/gtest/memory/test_virtualspace.cpp line 223: > >> 221: return ReservedSpace(reserve_size_aligned, >> 222: os::vm_allocation_granularity(), >> 223: os::vm_page_size(), mtTest); > > newline before mtTest. Done. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1566039719 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1566038228 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1566037774 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1566037525 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1566033618 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1566026381 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1566027896 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1566028800 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1566029979 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1566032851 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1566032647 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1566030330 From azafari at openjdk.org Mon Apr 15 16:11:15 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Mon, 15 Apr 2024 16:11:15 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v4] In-Reply-To: 
<4eN_yJUIi_0MTBROX0yxeIZIYo4W3KNlBGGOSA3glI4=.8e6ec837-1cb3-414f-959c-86fb3e3c9907@github.com> References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> <4eN_yJUIi_0MTBROX0yxeIZIYo4W3KNlBGGOSA3glI4=.8e6ec837-1cb3-414f-959c-86fb3e3c9907@github.com> Message-ID: On Fri, 12 Apr 2024 07:42:11 GMT, Thomas Stuefe wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> review comments applied. > > src/hotspot/share/memory/metaspace/testHelpers.cpp line 81: > >> 79: if (reserve_limit > 0) { >> 80: // have reserve limit -> non-expandable context >> 81: _rs = ReservedSpace(reserve_limit * BytesPerWord, Metaspace::reserve_alignment(), os::vm_page_size(), mtTest); > > mtMetaspace Done > src/hotspot/share/memory/metaspace/virtualSpaceNode.cpp line 112: > >> 110: >> 111: // Commit... >> 112: if (os::commit_memory((char*)p, word_size * BytesPerWord, !ExecMem, _rs.nmt_flag()) == false) { > > just use mtMetaspace here, its easier Done. > src/hotspot/share/memory/metaspace/virtualSpaceNode.cpp line 191: > >> 189: >> 190: // Uncommit... >> 191: if (os::uncommit_memory((char*)p, word_size * BytesPerWord, !ExecMem, _rs.nmt_flag()) == false) { > > mtMetaspace Done. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1566031812 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1566032265 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1566032016 From luhenry at openjdk.org Mon Apr 15 16:13:44 2024 From: luhenry at openjdk.org (Ludovic Henry) Date: Mon, 15 Apr 2024 16:13:44 GMT Subject: RFR: 8330094: RISC-V: Save and restore FCSR in the call stub [v2] In-Reply-To: References: Message-ID: <3XJeUy_e-cMDOI3q9weutZnIcIQvRDh9qIxzHETuDHU=.2e35e6c7-fdea-40a7-8c6d-34c82f12bdbe@github.com> On Mon, 15 Apr 2024 13:23:10 GMT, Hamlin Li wrote: >> Hi, >> Can you help to review this patch? 
>> As discussed at https://github.com/openjdk/jdk/pull/17745#discussion_r1558783467, we should do the similar thing as [JDK-8319973](https://bugs.openjdk.org/browse/JDK-8319973) on aarch64. >> Thanks! >> >> Tests running ... > > Hamlin Li has updated the pull request incrementally with two additional commits since the last revision: > > - update comment accordingly > - skip setting frm when not needed Changes requested by luhenry (Committer). src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 308: > 306: Label skip_fsrmi; > 307: __ mv(t1, __ RoundingMode::rne); > 308: __ beq(t0, t1, skip_fsrmi); You can take advantage of `RoundingMode::rne == 0` by doing `__ beq(t0, zr, skip_fsrmi)` and remove the above `__ mv(t1, __ RoundingMode::rne);`. Please add a `guarantee(__ RoundingMode::rne == 0)` as well, just to self-document. ------------- PR Review: https://git.openjdk.org/jdk/pull/18758#pullrequestreview-2001522371 PR Review Comment: https://git.openjdk.org/jdk/pull/18758#discussion_r1566037120 From yzheng at openjdk.org Mon Apr 15 16:15:01 2024 From: yzheng at openjdk.org (Yudi Zheng) Date: Mon, 15 Apr 2024 16:15:01 GMT Subject: RFR: 8330105: SharedRuntime::resolve* should respect interpreter-only mode In-Reply-To: <6rPq8e5Vt4lURibkFA-2gw_0L3rUsU-ds-iXF3zTGXE=.0823f0fe-2689-4d42-968d-f460b5bb941f@github.com> References: <6rPq8e5Vt4lURibkFA-2gw_0L3rUsU-ds-iXF3zTGXE=.0823f0fe-2689-4d42-968d-f460b5bb941f@github.com> Message-ID: On Mon, 15 Apr 2024 14:24:29 GMT, Patricio Chilano Mateo wrote: >> JavaThread::set_interp_only_mode may be called while a thread is blocked waiting for a JIT compilation to complete. When interpreter-only mode is set, we should dispatch to interpreter instead of the returned compiled code. > > This is the same initial fix I proposed for JDK-8302351 but which I later changed when stumbling upon some exception cases where we cannot just return the c2i adapter entry: method handle intrinsics and enterSpecial/doYield methods. 
> For method handle intrinsics, _linkToNative doesn't have an interpreter version so the c2i will lead to a i2c and we will crash because we cannot cascade those. For the other method handle intrinsics, although there is an interpreter version, I found another issue where generate_method_handle_interpreter_entry() can throw an exception before we create the interpreter frame, which will lead to crashes when walking the stack. > Regarding enterSpecial/doYield, those also lack an interpreter version as _linkToNative(although enterSpecial has a hack here), but they are not really an issue today because we cannot switch to interpreter only mode while resolving those methods. @pchilano how about we return c2i only if callee is not a method handle intrinsic? diff --git a/src/hotspot/share/runtime/sharedRuntime.cpp b/src/hotspot/share/runtime/sharedRuntime.cpp index 2b06859c96d..74d361a2b57 100644 --- a/src/hotspot/share/runtime/sharedRuntime.cpp +++ b/src/hotspot/share/runtime/sharedRuntime.cpp @@ -1489,7 +1489,7 @@ JRT_END // return verified_code_entry if interp_only_mode is not set for the current thread; // otherwise return c2i entry. address SharedRuntime::get_resolved_entry(JavaThread* current, methodHandle callee_method) { - if (current->is_interp_only_mode()) { + if (current->is_interp_only_mode() && !callee_method->is_method_handle_intrinsic()) { // In interp_only_mode we need to go to the interpreted entry // The c2i won't patch in this mode -- see fixup_callers_callsite return callee_method->get_c2i_entry(); Btw how did you stress test this? 
https://github.com/openjdk/jdk/pull/14108#issuecomment-1574091628 ------------- PR Comment: https://git.openjdk.org/jdk/pull/18741#issuecomment-2057229315 From luhenry at openjdk.org Mon Apr 15 16:15:43 2024 From: luhenry at openjdk.org (Ludovic Henry) Date: Mon, 15 Apr 2024 16:15:43 GMT Subject: RFR: 8330266: RISC-V: Restore frm to RoundingMode::rne after JNI In-Reply-To: <6e_QPv6LVVN19HkQrQ2DyB_sXxqGqgwnclI2StdkeaY=.0537b78c-c2c1-4341-a914-f40f7117d72e@github.com> References: <6e_QPv6LVVN19HkQrQ2DyB_sXxqGqgwnclI2StdkeaY=.0537b78c-c2c1-4341-a914-f40f7117d72e@github.com> Message-ID: On Mon, 15 Apr 2024 15:37:06 GMT, Hamlin Li wrote: > Hi, > Can you help to review this patch? > As discussed at: https://github.com/openjdk/jdk/pull/18758#pullrequestreview-1999982333, we'd better to do it. > Similar thing is done on aarch64, https://bugs.openjdk.org/browse/JDK-8320892 > > Thanks Changes requested by luhenry (Committer). src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 1177: > 1175: // Set FRM to the state we need. We do want Round to Nearest. We > 1176: // don't want non-IEEE rounding modes. > 1177: beq(tmp1, tmp2, skip_fsrmi); // Only reset FRM if it's wrong Same as https://github.com/openjdk/jdk/pull/18758#discussion_r1566037120, you can take advantage of `RoundingMode::rne == 0` by doing `beq(tmp1, zr, skip_fsrmi);` and adding a `guarantee(RoundingMode::rne == 0);` right above for self-documentation. 
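To illustrate the "only reset FRM if it's wrong" pattern discussed in the two RISC-V reviews, here is a small portable analogue using the standard C++ floating-point environment instead of the RISC-V `frm` register. This is a sketch under assumptions, not the HotSpot stub code: `restore_round_to_nearest` is a hypothetical helper, and `<cfenv>` rounding modes stand in for `RoundingMode::rne`.

```cpp
#include <cfenv>

// Hypothetical helper, not HotSpot code: put the rounding mode back to
// round-to-nearest, but skip the write when it is already correct.
// Returns true if the mode actually had to be reset.
bool restore_round_to_nearest() {
  if (std::fegetround() == FE_TONEAREST) {
    return false;  // already round-to-nearest, nothing to do
  }
  std::fesetround(FE_TONEAREST);
  return true;
}
```

The review suggestion is the assembler-level version of the same check: because `RoundingMode::rne` is 0, the comparison can be made directly against the zero register, and the requested `guarantee` documents that assumption at the point of use.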
------------- PR Review: https://git.openjdk.org/jdk/pull/18785#pullrequestreview-2001532917 PR Review Comment: https://git.openjdk.org/jdk/pull/18785#discussion_r1566044382 From azafari at openjdk.org Mon Apr 15 16:16:44 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Mon, 15 Apr 2024 16:16:44 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v5] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: <1zx5BUSqZfy81_KftcRahy0vtrBYXBDvZPxOApOqWcs=.39fd3462-221f-4a07-a043-6e2bc4dc918e@github.com> On Sat, 13 Apr 2024 05:38:11 GMT, Thomas Stuefe wrote: > Just a thought: one (manual) test I would do would be that several JVMs run with the same conditions (I would do at least one with -Xmx=Xms and AlwaysPreTouch) accumulate the same NMT numbers, current, and peak. Just to make sure we use the same flags before and after. I understand your idea as: run JVM with options to show virtual memory report, one for master branch (before this PR) and one using the PR's branch. Then it is expected that the reports show the same numbers. Right? Nice idea. Will do it. Thanks. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18745#issuecomment-2057232308 From shade at openjdk.org Mon Apr 15 16:22:43 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 15 Apr 2024 16:22:43 GMT Subject: RFR: 8329665: fatal error: memory leak: allocating without ResourceMark [v2] In-Reply-To: References: Message-ID: On Mon, 15 Apr 2024 14:53:12 GMT, Patricio Chilano Mateo wrote: >> There are two places in Loom code that call f.oops_interpreted_do() to process oops in the stackChunk. Although not obvious this method seem to require to have a ResourceMark on scope and there are several contexts where these two are call where we don't have one. 
The reason why a ResourceMark is needed is because OopMapCache::compute_one_oop_map() might allocate from the resource area if _mask_size is > 4 * BitsPerWord, which depends on the amount of locals + expression stack of the corresponding method. But ~InterpreterOopMap already checks if the _bit_mask was allocated in the resource area and in that case it will free it. So the ResourceMark is not strictly needed except that in debug mode we will actually hit the assert if there is not one in scope when trying to allocate the _bit_mask. >> >> Thanks, >> Patricio > > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > take ResourceMark out of debug only src/hotspot/share/interpreter/oopMapCache.cpp line 184: > 182: } > 183: > 184: InterpreterOopMap::~InterpreterOopMap() { Question: If we remove this opportunistic cleanup of `_bit_mask`, does it mean we might introduce memory inefficiencies in cases where `InterpreterOopMap` is not covered by close-by `ResourceMark`? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18632#discussion_r1566060609 From mli at openjdk.org Mon Apr 15 16:25:14 2024 From: mli at openjdk.org (Hamlin Li) Date: Mon, 15 Apr 2024 16:25:14 GMT Subject: RFR: 8330266: RISC-V: Restore frm to RoundingMode::rne after JNI [v2] In-Reply-To: <6e_QPv6LVVN19HkQrQ2DyB_sXxqGqgwnclI2StdkeaY=.0537b78c-c2c1-4341-a914-f40f7117d72e@github.com> References: <6e_QPv6LVVN19HkQrQ2DyB_sXxqGqgwnclI2StdkeaY=.0537b78c-c2c1-4341-a914-f40f7117d72e@github.com> Message-ID: <0Y-y04yv-vcYPW5lYbwQbCNEmuwva8vCqLLBEZw9bs8=.141dedbc-b2c7-40bf-ad08-744720dca8ec@github.com> > Hi, > Can you help to review this patch? > As discussed at: https://github.com/openjdk/jdk/pull/18758#pullrequestreview-1999982333, we'd better to do it. 
> Similar thing is done on aarch64, https://bugs.openjdk.org/browse/JDK-8320892 > > Thanks Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: refine code ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18785/files - new: https://git.openjdk.org/jdk/pull/18785/files/096bd0de..59a488d7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18785&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18785&range=00-01 Stats: 9 lines in 5 files changed: 1 ins; 1 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/18785.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18785/head:pull/18785 PR: https://git.openjdk.org/jdk/pull/18785 From mli at openjdk.org Mon Apr 15 16:30:14 2024 From: mli at openjdk.org (Hamlin Li) Date: Mon, 15 Apr 2024 16:30:14 GMT Subject: RFR: 8330094: RISC-V: Save and restore FCSR in the call stub [v3] In-Reply-To: References: Message-ID: > Hi, > Can you help to review this patch? > As discussed at https://github.com/openjdk/jdk/pull/17745#discussion_r1558783467, we should do the similar thing as [JDK-8319973](https://bugs.openjdk.org/browse/JDK-8319973) on aarch64. > Thanks! > > Tests running ... 
Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: refine code ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18758/files - new: https://git.openjdk.org/jdk/pull/18758/files/0e7b2414..7da4b991 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18758&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18758&range=01-02 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/18758.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18758/head:pull/18758 PR: https://git.openjdk.org/jdk/pull/18758 From mli at openjdk.org Mon Apr 15 16:30:14 2024 From: mli at openjdk.org (Hamlin Li) Date: Mon, 15 Apr 2024 16:30:14 GMT Subject: RFR: 8330094: RISC-V: Save and restore FCSR in the call stub [v2] In-Reply-To: <3XJeUy_e-cMDOI3q9weutZnIcIQvRDh9qIxzHETuDHU=.2e35e6c7-fdea-40a7-8c6d-34c82f12bdbe@github.com> References: <3XJeUy_e-cMDOI3q9weutZnIcIQvRDh9qIxzHETuDHU=.2e35e6c7-fdea-40a7-8c6d-34c82f12bdbe@github.com> Message-ID: On Mon, 15 Apr 2024 16:07:17 GMT, Ludovic Henry wrote: >> Hamlin Li has updated the pull request incrementally with two additional commits since the last revision: >> >> - update comment accordingly >> - skip setting frm when not needed > > src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 308: > >> 306: Label skip_fsrmi; >> 307: __ mv(t1, __ RoundingMode::rne); >> 308: __ beq(t0, t1, skip_fsrmi); > > You can take advantage of `RoundingMode::rne == 0` by doing `__ beq(t0, zr, skip_fsrmi)` and remove the above `__ mv(t1, __ RoundingMode::rne);`. Please add a `guarantee(__ RoundingMode::rne == 0)` as well, just to self-document. Thanks, fixed. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18758#discussion_r1566084779 From mli at openjdk.org Mon Apr 15 16:30:00 2024 From: mli at openjdk.org (Hamlin Li) Date: Mon, 15 Apr 2024 16:30:00 GMT Subject: RFR: 8330266: RISC-V: Restore frm to RoundingMode::rne after JNI [v2] In-Reply-To: References: <6e_QPv6LVVN19HkQrQ2DyB_sXxqGqgwnclI2StdkeaY=.0537b78c-c2c1-4341-a914-f40f7117d72e@github.com> Message-ID: On Mon, 15 Apr 2024 16:12:35 GMT, Ludovic Henry wrote: >> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: >> >> refine code > > src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 1177: > >> 1175: // Set FRM to the state we need. We do want Round to Nearest. We >> 1176: // don't want non-IEEE rounding modes. >> 1177: beq(tmp1, tmp2, skip_fsrmi); // Only reset FRM if it's wrong > > Same as https://github.com/openjdk/jdk/pull/18758#discussion_r1566037120, you can take advantage of `RoundingMode::rne == 0` by doing `beq(tmp1, zr, skip_fsrmi);` and adding a `guarantee(RoundingMode::rne == 0);` right above for self-documentation. Thanks, fixed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18785#discussion_r1566084422 From snazarki at openjdk.org Mon Apr 15 16:34:01 2024 From: snazarki at openjdk.org (Sergey Nazarkin) Date: Mon, 15 Apr 2024 16:34:01 GMT Subject: RFR: 8330171: Lazy W^X switch implementation In-Reply-To: References: <9eymaXovxUNFdkAkzojFQP5trwl_yyY0jE2GzcMEjR4=.02ee2ef9-c476-4c7c-9e4a-e021425c38bc@github.com> Message-ID: On Mon, 15 Apr 2024 08:33:25 GMT, Bernhard Urban-Forster wrote: > do you have numbers on how many transitions are done with your PR vs. the current state when running the same program? With just a simple **java -version** it is ~180 vs ~9500 (new vs old), for **java -help** ~1120 vs ~86300. For the applications the ratio is about the same.
------------- PR Comment: https://git.openjdk.org/jdk/pull/18762#issuecomment-2057280998 From kvn at openjdk.org Mon Apr 15 17:01:59 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 15 Apr 2024 17:01:59 GMT Subject: RFR: 8330253: Skip verify_consistent_lock_order when deoptimizing from monitorenter bytecode. In-Reply-To: References: Message-ID: On Mon, 15 Apr 2024 09:15:44 GMT, Axel Boldt-Christmas wrote: > The verification added in [JDK-8329757](https://bugs.openjdk.org/browse/JDK-8329757) will not work then deoptimization occurs on a monitorenter bytecode. The locking may be in a transitional state. This patch will skip the verification when this occurs. > > Currently have only seen this reproduce with JVMTI when deoptimization occurs while a java thread is waiting on a contended monitor. However this could potentially be triggered from a VM entry slow path, so simply checking `current_pending_monitor` could be flaky as well. So instead simply avoid verification. > > Running JVMTI reproducer. Starting full testing soon. src/hotspot/share/runtime/deoptimization.cpp line 443: > 441: } > 442: #ifdef ASSERT > 443: if (LockingMode == LM_LIGHTWEIGHT && !realloc_failures && lock_order.is_nonempty()) { I think ` lock_order.is_nonempty()` check is enough here because it will not be empty only if other two conditions are true. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18782#discussion_r1566164427 From kvn at openjdk.org Mon Apr 15 17:14:45 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 15 Apr 2024 17:14:45 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v16] In-Reply-To: References: Message-ID: <3fkFno_cVUgduiyJTYsDfZAYseIO7lllrSUl6949lno=.50f45324-fb13-4ada-a4d7-14fe5de87b69@github.com> On Mon, 15 Apr 2024 15:08:31 GMT, Andrew Haley wrote: >> This PR is a redesign of subtype checking. >> >> The implementation of subtype checking in the HotSpot JVM is now twenty years old. 
There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. >> >> So what's changed, so that the old design should be replaced? >> >> Firstly, the computers of today aren't the computers of twenty years ago. It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. >> >> The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces). Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. >> >> Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch. >> >> However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions. >> >> The solution >> ------------ >> >> We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. >> >> We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. 
As well as allowing the hash table to be decompressed, this bitmap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. If the bit is set, we have to do a little arithmetic and then consult the hash table. >> >> It works like th... > > Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: > > JDK-8180450: secondary_super_cache does not scale well Seems fine now. ------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18309#pullrequestreview-2001743934 From aph at openjdk.org Mon Apr 15 17:18:41 2024 From: aph at openjdk.org (Andrew Haley) Date: Mon, 15 Apr 2024 17:18:41 GMT Subject: RFR: 8325821: [REDO] use "dmb.ishst+dmb.ishld" for release barrier [v7] In-Reply-To: References: Message-ID: On Fri, 12 Apr 2024 08:00:20 GMT, kuaiwei wrote: >> The original patch for https://bugs.openjdk.org/browse/JDK-8324186 has 2 issues: >> 1 It shows a regression on some platforms, like Apple silicon on macOS >> 2 It cannot handle instruction sequences like "dmb.ishld; dmb.ishst; dmb.ishld; dmb.ishld" >> >> It can be fixed by: >> 1 Enable AlwaysMergeDMB by default, and only disable it on architectures where we can see a performance improvement (N1 or N2) >> 2 Check the special pattern and merge the subsequent dmb. >> >> It also fixes a bug: when the code buffer is expanding, st/ld/dmb cannot be merged. I added unit tests for these. >> >> This patch still has an unhandled case. For instructions like "dmb.ishld; dmb.ishst; dmb.ish", it will merge the last 2 instructions and cannot merge all three, because when emitting dmb.ish, merging all previous dmbs would shrink the code buffer. I think it may break some resumption and think it's not a common pattern.
>> >> - Update: >> After discussion, I made a new implementation based on a finite state machine for merging instructions. A mergeable instruction will be pending in the FSM until the next unmergeable instruction. > > kuaiwei has updated the pull request incrementally with one additional commit since the last revision: > > Fix arm build error Hi, I guess this isn't quite ready for review yet. I'll have another look when it is. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18467#issuecomment-2057431210 From pchilanomate at openjdk.org Mon Apr 15 17:29:01 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Mon, 15 Apr 2024 17:29:01 GMT Subject: RFR: 8330105: SharedRuntime::resolve* should respect interpreter-only mode In-Reply-To: <6rPq8e5Vt4lURibkFA-2gw_0L3rUsU-ds-iXF3zTGXE=.0823f0fe-2689-4d42-968d-f460b5bb941f@github.com> References: <6rPq8e5Vt4lURibkFA-2gw_0L3rUsU-ds-iXF3zTGXE=.0823f0fe-2689-4d42-968d-f460b5bb941f@github.com> Message-ID: On Mon, 15 Apr 2024 14:24:29 GMT, Patricio Chilano Mateo wrote: >> JavaThread::set_interp_only_mode may be called while a thread is blocked waiting for a JIT compilation to complete. When interpreter-only mode is set, we should dispatch to interpreter instead of the returned compiled code. > > This is the same initial fix I proposed for JDK-8302351 but which I later changed when stumbling upon some exception cases where we cannot just return the c2i adapter entry: method handle intrinsics and enterSpecial/doYield methods. > For method handle intrinsics, _linkToNative doesn't have an interpreter version so the c2i will lead to an i2c and we will crash because we cannot cascade those. For the other method handle intrinsics, although there is an interpreter version, I found another issue where generate_method_handle_interpreter_entry() can throw an exception before we create the interpreter frame, which will lead to crashes when walking the stack.
> Regarding enterSpecial/doYield, those also lack an interpreter version as _linkToNative(although enterSpecial has a hack here), but they are not really an issue today because we cannot switch to interpreter only mode while resolving those methods. > @pchilano how about we return c2i only if callee is not a method handle intrinsic? > > ``` > diff --git a/src/hotspot/share/runtime/sharedRuntime.cpp b/src/hotspot/share/runtime/sharedRuntime.cpp > index 2b06859c96d..74d361a2b57 100644 > --- a/src/hotspot/share/runtime/sharedRuntime.cpp > +++ b/src/hotspot/share/runtime/sharedRuntime.cpp > @@ -1489,7 +1489,7 @@ JRT_END > // return verified_code_entry if interp_only_mode is not set for the current thread; > // otherwise return c2i entry. > address SharedRuntime::get_resolved_entry(JavaThread* current, methodHandle callee_method) { > - if (current->is_interp_only_mode()) { > + if (current->is_interp_only_mode() && !callee_method->is_method_handle_intrinsic()) { > // In interp_only_mode we need to go to the interpreted entry > // The c2i won't patch in this mode -- see fixup_callers_callsite > return callee_method->get_c2i_entry(); > ``` > Sounds good. We can use is_special_native_intrinsic() instead just to avoid possible issues when testing Continuations alone. We can leave 8218403 open to fix these remaining cases. > Btw how did you stress test this? [#14108 (comment)](https://github.com/openjdk/jdk/pull/14108#issuecomment-1574091628) > To see the issue about the cascading c2i -> i2c you can just always return the c2i entry. For the other issue I mentioned you can set `current->set_interp_only_mode(true);` upon entry in get_resolved_entry() to always return the c2i entry (exclude _linkToNative) and then remove the JvmtiExport::can_post_interpreter_events() check in jump_from_method_handle() (otherwise we might end up in the same cascading issue). 
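The entry selection proposed for `get_resolved_entry` earlier in this thread can be modelled in isolation as follows. This is a hedged sketch only: `ThreadState`, `CalleeInfo`, `Entry` and `select_entry` are hypothetical stand-ins invented for illustration and are not HotSpot types; the real code returns addresses from the callee `Method`.

```cpp
// Hypothetical stand-ins for the thread and callee state discussed above.
struct ThreadState {
  bool interp_only_mode;            // models current->is_interp_only_mode()
};

struct CalleeInfo {
  bool method_handle_intrinsic;     // models callee_method->is_method_handle_intrinsic()
};

enum class Entry { verified_code, c2i };

// Return the c2i entry only when the thread is in interpreter-only mode and
// the callee is not a method handle intrinsic. Intrinsics keep their compiled
// entry, since some of them (for example _linkToNative) have no interpreter
// version and a c2i entry would cascade into an i2c.
Entry select_entry(const ThreadState& thread, const CalleeInfo& callee) {
  if (thread.interp_only_mode && !callee.method_handle_intrinsic) {
    return Entry::c2i;
  }
  return Entry::verified_code;
}
```

The reply above suggests using `is_special_native_intrinsic()` instead of the method handle check; in this model that would only change how the boolean on `CalleeInfo` is computed.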
------------- PR Comment: https://git.openjdk.org/jdk/pull/18741#issuecomment-2057449313 From pchilanomate at openjdk.org Mon Apr 15 17:39:00 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Mon, 15 Apr 2024 17:39:00 GMT Subject: RFR: 8329665: fatal error: memory leak: allocating without ResourceMark [v2] In-Reply-To: References: Message-ID: On Mon, 15 Apr 2024 16:19:41 GMT, Aleksey Shipilev wrote: >> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: >> >> take ResourceMark out of debug only > > src/hotspot/share/interpreter/oopMapCache.cpp line 184: > >> 182: } >> 183: >> 184: InterpreterOopMap::~InterpreterOopMap() { > > Question: If we remove this opportunistic cleanup of `_bit_mask`, does it mean we might introduce memory inefficiencies in cases where `InterpreterOopMap` is not covered by close-by `ResourceMark`? In theory yes, although I doubt it is an actual issue because allocating in the resource area for this _bit_mask field is a rare case, and the memory allocated will be a few bytes. But I guess we can keep the eager cleanup just in case since it doesn't hurt. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18632#discussion_r1566212635 From never at openjdk.org Mon Apr 15 17:55:54 2024 From: never at openjdk.org (Tom Rodriguez) Date: Mon, 15 Apr 2024 17:55:54 GMT Subject: RFR: 8317368: [JVMCI] SIGSEGV in JVMCIEnv::initialize_installed_code on libgraal [v4] In-Reply-To: References: Message-ID: > This fixes some lurking issues with JVMCI and nmethod related both BarrierSetNMethod and the garbage collection of nmethods. In particular the stack walking in c2v_iterateFrames visits many frames and needs a KeepStackGCProcessedMark for safety. Additionally, JVMCI interacts with nmethods in complex ways and needs some sort of strong root during these interactions. A new JavaThread field has been added that mirrors the way JVMTI keeps nmethods alive. 
Tom Rodriguez has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: - Merge branch 'master' into tkr-nmethod-keep-alive - Comment updates - Merge branch 'master' into tkr-nmethod-keep-alive - Move BarrierSetNMethod call into JVMCINMethodHandle::set_nmethod - 8317368: [JVMCI] SIGSEGV in JVMCIEnv::initialize_installed_code on libgraal ------------- Changes: https://git.openjdk.org/jdk/pull/17714/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17714&range=03 Stats: 94 lines in 7 files changed: 73 ins; 8 del; 13 mod Patch: https://git.openjdk.org/jdk/pull/17714.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17714/head:pull/17714 PR: https://git.openjdk.org/jdk/pull/17714 From never at openjdk.org Mon Apr 15 17:55:55 2024 From: never at openjdk.org (Tom Rodriguez) Date: Mon, 15 Apr 2024 17:55:55 GMT Subject: RFR: 8317368: [JVMCI] SIGSEGV in JVMCIEnv::initialize_installed_code on libgraal [v3] In-Reply-To: References: Message-ID: On Tue, 6 Feb 2024 16:25:04 GMT, Tom Rodriguez wrote: >> This fixes some lurking issues with JVMCI and nmethod related both BarrierSetNMethod and the garbage collection of nmethods. In particular the stack walking in c2v_iterateFrames visits many frames and needs a KeepStackGCProcessedMark for safety. Additionally, JVMCI interacts with nmethods in complex ways and needs some sort of strong root during these interactions. A new JavaThread field has been added that mirrors the way JVMTI keeps nmethods alive. > > Tom Rodriguez has updated the pull request incrementally with one additional commit since the last revision: > > Comment updates I resync'ed with master and mach5 testing was clean. I'll ping some likely reviewers. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17714#issuecomment-2057493068 From hohensee at amazon.com Mon Apr 15 17:56:35 2024 From: hohensee at amazon.com (Hohensee, Paul) Date: Mon, 15 Apr 2024 17:56:35 +0000 Subject: CFV: New HotSpot Group Member: Andrew Dinn Message-ID: <257FD56D-830D-4AD7-8457-04B4DD3C9731@amazon.com> Vote: yes From: hotspot-dev on behalf of Thomas Stuefe Date: Thursday, April 11, 2024 at 6:25 AM To: "hotspot-dev at openjdk.org" Subject: CFV: New HotSpot Group Member: Andrew Dinn Hi, I hereby nominate Andrew Dinn (adinn) to Membership in the HotSpot Group. Andrew is a well-known and respected member of the OpenJDK community. He has been a contributor since the early days of OpenJDK. The history of his contributions has been mangled by various SCM moves and repo consolidations over the years [1], but he was one of the original authors of the arm64 port ([2] shows 359 changes in the mercurial hotspot sub repository alone), contributed JEP 352 (support for NVM devices under byte buffers), and more recently has been active in the Graal and the Leyden projects. Votes are due by April 25, 2024. Only current Members of the HotSpot Group [3] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. For Lazy Consensus voting instructions, see [4]. Cheers, Thomas [1] https://github.com/openjdk/jdk/commits/master/?author=adinn [2] https://hg.openjdk.org/aarch64-port/jdk7u/hotspot [3] https://openjdk.org/census#members [4] https://openjdk.org/groups/#member-vote -------------- next part -------------- An HTML attachment was scrubbed...
URL: From sgibbons at openjdk.org Mon Apr 15 18:00:27 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Mon, 15 Apr 2024 18:00:27 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v16] In-Reply-To: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: > This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. > > Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. > > Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). > > [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Rename UnsafeCopyMemory{,Mark} to UnsafeMemory{Access,Mark} (#19) * Even more review comments * Re-write of atomic copy loops * Change name of UnsafeCopyMemory{,Mark} to UnsafeMemory{Access,Mark} ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18555/files - new: https://git.openjdk.org/jdk/pull/18555/files/405e4e05..95b0a345 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=15 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=14-15 Stats: 359 lines in 18 files changed: 0 ins; 184 del; 175 mod Patch: https://git.openjdk.org/jdk/pull/18555.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18555/head:pull/18555 PR: https://git.openjdk.org/jdk/pull/18555 From kvn at openjdk.org Mon Apr 15 18:09:05 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 15 Apr 2024 18:09:05 GMT Subject: RFR: 8317368: [JVMCI] SIGSEGV in JVMCIEnv::initialize_installed_code on libgraal [v4] In-Reply-To: 
References: Message-ID: On Mon, 15 Apr 2024 17:55:54 GMT, Tom Rodriguez wrote: >> This fixes some lurking issues with JVMCI and nmethod related both BarrierSetNMethod and the garbage collection of nmethods. In particular the stack walking in c2v_iterateFrames visits many frames and needs a KeepStackGCProcessedMark for safety. Additionally, JVMCI interacts with nmethods in complex ways and needs some sort of strong root during these interactions. A new JavaThread field has been added that mirrors the way JVMTI keeps nmethods alive. > > Tom Rodriguez has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: > > - Merge branch 'master' into tkr-nmethod-keep-alive > - Comment updates > - Merge branch 'master' into tkr-nmethod-keep-alive > - Move BarrierSetNMethod call into JVMCINMethodHandle::set_nmethod > - 8317368: [JVMCI] SIGSEGV in JVMCIEnv::initialize_installed_code on libgraal Seems fine. Maybe we should consider a specialized `jvmciJavaThread` subclass to keep all JVMCI fields instead of having them in `JavaThread` even when Graal is not used. ------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17714#pullrequestreview-2001846563 From sgibbons at openjdk.org Mon Apr 15 18:14:28 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Mon, 15 Apr 2024 18:14:28 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v17] In-Reply-To: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: > This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. > > Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes.
> > Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). > > [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) Scott Gibbons has updated the pull request incrementally with three additional commits since the last revision: - Set memory test (#22) * Even more review comments * Re-write of atomic copy loops * Change name of UnsafeCopyMemory{,Mark} to UnsafeMemory{Access,Mark} * Only add a memory mark for byte unaligned fill - Set memory test (#21) * Even more review comments * Re-write of atomic copy loops * Change name of UnsafeCopyMemory{,Mark} to UnsafeMemory{Access,Mark} * Only add a memory mark for byte unaligned fill - Only add a memory mark for byte unaligned fill * Even more review comments * Re-write of atomic copy loops * Change name of UnsafeCopyMemory{,Mark} to UnsafeMemory{Access,Mark} * Only add a memory mark for byte unaligned fill ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18555/files - new: https://git.openjdk.org/jdk/pull/18555/files/95b0a345..80b5a0ca Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=16 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=15-16 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18555.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18555/head:pull/18555 PR: https://git.openjdk.org/jdk/pull/18555 From kvn at openjdk.org Mon Apr 15 18:15:00 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 15 Apr 2024 18:15:00 GMT Subject: RFR: 8317368: [JVMCI] SIGSEGV in JVMCIEnv::initialize_installed_code on libgraal [v3] In-Reply-To: References: Message-ID: On Mon, 15 Apr 2024 17:52:50 GMT, Tom Rodriguez wrote: >> Tom Rodriguez has updated the pull request incrementally with one additional commit since the last revision: >> >> Comment updates > > I resync'ed with master and mach5 testing was clean. 
I'll ping some likely reviewers. @tkrodriguez did you consider an nmethod's state similar to `not_installed` to avoid this issue? ------------- PR Comment: https://git.openjdk.org/jdk/pull/17714#issuecomment-2057524350 From sgibbons at openjdk.org Mon Apr 15 18:26:29 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Mon, 15 Apr 2024 18:26:29 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v18] In-Reply-To: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: <2VVnuafwN24kMcJV42NKZNskaYtefC4tqKLf6So5D_E=.2f46075b-a5f1-4f09-80fa-60f0aa28511c@github.com> > This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. > > Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. > > Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). 
> > [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Set memory test (#23) * Even more review comments * Re-write of atomic copy loops * Change name of UnsafeCopyMemory{,Mark} to UnsafeMemory{Access,Mark} * Only add a memory mark for byte unaligned fill * Remove MUSL_LIBC ifdef * Remove MUSL_LIBC ifdef ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18555/files - new: https://git.openjdk.org/jdk/pull/18555/files/80b5a0ca..856464e9 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=17 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=16-17 Stats: 50 lines in 1 file changed: 0 ins; 49 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18555.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18555/head:pull/18555 PR: https://git.openjdk.org/jdk/pull/18555 From eosterlund at openjdk.org Mon Apr 15 18:34:10 2024 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Mon, 15 Apr 2024 18:34:10 GMT Subject: RFR: 8317368: [JVMCI] SIGSEGV in JVMCIEnv::initialize_installed_code on libgraal [v4] In-Reply-To: References: Message-ID: On Mon, 15 Apr 2024 17:55:54 GMT, Tom Rodriguez wrote: >> This fixes some lurking issues with JVMCI and nmethod related both BarrierSetNMethod and the garbage collection of nmethods. In particular the stack walking in c2v_iterateFrames visits many frames and needs a KeepStackGCProcessedMark for safety. Additionally, JVMCI interacts with nmethods in complex ways and needs some sort of strong root during these interactions. A new JavaThread field has been added that mirrors the way JVMTI keeps nmethods alive. > > Tom Rodriguez has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains five commits: > > - Merge branch 'master' into tkr-nmethod-keep-alive > - Comment updates > - Merge branch 'master' into tkr-nmethod-keep-alive > - Move BarrierSetNMethod call into JVMCINMethodHandle::set_nmethod > - 8317368: [JVMCI] SIGSEGV in JVMCIEnv::initialize_installed_code on libgraal Looks good. ------------- Marked as reviewed by eosterlund (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17714#pullrequestreview-2001888480 From dnsimon at openjdk.org Mon Apr 15 18:38:45 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Mon, 15 Apr 2024 18:38:45 GMT Subject: RFR: 8317368: [JVMCI] SIGSEGV in JVMCIEnv::initialize_installed_code on libgraal [v4] In-Reply-To: References: Message-ID: On Mon, 15 Apr 2024 17:55:54 GMT, Tom Rodriguez wrote: >> This fixes some lurking issues with JVMCI and nmethod related both BarrierSetNMethod and the garbage collection of nmethods. In particular the stack walking in c2v_iterateFrames visits many frames and needs a KeepStackGCProcessedMark for safety. Additionally, JVMCI interacts with nmethods in complex ways and needs some sort of strong root during these interactions. A new JavaThread field has been added that mirrors the way JVMTI keeps nmethods alive. > > Tom Rodriguez has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains five commits: > > - Merge branch 'master' into tkr-nmethod-keep-alive > - Comment updates > - Merge branch 'master' into tkr-nmethod-keep-alive > - Move BarrierSetNMethod call into JVMCINMethodHandle::set_nmethod > - 8317368: [JVMCI] SIGSEGV in JVMCIEnv::initialize_installed_code on libgraal src/hotspot/share/jvmci/jvmciCodeInstaller.cpp line 822: > 820: BarrierSetNMethod* bs_nm = BarrierSet::barrier_set()->barrier_set_nmethod(); > 821: > 822: err_msg msg(""); delete `err_msg msg("");` ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17714#discussion_r1566272519 From sgibbons at openjdk.org Mon Apr 15 18:39:20 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Mon, 15 Apr 2024 18:39:20 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v19] In-Reply-To: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: <79C5LoXMAaCUiL1OY41eVXLxm_ZCDzoLoeIeNXjLo6s=.f5264a33-de11-4b51-8fa5-2455b314224d@github.com> > This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. > > Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. > > Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). > > [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) Scott Gibbons has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 30 commits: - Merge branch 'openjdk:master' into setMemory - Set memory test (#23) * Even more review comments * Re-write of atomic copy loops * Change name of UnsafeCopyMemory{,Mark} to UnsafeMemory{Access,Mark} * Only add a memory mark for byte unaligned fill * Remove MUSL_LIBC ifdef * Remove MUSL_LIBC ifdef - Set memory test (#22) * Even more review comments * Re-write of atomic copy loops * Change name of UnsafeCopyMemory{,Mark} to UnsafeMemory{Access,Mark} * Only add a memory mark for byte unaligned fill - Set memory test (#21) * Even more review comments * Re-write of atomic copy loops * Change name of UnsafeCopyMemory{,Mark} to UnsafeMemory{Access,Mark} * Only add a memory mark for byte unaligned fill - Only add a memory mark for byte unaligned fill * Even more review comments * Re-write of atomic copy loops * Change name of UnsafeCopyMemory{,Mark} to UnsafeMemory{Access,Mark} * Only add a memory mark for byte unaligned fill - Rename UnsafeCopyMemory{,Mark} to UnsafeMemory{Access,Mark} (#19) * Even more review comments * Re-write of atomic copy loops * Change name of UnsafeCopyMemory{,Mark} to UnsafeMemory{Access,Mark} - Change fill routines * Even more review comments * Re-write of atomic copy loops - Even more review comments - Addressing yet more review comments - Addressing more review comments - ... 
and 20 more: https://git.openjdk.org/jdk/compare/140f5671...116d7dd6 ------------- Changes: https://git.openjdk.org/jdk/pull/18555/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=18 Stats: 635 lines in 37 files changed: 418 ins; 6 del; 211 mod Patch: https://git.openjdk.org/jdk/pull/18555.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18555/head:pull/18555 PR: https://git.openjdk.org/jdk/pull/18555 From sgibbons at openjdk.org Mon Apr 15 18:43:24 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Mon, 15 Apr 2024 18:43:24 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v20] In-Reply-To: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: > This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. > > Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. > > Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). 
> > [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Fix memory mark after sync to upstream ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18555/files - new: https://git.openjdk.org/jdk/pull/18555/files/116d7dd6..113aa90f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=19 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=18-19 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18555.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18555/head:pull/18555 PR: https://git.openjdk.org/jdk/pull/18555 From shade at openjdk.org Mon Apr 15 18:47:00 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 15 Apr 2024 18:47:00 GMT Subject: RFR: 8329665: fatal error: memory leak: allocating without ResourceMark [v2] In-Reply-To: References: Message-ID: On Mon, 15 Apr 2024 14:53:12 GMT, Patricio Chilano Mateo wrote: >> There are two places in Loom code that call f.oops_interpreted_do() to process oops in the stackChunk. Although not obvious, this method seems to require a ResourceMark in scope, and there are several contexts where these two are called where we don't have one. The reason why a ResourceMark is needed is because OopMapCache::compute_one_oop_map() might allocate from the resource area if _mask_size is > 4 * BitsPerWord, which depends on the amount of locals + expression stack of the corresponding method. But ~InterpreterOopMap already checks if the _bit_mask was allocated in the resource area and in that case it will free it. So the ResourceMark is not strictly needed except that in debug mode we will actually hit the assert if there is not one in scope when trying to allocate the _bit_mask.
>> >> Thanks, >> Patricio > > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > take ResourceMark out of debug only Marked as reviewed by shade (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18632#pullrequestreview-2001913050 From shade at openjdk.org Mon Apr 15 18:47:01 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 15 Apr 2024 18:47:01 GMT Subject: RFR: 8329665: fatal error: memory leak: allocating without ResourceMark [v2] In-Reply-To: References: Message-ID: On Mon, 15 Apr 2024 17:36:26 GMT, Patricio Chilano Mateo wrote: >> src/hotspot/share/interpreter/oopMapCache.cpp line 184: >> >>> 182: } >>> 183: >>> 184: InterpreterOopMap::~InterpreterOopMap() { >> >> Question: If we remove this opportunistic cleanup of `_bit_mask`, does it mean we might introduce memory inefficiencies in cases where `InterpreterOopMap` is not covered by close-by `ResourceMark`? > > In theory yes, although I doubt it is an actual issue because allocating in the resource area for this _bit_mask field is a rare case, and the memory allocated will be a few bytes. But I guess we can keep the eager cleanup just in case since it doesn't hurt. All right, your call. FWIW, this opportunistic cleanup is ugly, and I am happy to see it go. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18632#discussion_r1566283171 From never at openjdk.org Mon Apr 15 18:58:09 2024 From: never at openjdk.org (Tom Rodriguez) Date: Mon, 15 Apr 2024 18:58:09 GMT Subject: RFR: 8317368: [JVMCI] SIGSEGV in JVMCIEnv::initialize_installed_code on libgraal [v4] In-Reply-To: References: Message-ID: On Mon, 15 Apr 2024 18:34:36 GMT, Doug Simon wrote: >> Tom Rodriguez has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains five commits: >> >> - Merge branch 'master' into tkr-nmethod-keep-alive >> - Comment updates >> - Merge branch 'master' into tkr-nmethod-keep-alive >> - Move BarrierSetNMethod call into JVMCINMethodHandle::set_nmethod >> - 8317368: [JVMCI] SIGSEGV in JVMCIEnv::initialize_installed_code on libgraal > > src/hotspot/share/jvmci/jvmciCodeInstaller.cpp line 822: > >> 820: BarrierSetNMethod* bs_nm = BarrierSet::barrier_set()->barrier_set_nmethod(); >> 821: >> 822: err_msg msg(""); > > delete `err_msg msg("");` It's necessary to allocate an empty error buffer for use by the verify_barrier code. Is there some better way to express this? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17714#discussion_r1566291468 From never at openjdk.org Mon Apr 15 18:58:09 2024 From: never at openjdk.org (Tom Rodriguez) Date: Mon, 15 Apr 2024 18:58:09 GMT Subject: RFR: 8317368: [JVMCI] SIGSEGV in JVMCIEnv::initialize_installed_code on libgraal [v4] In-Reply-To: References: Message-ID: On Mon, 15 Apr 2024 17:55:54 GMT, Tom Rodriguez wrote: >> This fixes some lurking issues with JVMCI and nmethod related both BarrierSetNMethod and the garbage collection of nmethods. In particular the stack walking in c2v_iterateFrames visits many frames and needs a KeepStackGCProcessedMark for safety. Additionally, JVMCI interacts with nmethods in complex ways and needs some sort of strong root during these interactions. A new JavaThread field has been added that mirrors the way JVMTI keeps nmethods alive. > > Tom Rodriguez has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains five commits: > > - Merge branch 'master' into tkr-nmethod-keep-alive > - Comment updates > - Merge branch 'master' into tkr-nmethod-keep-alive > - Move BarrierSetNMethod call into JVMCINMethodHandle::set_nmethod > - 8317368: [JVMCI] SIGSEGV in JVMCIEnv::initialize_installed_code on libgraal You mean to add a level of indirection for the JVMCI specific fields? Some of them would be fine with that but a few really want direct access. We'd have to rework some interpreter assembly and code generation on the Graal side for those fields that accessed directly. I could look into it at some point. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17714#issuecomment-2057587218 From dnsimon at openjdk.org Mon Apr 15 18:58:09 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Mon, 15 Apr 2024 18:58:09 GMT Subject: RFR: 8317368: [JVMCI] SIGSEGV in JVMCIEnv::initialize_installed_code on libgraal [v4] In-Reply-To: References: Message-ID: On Mon, 15 Apr 2024 18:51:34 GMT, Tom Rodriguez wrote: >> src/hotspot/share/jvmci/jvmciCodeInstaller.cpp line 822: >> >>> 820: BarrierSetNMethod* bs_nm = BarrierSet::barrier_set()->barrier_set_nmethod(); >>> 821: >>> 822: err_msg msg(""); >> >> delete `err_msg msg("");` > > It's necessary to allocate an empty error buffer for use by the verify_barrier code. Is there some better way to express this? Oh, I missed that `msg` is used below. I don't know of a better way to do this. 
Maybe just add a comment: // an empty error buffer for use by the verify_barrier code ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17714#discussion_r1566296819 From never at openjdk.org Mon Apr 15 18:58:10 2024 From: never at openjdk.org (Tom Rodriguez) Date: Mon, 15 Apr 2024 18:58:10 GMT Subject: RFR: 8317368: [JVMCI] SIGSEGV in JVMCIEnv::initialize_installed_code on libgraal [v2] In-Reply-To: References: Message-ID: On Tue, 6 Feb 2024 09:13:10 GMT, Doug Simon wrote: >> Tom Rodriguez has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains three commits: >> >> - Merge branch 'master' into tkr-nmethod-keep-alive >> - Move BarrierSetNMethod call into JVMCINMethodHandle::set_nmethod >> - 8317368: [JVMCI] SIGSEGV in JVMCIEnv::initialize_installed_code on libgraal > > src/hotspot/share/jvmci/jvmciEnv.hpp line 391: > >> 389: CodeBlob* get_code_blob(JVMCIObject code); >> 390: >> 391: // Given an instance of HotSpotInstalledCode return the corresponding nmethod. > > Can you please improve the comment while here, adding a `,` after `HotSpotInstalledCode`. 
ok > src/hotspot/share/runtime/javaThread.hpp line 384: > >> 382: oop _jvmci_reserved_oop0; >> 383: >> 384: // This field is used to keep an nmethod visible to the GC so that it can be kept alive > > suggestion: `so that it can be kept alive` -> `so that it and its contained oops can be kept alive` ok ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17714#discussion_r1566294177 PR Review Comment: https://git.openjdk.org/jdk/pull/17714#discussion_r1566295509 From never at openjdk.org Mon Apr 15 19:30:26 2024 From: never at openjdk.org (Tom Rodriguez) Date: Mon, 15 Apr 2024 19:30:26 GMT Subject: RFR: 8317368: [JVMCI] SIGSEGV in JVMCIEnv::initialize_installed_code on libgraal [v5] In-Reply-To: References: Message-ID: > This fixes some lurking issues with JVMCI and nmethod related both BarrierSetNMethod and the garbage collection of nmethods. In particular the stack walking in c2v_iterateFrames visits many frames and needs a KeepStackGCProcessedMark for safety. Additionally, JVMCI interacts with nmethods in complex ways and needs some sort of strong root during these interactions. A new JavaThread field has been added that mirrors the way JVMTI keeps nmethods alive. 
Tom Rodriguez has updated the pull request incrementally with one additional commit since the last revision: Update some comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17714/files - new: https://git.openjdk.org/jdk/pull/17714/files/3d2417e7..a047272b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17714&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17714&range=03-04 Stats: 2 lines in 2 files changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17714.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17714/head:pull/17714 PR: https://git.openjdk.org/jdk/pull/17714 From never at openjdk.org Mon Apr 15 19:30:30 2024 From: never at openjdk.org (Tom Rodriguez) Date: Mon, 15 Apr 2024 19:30:30 GMT Subject: RFR: 8317368: [JVMCI] SIGSEGV in JVMCIEnv::initialize_installed_code on libgraal [v4] In-Reply-To: References: Message-ID: On Mon, 15 Apr 2024 17:55:54 GMT, Tom Rodriguez wrote: >> This fixes some lurking issues with JVMCI and nmethod related both BarrierSetNMethod and the garbage collection of nmethods. In particular the stack walking in c2v_iterateFrames visits many frames and needs a KeepStackGCProcessedMark for safety. Additionally, JVMCI interacts with nmethods in complex ways and needs some sort of strong root during these interactions. A new JavaThread field has been added that mirrors the way JVMTI keeps nmethods alive. > > Tom Rodriguez has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains five commits: > > - Merge branch 'master' into tkr-nmethod-keep-alive > - Comment updates > - Merge branch 'master' into tkr-nmethod-keep-alive > - Move BarrierSetNMethod call into JVMCINMethodHandle::set_nmethod > - 8317368: [JVMCI] SIGSEGV in JVMCIEnv::initialize_installed_code on libgraal Regarding the comment about the special state, the problem is that the nmethod must be visited by the GC while it's alive and the only way to ensure that for it to be explicitly visited during GC. There's no way to fixup the nmethod after the fact. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17714#issuecomment-2057648517 From never at openjdk.org Mon Apr 15 19:30:30 2024 From: never at openjdk.org (Tom Rodriguez) Date: Mon, 15 Apr 2024 19:30:30 GMT Subject: RFR: 8317368: [JVMCI] SIGSEGV in JVMCIEnv::initialize_installed_code on libgraal [v4] In-Reply-To: References: Message-ID: On Mon, 15 Apr 2024 18:55:57 GMT, Doug Simon wrote: >> It's necessary to allocate an empty error buffer for use by the verify_barrier code. Is there some better way to express this? > > Oh, I missed that `msg` is used below. I don't know of a better way to do this. 
Maybe just add a comment: > > // an empty error buffer for use by the verify_barrier code ok ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17714#discussion_r1566325813 From dlong at openjdk.org Mon Apr 15 20:00:01 2024 From: dlong at openjdk.org (Dean Long) Date: Mon, 15 Apr 2024 20:00:01 GMT Subject: RFR: 8328934: Assert that ABS input and output are legal [v5] In-Reply-To: <3KD8xa04p5mqN6w_xDjSUFGGrl5ybGIt0w9xOjQo5oo=.cf77f2a5-023d-428b-9846-04ac54cd03df@github.com> References: <3KD8xa04p5mqN6w_xDjSUFGGrl5ybGIt0w9xOjQo5oo=.cf77f2a5-023d-428b-9846-04ac54cd03df@github.com> Message-ID: <4L-lOSjWqbJEj_tkHhX_tPC7g1De_N38bRLuvCaWdyc=.a888d820-0f44-4565-8322-6415a742fcef@github.com> On Mon, 15 Apr 2024 07:59:07 GMT, Aleksey Shipilev wrote: >> This should protect us from future accidents around `abs` misuse. We have fixed a few separately. I plan to use this as the litmus test in update releases to detect missing backports for actual fixes. I am running more tests to see if we have any other sightings in current codebase, but this can be reviewed for sanity meanwhile. >> >> Additional testing: >> - [x] MacOS AArch64 server fastdebug build passes >> - [ ] Linux x86_64 server fastdebug, `all` >> - [ ] Linux x86_64 server fastdebug, 100K Fuzzer tests >> - [ ] Linux x86_64 server fastdebug, Maven CTW >> - [ ] Linux AArch64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request incrementally with two additional commits since the last revision: > > - Also tests > - Drop the other check; dodge UB There's probably a way to use std::enable_if here so the floating-point version has no checks, but I assume it's not needed because the C++ compiler optimizes out the check. ------------- Marked as reviewed by dlong (Reviewer). 
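The `std::enable_if` idea mentioned above can be made concrete with a minimal sketch. This is not code from the PR: the name `checked_abs` is hypothetical (HotSpot's own helper is `ABS`), and the sketch assumes that the check in question is "the input is not the most negative value", whose negation would overflow. The signed-integral overload carries the assert, while the floating-point overload has no check at all:

```cpp
#include <cassert>
#include <limits>
#include <type_traits>

// Hypothetical helper (not HotSpot's ABS): overload chosen for signed integral types.
template <typename T,
          typename std::enable_if<std::is_integral<T>::value &&
                                  std::is_signed<T>::value, int>::type = 0>
T checked_abs(T x) {
  // -min() is not representable in T, so reject it before negating.
  assert(x != std::numeric_limits<T>::min());
  return x < 0 ? static_cast<T>(-x) : x;
}

// Overload chosen for floating-point types: negation cannot overflow,
// so no check is compiled in.
template <typename T,
          typename std::enable_if<std::is_floating_point<T>::value, int>::type = 0>
T checked_abs(T x) {
  return x < 0 ? -x : x;
}
```

Whether this buys anything over a single template depends on whether the compiler already drops the check for floating-point instantiations, as the comment above assumes it does.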
PR Review: https://git.openjdk.org/jdk/pull/18751#pullrequestreview-2002039657 From vkempik at openjdk.org Mon Apr 15 20:36:42 2024 From: vkempik at openjdk.org (Vladimir Kempik) Date: Mon, 15 Apr 2024 20:36:42 GMT Subject: RFR: 8330266: RISC-V: Restore frm to RoundingMode::rne after JNI [v2] In-Reply-To: <0Y-y04yv-vcYPW5lYbwQbCNEmuwva8vCqLLBEZw9bs8=.141dedbc-b2c7-40bf-ad08-744720dca8ec@github.com> References: <6e_QPv6LVVN19HkQrQ2DyB_sXxqGqgwnclI2StdkeaY=.0537b78c-c2c1-4341-a914-f40f7117d72e@github.com> <0Y-y04yv-vcYPW5lYbwQbCNEmuwva8vCqLLBEZw9bs8=.141dedbc-b2c7-40bf-ad08-744720dca8ec@github.com> Message-ID: On Mon, 15 Apr 2024 16:25:14 GMT, Hamlin Li wrote: >> Hi, >> Can you help to review this patch? >> As discussed at: https://github.com/openjdk/jdk/pull/18758#pullrequestreview-1999982333, we'd better to do it. >> Similar thing is done on aarch64, https://bugs.openjdk.org/browse/JDK-8320892 >> >> Thanks > > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > refine code src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 1177: > 1175: // don't want non-IEEE rounding modes. > 1176: guarantee(RoundingMode::rne == 0, "must be"); > 1177: beq(tmp, zr, skip_fsrmi); // Only reset FRM if it's wrong Is it really better (performance-wise) than doing it always, unconditionally (so minus frrm, minus beq)?
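As a user-level analogy for the trade-off being asked about (this is not the RISC-V `MacroAssembler` code), the "read the mode, reset only if it is wrong" pattern can be sketched with the standard `<cfenv>` functions. The helper name `restore_round_to_nearest` is hypothetical:

```cpp
#include <cassert>
#include <cfenv>

// Hypothetical helper: returns true if the rounding mode had to be reset.
bool restore_round_to_nearest() {
  // Read first (the analogue of frrm) ...
  if (std::fegetround() == FE_TONEAREST) {
    return false;  // ... and skip the write when the mode is already correct.
  }
  // Otherwise write the default mode (the analogue of fsrmi).
  int rc = std::fesetround(FE_TONEAREST);
  assert(rc == 0);  // fesetround returns 0 on success
  (void)rc;
  return true;
}
```

The unconditional alternative would be a bare `std::fesetround(FE_TONEAREST)`. Which of the two is cheaper depends on the relative cost of reading and writing the control register on the target, which is the open question here.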
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18785#discussion_r1566395569 From vlivanov at openjdk.org Mon Apr 15 20:52:02 2024 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Mon, 15 Apr 2024 20:52:02 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v14] In-Reply-To: References: <3pJmRUuwQ_8y_uqDiaASd2YbpWOHv1MIWmhjTSL-Oj8=.677e4f4f-a0ea-4e35-aab8-d85ac42aa5ef@github.com> Message-ID: On Mon, 15 Apr 2024 13:49:20 GMT, Andrew Haley wrote: >> src/hotspot/share/cds/filemap.hpp line 274: >> >>> 272: bool compressed_oops() const { return _compressed_oops; } >>> 273: bool compressed_class_pointers() const { return _compressed_class_ptrs; } >>> 274: bool use_secondary_supers_table() const { return _use_secondary_supers_table; } >> >> Do we really need this accessor which is used only in one place? > > @iwanowww , this one is yours. May I nuke this method? Sure. I don't have a strong opinion about it. It's cleaner to access fields through the accessor, but I agree that it doesn't add much value in its current form. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18309#discussion_r1566422162 From vlivanov at openjdk.org Mon Apr 15 21:03:03 2024 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Mon, 15 Apr 2024 21:03:03 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v16] In-Reply-To: References: Message-ID: <2ReTrE0inVkfcPNrq6JVrGRkoFuOZsLK6Ir0ZAnd_Kk=.13903f65-b747-4c38-9572-91e132ebd424@github.com> On Mon, 15 Apr 2024 15:08:31 GMT, Andrew Haley wrote: >> This PR is a redesign of subtype checking. >> >> The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. >> >> So what's changed, so that the old design should be replaced? >> >> Firstly, the computers of today aren't the computers of twenty years ago. 
It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. >> >> The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces). Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. >> >> Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch. >> >> However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions. >> >> The solution >> ------------ >> >> We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. >> >> We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bimap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. 
If the bit is set, we have to do a little arithmetic and then consult the hash table. >> >> It works like th... > > Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: > > JDK-8180450: secondary_super_cache does not scale well Marked as reviewed by vlivanov (Reviewer). Performance testing results look fine. ------------- PR Review: https://git.openjdk.org/jdk/pull/18309#pullrequestreview-2002156872 PR Comment: https://git.openjdk.org/jdk/pull/18309#issuecomment-2057795766 From vlivanov at openjdk.org Mon Apr 15 21:03:03 2024 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Mon, 15 Apr 2024 21:03:03 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v14] In-Reply-To: References: <3pJmRUuwQ_8y_uqDiaASd2YbpWOHv1MIWmhjTSL-Oj8=.677e4f4f-a0ea-4e35-aab8-d85ac42aa5ef@github.com> Message-ID: <4ZRcdEy3Czx1EY7WjIM7M1J-PQeL_p3IyUlNPE70GWg=.e47a2e65-e3b9-4524-9c0f-2743b42f7ad9@github.com> On Mon, 15 Apr 2024 12:36:44 GMT, Andrew Haley wrote: >> src/hotspot/cpu/x86/stubRoutines_x86.hpp line 41: >> >>> 39: // Windows have more code to save/restore registers >>> 40: _compiler_stubs_code_size = 20000 LP64_ONLY(+39000) WINDOWS_ONLY(+2000), >>> 41: _final_stubs_code_size = 10000 LP64_ONLY(+20000) WINDOWS_ONLY(+2000) ZGC_ONLY(+24000) >> >> Do we still need it after you moved code to compiler stubs section? > > It's a bug in that the 20000 byte figure is an underestimate for what zgc stubs need. I could take it out for this patch, I guess, but it'd still be a bug. IMO it's better to handle it separately. It'll make it easier to backport the fix if needed. 
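The hashed lookup with an occupancy bitmap, as described in the PR summary quoted earlier in this thread, can be sketched as follows. This is an illustrative model, not the HotSpot code: the names `CompressedTable`, `slot_of`, `build`, and `contains` are hypothetical, the hash function is an arbitrary stand-in, and the sketch assumes at most 64 distinct keys.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative model only: a 64-slot, linearly probed hash table stored
// compressed (empty slots removed) next to a 64-bit occupancy bitmap.
struct CompressedTable {
  uint64_t bitmap = 0;            // bit s set <=> slot s is occupied
  std::vector<uint64_t> entries;  // occupied slots, in slot order
};

// Stand-in hash (an assumption of this sketch): top 6 bits of a multiplicative hash.
inline unsigned slot_of(uint64_t key) {
  return static_cast<unsigned>((key * 0x9E3779B97F4A7C15ull) >> 58);
}

// Number of set bits in x, written out by hand to stay portable.
inline unsigned popcount64(uint64_t x) {
  unsigned n = 0;
  for (; x != 0; x &= x - 1) n++;
  return n;
}

// Build the table from at most 64 distinct keys.
inline CompressedTable build(const std::vector<uint64_t>& keys) {
  assert(keys.size() <= 64);
  uint64_t slots[64] = {0};
  CompressedTable t;
  for (uint64_t k : keys) {
    unsigned s = slot_of(k);
    while ((t.bitmap >> s) & 1) s = (s + 1) & 63;  // linear probing with wrap-around
    slots[s] = k;
    t.bitmap |= uint64_t{1} << s;
  }
  for (unsigned s = 0; s < 64; s++) {
    if ((t.bitmap >> s) & 1) t.entries.push_back(slots[s]);  // drop empty slots
  }
  return t;
}

// Membership test: a clear bit at the home slot rejects the key without
// touching the entries array at all.
inline bool contains(const CompressedTable& t, uint64_t key) {
  unsigned s = slot_of(key);
  for (unsigned probes = 0; probes < 64 && ((t.bitmap >> s) & 1); probes++) {
    // Index in the compressed array = number of occupied slots below s.
    size_t idx = popcount64(t.bitmap & ((uint64_t{1} << s) - 1));
    if (t.entries[idx] == key) return true;
    s = (s + 1) & 63;
  }
  return false;
}
```

Because the compressed array holds exactly the occupied slots in slot order, code that only scans the array linearly keeps working unchanged, which matches the compatibility point made in the summary.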
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18309#discussion_r1566432954

From jsjolen at openjdk.org Mon Apr 15 21:08:01 2024
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Mon, 15 Apr 2024 21:08:01 GMT
Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v32]
In-Reply-To: References: Message-ID:

> Hi,
>
> This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocate memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix.
>
> ## `MemoryFileTracker`
>
> The `MemoryFileTracker` adds the ability to add new virtual memory address spaces to NMT and commit memory to them. The basic API is:
>
> ```c++
> static MemoryFile* make_device(const char* descriptive_name);
> static void free_device(MemoryFile* device);
>
> static void allocate_memory(MemoryFile* device, size_t offset, size_t size,
>                             MEMFLAGS flag, const NativeCallStack& stack);
> static void free_memory(MemoryFile* device, size_t offset, size_t size);
> ```
>
> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does:
>
> ```c++
> void ZNMT::reserve(zaddress_unsafe start, size_t size) {
>   MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap);
> }
> void ZNMT::commit(zoffset offset, size_t size) {
>   MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC);
> }
> void ZNMT::uncommit(zoffset offset, size_t size) {
>   MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size);
> }
>
> void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) {
>   // NMT doesn't track mappings at the moment.
> }
> void ZNMT::unmap(zaddress_unsafe addr, size_t size) {
>   // NMT doesn't track mappings at the moment.
> }
> ```
>
> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory; the device-allocated memory is reported separately. When performing summary reporting, any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory.
>
> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are:
>
> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle; we'd like to move away from it. Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this...

Johan Sjölen has updated the pull request incrementally with five additional commits since the last revision:

- Remove merge and style
- Make IntervalState 8 bytes smaller
- Rename TreapCHeap:tree to root, as it is a pointer to the root
- Style
- Comment out

-------------

Changes:
- all: https://git.openjdk.org/jdk/pull/18289/files
- new: https://git.openjdk.org/jdk/pull/18289/files/b97d3282..5a3e6dd6

Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=31
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=30-31

Stats: 84 lines in 5 files changed: 39 ins; 5 del; 40 mod
Patch: https://git.openjdk.org/jdk/pull/18289.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289

PR: https://git.openjdk.org/jdk/pull/18289

From jsjolen at openjdk.org Mon Apr 15 21:14:46 2024
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Mon, 15 Apr 2024 21:14:46 GMT
Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v32]
In-Reply-To: References: Message-ID:

On Mon, 15 Apr
2024 21:08:01 GMT, Johan Sjölen wrote:

>> Hi,
>>
>> This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocate memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix.
>>
>> ## `MemoryFileTracker`
>>
>> The `MemoryFileTracker` adds the ability to add new virtual memory address spaces to NMT and commit memory to them. The basic API is:
>>
>> ```c++
>> static MemoryFile* make_device(const char* descriptive_name);
>> static void free_device(MemoryFile* device);
>>
>> static void allocate_memory(MemoryFile* device, size_t offset, size_t size,
>>                             MEMFLAGS flag, const NativeCallStack& stack);
>> static void free_memory(MemoryFile* device, size_t offset, size_t size);
>> ```
>>
>> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does:
>>
>> ```c++
>> void ZNMT::reserve(zaddress_unsafe start, size_t size) {
>>   MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap);
>> }
>> void ZNMT::commit(zoffset offset, size_t size) {
>>   MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC);
>> }
>> void ZNMT::uncommit(zoffset offset, size_t size) {
>>   MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size);
>> }
>>
>> void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) {
>>   // NMT doesn't track mappings at the moment.
>> }
>> void ZNMT::unmap(zaddress_unsafe addr, size_t size) {
>>   // NMT doesn't track mappings at the moment.
>> }
>> ```
>>
>> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory; the device-allocated memory is reported separately. When performing summary reporting, any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory.
>>
>> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are:
>>
>> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle; we'd like to move away from it. Our Treap-based approach in this patch gives a performance bo...
>
> Johan Sjölen has updated the pull request incrementally with five additional commits since the last revision:
>
> - Remove merge and style
> - Make IntervalState 8 bytes smaller
> - Rename TreapCHeap:tree to root, as it is a pointer to the root
> - Style
> - Comment out

I've removed the `merge()` functionality as I'm expecting Afshin's PR to get approved, which will render adapting flags unnecessary. I did the most obvious size optimization such that each in/out state only takes 16 bytes instead of 24.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18289#issuecomment-2057817333

From duke at openjdk.org Mon Apr 15 22:04:08 2024
From: duke at openjdk.org (Mikhail Ablakatov)
Date: Mon, 15 Apr 2024 22:04:08 GMT
Subject: RFR: 8322770: Implement C2 VectorizedHashCode on AArch64
Message-ID: <2VKOC-rT0vOyMcXUX2gs3sOrbZ5H79KBIo50sOOVmyI=.1936f78e-794c-4f54-af3c-b1b97e5fafa8@github.com>

Hello,

Please review the following PR for [JDK-8322770 Implement C2 VectorizedHashCode on AArch64](https://bugs.openjdk.org/browse/JDK-8322770). It follows previous work done in https://github.com/openjdk/jdk/pull/16629 and https://github.com/openjdk/jdk/pull/10847 for RISC-V and x86 respectively.
The code to calculate a hash code consists of two parts: a vectorized loop of Neon instructions that processes 4 or 8 elements per iteration, depending on the data type, and a fully unrolled scalar "loop" that processes up to 7 tail elements.

At the time of writing this I don't see potential benefits from providing an SVE/SVE2 implementation, but it could be added as a follow-up or independently later if required.

# Performance

## Neoverse N1

--------------------------------------------------------------------------------------------
Version                                            Baseline            This patch
--------------------------------------------------------------------------------------------
Benchmark                   (size)  Mode  Cnt      Score   Error      Score     Error  Units
--------------------------------------------------------------------------------------------
ArraysHashCode.bytes             1  avgt   15      1.249 ± 0.060      1.247 ±   0.062  ns/op
ArraysHashCode.bytes            10  avgt   15      8.754 ± 0.028      4.387 ±   0.015  ns/op
ArraysHashCode.bytes           100  avgt   15     98.596 ± 0.051     26.655 ±   0.097  ns/op
ArraysHashCode.bytes         10000  avgt   15  10150.578 ± 1.352   2649.962 ± 216.744  ns/op
ArraysHashCode.chars             1  avgt   15      1.286 ± 0.062      1.246 ±   0.054  ns/op
ArraysHashCode.chars            10  avgt   15      8.731 ± 0.002      5.344 ±   0.003  ns/op
ArraysHashCode.chars           100  avgt   15     98.632 ± 0.048     23.023 ±   0.142  ns/op
ArraysHashCode.chars         10000  avgt   15  10150.658 ± 3.374   2410.504 ±   8.872  ns/op
ArraysHashCode.ints              1  avgt   15      1.189 ± 0.005      1.187 ±   0.001  ns/op
ArraysHashCode.ints             10  avgt   15      8.730 ± 0.002      5.676 ±   0.001  ns/op
ArraysHashCode.ints            100  avgt   15     98.559 ± 0.016     24.378 ±   0.006  ns/op
ArraysHashCode.ints          10000  avgt   15  10148.752 ± 1.336   2419.015 ±   0.492  ns/op
ArraysHashCode.multibytes        1  avgt   15      1.037 ± 0.001      1.037 ±   0.001  ns/op
ArraysHashCode.multibytes       10  avgt   15      5.481 ± 0.001      3.136 ±   0.001  ns/op
ArraysHashCode.multibytes      100  avgt   15     50.950 ± 0.006     15.277 ±   0.007  ns/op
ArraysHashCode.multibytes    10000  avgt   15   5335.181 ± 0.692   1340.850 ±   4.291  ns/op
ArraysHashCode.multichars        1  avgt   15      1.038 ± 0.001      1.037 ±   0.001  ns/op
ArraysHashCode.multichars       10  avgt   15      5.480 ± 0.001      3.783 ±   0.001  ns/op
ArraysHashCode.multichars      100  avgt   15     50.955 ± 0.006     13.890 ±   0.018  ns/op
ArraysHashCode.multichars    10000  avgt   15   5338.597 ± 0.853   1335.599 ±   0.652  ns/op
ArraysHashCode.multiints         1  avgt   15      1.042 ± 0.001      1.043 ±   0.001  ns/op
ArraysHashCode.multiints        10  avgt   15      5.526 ± 0.001      3.866 ±   0.001  ns/op
ArraysHashCode.multiints       100  avgt   15     50.917 ± 0.005     14.918 ±   0.026  ns/op
ArraysHashCode.multiints     10000  avgt   15   5348.365 ± 5.836   1287.685 ±   1.083  ns/op
ArraysHashCode.multishorts       1  avgt   15      1.036 ± 0.001      1.037 ±   0.001  ns/op
ArraysHashCode.multishorts      10  avgt   15      5.480 ± 0.001      3.783 ±   0.001  ns/op
ArraysHashCode.multishorts     100  avgt   15     50.975 ± 0.034     13.890 ±   0.015  ns/op
ArraysHashCode.multishorts   10000  avgt   15   5338.790 ± 1.276   1337.034 ±   1.600  ns/op
ArraysHashCode.shorts            1  avgt   15      1.187 ± 0.001      1.187 ±   0.001  ns/op
ArraysHashCode.shorts           10  avgt   15      8.731 ± 0.002      5.342 ±   0.001  ns/op
ArraysHashCode.shorts          100  avgt   15     98.544 ± 0.013     23.017 ±   0.141  ns/op
ArraysHashCode.shorts        10000  avgt   15  10148.275 ± 1.119   2408.041 ±   1.478  ns/op

## Neoverse N2, Neoverse V1

Performance metrics have been collected for these cores as well. They are similar to the results above and can be posted upon request.

# Test

Full jtreg passed on AArch64 and x86.
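The vectorization rests on the fact that the recurrence `h = 31 * h + a[i]` can be split into independent lanes that are recombined at the end. The following is a scalar C++ model of that idea, not the AArch64 code from the PR: the function names are hypothetical, and it uses a fixed 4 lanes with a plain scalar tail rather than the patch's actual loop and unroll factors.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Reference recurrence of java.util.Arrays.hashCode(int[]):
// h = 31 * h + a[i], starting at 1. Unsigned arithmetic wraps modulo 2^32,
// which matches Java's int overflow behavior bit for bit.
inline uint32_t hash_scalar(const int32_t* a, size_t n) {
  uint32_t h = 1;
  for (size_t i = 0; i < n; i++) {
    h = 31u * h + static_cast<uint32_t>(a[i]);
  }
  return h;
}

// Same value computed with 4 independent accumulators (a model of 4 SIMD lanes).
inline uint32_t hash_lanes(const int32_t* a, size_t n) {
  const uint32_t P1 = 31u, P2 = P1 * P1, P3 = P2 * P1, P4 = P3 * P1;
  // The initial hash value 1 rides in the last lane, whose final weight is 31^0.
  uint32_t lane[4] = {0, 0, 0, 1};
  size_t i = 0;
  for (; i + 4 <= n; i += 4) {
    for (int j = 0; j < 4; j++) {
      // Each lane advances by four positions per iteration, hence the 31^4 factor.
      lane[j] = lane[j] * P4 + static_cast<uint32_t>(a[i + j]);
    }
  }
  // Recombine: lane j holds the elements at positions j mod 4, weighted by 31^(3-j).
  uint32_t h = lane[0] * P3 + lane[1] * P2 + lane[2] * P1 + lane[3];
  // Scalar tail for the remaining 0..3 elements.
  for (; i < n; i++) {
    h = 31u * h + static_cast<uint32_t>(a[i]);
  }
  return h;
}
```

The inner loop over `j` has no dependency between lanes, which is what lets a SIMD unit perform it as one multiply-accumulate per vector; only the final recombination and the tail are inherently sequential.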
------------- Commit messages: - 8322770: AArch64: C2: Implement VectorizedHashCode Changes: https://git.openjdk.org/jdk/pull/18487/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18487&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8322770 Stats: 264 lines in 4 files changed: 263 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18487.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18487/head:pull/18487 PR: https://git.openjdk.org/jdk/pull/18487 From dchuyko at openjdk.org Mon Apr 15 22:04:08 2024 From: dchuyko at openjdk.org (Dmitry Chuyko) Date: Mon, 15 Apr 2024 22:04:08 GMT Subject: RFR: 8322770: Implement C2 VectorizedHashCode on AArch64 In-Reply-To: <2VKOC-rT0vOyMcXUX2gs3sOrbZ5H79KBIo50sOOVmyI=.1936f78e-794c-4f54-af3c-b1b97e5fafa8@github.com> References: <2VKOC-rT0vOyMcXUX2gs3sOrbZ5H79KBIo50sOOVmyI=.1936f78e-794c-4f54-af3c-b1b97e5fafa8@github.com> Message-ID: On Tue, 26 Mar 2024 13:59:12 GMT, Mikhail Ablakatov wrote: > Hello, > > Please review the following PR for [JDK-8322770 Implement C2 VectorizedHashCode on AArch64](https://bugs.openjdk.org/browse/JDK-8322770). It follows previous work done in https://github.com/openjdk/jdk/pull/16629 and https://github.com/openjdk/jdk/pull/10847 for RISC-V and x86 respectively. > > The code to calculate a hash code consists of two parts: a vectorized loop of Neon instruction that process 4 or 8 elements per iteration depending on the data type and a fully unrolled scalar "loop" that processes up to 7 tail elements. > > At the time of writing this I don't see potential benefits from providing SVE/SVE2 implementation, but it could be added as a follow-up or independently later if required. 
> > # Performance > > ## Neoverse N1 > > > -------------------------------------------------------------------------------------------- > Version Baseline This patch > -------------------------------------------------------------------------------------------- > Benchmark (size) Mode Cnt Score Error Score Error Units > -------------------------------------------------------------------------------------------- > ArraysHashCode.bytes 1 avgt 15 1.249 ? 0.060 1.247 ? 0.062 ns/op > ArraysHashCode.bytes 10 avgt 15 8.754 ? 0.028 4.387 ? 0.015 ns/op > ArraysHashCode.bytes 100 avgt 15 98.596 ? 0.051 26.655 ? 0.097 ns/op > ArraysHashCode.bytes 10000 avgt 15 10150.578 ? 1.352 2649.962 ? 216.744 ns/op > ArraysHashCode.chars 1 avgt 15 1.286 ? 0.062 1.246 ? 0.054 ns/op > ArraysHashCode.chars 10 avgt 15 8.731 ? 0.002 5.344 ? 0.003 ns/op > ArraysHashCode.chars 100 avgt 15 98.632 ? 0.048 23.023 ? 0.142 ns/op > ArraysHashCode.chars 10000 avgt 15 10150.658 ? 3.374 2410.504 ? 8.872 ns/op > ArraysHashCode.ints 1 avgt 15 1.189 ? 0.005 1.187 ? 0.001 ns/op > ArraysHashCode.ints 10 avgt 15 8.730 ? 0.002 5.676 ? 0.001 ns/op > ArraysHashCode.ints 100 avgt 15 98.559 ? 0.016 24.378 ? 0.006 ns/op > ArraysHashCode.ints 10000 avgt 15 10148.752 ? 1.336 2419.015 ? 0.492 ns/op > ArraysHashCode.multibytes 1 avgt 15 1.037 ? 0.001 1.037 ? 0.001 ns/op > ArraysHashCode.multibytes 10 avgt 15 5.4... Just a trivial note: this change also improves the calculation of String.hashCode(). For instance, on V1 Benchmark size Improvement StringHashCode.Algorithm.defaultLatin1 1 -2.86% StringHashCode.Algorithm.defaultLatin1 10 45.84% StringHashCode.Algorithm.defaultLatin1 100 79.43% StringHashCode.Algorithm.defaultLatin1 10000 79.16% StringHashCode.Algorithm.defaultUTF16 1 -1.57% StringHashCode.Algorithm.defaultUTF16 10 41.83% StringHashCode.Algorithm.defaultUTF16 100 80.01% StringHashCode.Algorithm.defaultUTF16 10000 78.44% SVE can give notable additional speedup only for very long strings (>1k). 
src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.cpp line 65:

> 63: : eltype == T_CHAR || eltype == T_SHORT || eltype == T_INT ? 4
> 64: : 0;
> 65: guarantee(loop_factor, "unsopported eltype");

typo: unsupported

src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.cpp line 102:

> 100: * Pseudocode:
> 101: *
> 102: * cnt -= unroll_facotor + 1 - loop_factor;

typo: factor

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18487#issuecomment-2024948481
PR Review Comment: https://git.openjdk.org/jdk/pull/18487#discussion_r1542839364
PR Review Comment: https://git.openjdk.org/jdk/pull/18487#discussion_r1543169690

From duke at openjdk.org  Mon Apr 15 22:12:30 2024
From: duke at openjdk.org (Volodymyr Paprotski)
Date: Mon, 15 Apr 2024 22:12:30 GMT
Subject: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v3]
In-Reply-To:
References:
Message-ID: <-64Xlhk6ln43-xTmlv_cvloS-gzDrKMyiPUdPbMNlIM=.2b524654-ca5b-4a7a-a7da-316e99cfea35@github.com>

> Performance. Before:
>
> Benchmark                    (algorithm)      (dataSize)  (keyLength)  (provider)   Mode  Cnt     Score     Error  Units
> SignatureBench.ECDSA.sign    SHA256withECDSA        1024          256               thrpt    3  6443.934 ±   6.491  ops/s
> SignatureBench.ECDSA.sign    SHA256withECDSA       16384          256               thrpt    3  6152.979 ±   4.954  ops/s
> SignatureBench.ECDSA.verify  SHA256withECDSA        1024          256               thrpt    3  1895.410 ±  36.979  ops/s
> SignatureBench.ECDSA.verify  SHA256withECDSA       16384          256               thrpt    3  1878.955 ±  45.487  ops/s
> Benchmark                                            (algorithm)  (keyLength)  (kpgAlgorithm)  (provider)   Mode  Cnt     Score     Error  Units
> o.o.b.j.c.full.KeyAgreementBench.EC.generateSecret          ECDH          256              EC               thrpt    3  1357.810 ±  26.584  ops/s
> o.o.b.j.c.small.KeyAgreementBench.EC.generateSecret         ECDH          256              EC               thrpt    3  1352.119 ±  23.547  ops/s
> Benchmark                          (isMontBench)   Mode  Cnt     Score     Error  Units
> PolynomialP256Bench.benchMultiply          false  thrpt    3  1746.126 ±  10.970  ops/s
>
> Performance, no intrinsic:
>
> Benchmark                    (algorithm)      (dataSize)  (keyLength)  (provider)   Mode  Cnt     Score     Error  Units
> SignatureBench.ECDSA.sign    SHA256withECDSA        1024          256               thrpt    3  6529.839 ±  42.420  ops/s
> SignatureBench.ECDSA.sign    SHA256withECDSA       16384          256               thrpt    3  6199.747 ± 133.566  ops/s
> SignatureBench.ECDSA.verify  SHA256withECDSA        1024          256               thrpt    3  1973.676 ±  54.071  ops/s
> SignatureBench.ECDSA.verify  SHA256withECDSA       16384          256               thrpt    3  1932.127 ±  35.920  ops/s
> Benchmark                                            (algorithm)  (keyLength)  (kpgAlgorithm)  (provider)   Mode  Cnt     Score     Error  Units
> o.o.b.j.c.full.KeyAgreementBench.EC.generateSecret          ECDH          256              EC               thrpt    3  1355.788 ±  29.858  ops/s
> o.o.b.j.c.small.KeyAgreementBench.EC.generateSecret         ECDH          256              EC               thrpt    3  1346.523 ±  28.722  ops/s
> Benchmark                          (isMontBench)   Mode  Cnt     Score     Error  Units
> PolynomialP256Bench.benchMultiply           true  thrpt    3  1919.574 ±  10.591  ops/s
>
> Performance, **with intrinsics*...

Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision:

  Comments from Jatin and Tony

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/18583/files
  - new: https://git.openjdk.org/jdk/pull/18583/files/82b6dae7..6f9ac046

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=18583&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18583&range=01-02

  Stats: 97 lines in 20 files changed: 4 ins; 36 del; 57 mod
  Patch: https://git.openjdk.org/jdk/pull/18583.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18583/head:pull/18583

PR: https://git.openjdk.org/jdk/pull/18583

From duke at openjdk.org  Mon Apr 15 22:12:30 2024
From: duke at openjdk.org (Volodymyr Paprotski)
Date: Mon, 15 Apr 2024 22:12:30 GMT
Subject: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v2]
In-Reply-To: <48md2WEAhqPyuVf4AYOxBQDykUiOaEL0PQb-ki0_TYM=.6c25bf41-b0ae-49ec-b606-236deb4561e3@github.com>
References:
<48md2WEAhqPyuVf4AYOxBQDykUiOaEL0PQb-ki0_TYM=.6c25bf41-b0ae-49ec-b606-236deb4561e3@github.com> Message-ID: On Wed, 10 Apr 2024 23:56:52 GMT, Volodymyr Paprotski wrote: > Few early comments. > > Please update the copyright year of all the modified files. > > You can even consider splitting this into two patches, Java side changes in one and x86 optimized intrinsic in next one. Fixed all copyright years git diff da8a095a19c90e7ee2b45fab9b533a1092887023 | lsdiff -p1 | while read line; do echo $line =========================; grep Copyright $line | grep -v 2024; done | less Re splitting.. probably too late for that now.. (did consider it initially.. got hard to manage two changes while developing. And easier to justify the change when the entire patch is presented? but yes, far more code to review.. ) ------------- PR Comment: https://git.openjdk.org/jdk/pull/18583#issuecomment-2057892691 From duke at openjdk.org Mon Apr 15 22:12:30 2024 From: duke at openjdk.org (Volodymyr Paprotski) Date: Mon, 15 Apr 2024 22:12:30 GMT Subject: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v2] In-Reply-To: References: Message-ID: On Thu, 11 Apr 2024 17:15:21 GMT, Anthony Scarpino wrote: >>> In `ECOperations.java`, if I understand this correctly, it is to replace the existing `PointMultiplier` with montgomery-based PointMuliplier. But when I look at the code, I see both are still options. If I read this correctly, it checks for the old `IntegerFieldModuloP`, then looks for the new `IntegerMontgomeryFieldModuloP`. It appears to use the new one always. Why doesn't it just replace the old implementation entry in the `fields` Map? Is there a reason to keep it around? >> >> Hmm, thats a good point I haven't fully considered; i.e. (if I read correctly) "for `CurveDB.P_256` remove the fallback path to non-montgomery entirely".. that might also help in cleaning a few things up in the construction. Maybe even get rid of this nested ECOperations inside ECOperations.. 
Perhaps nesting isnt a big deal, but all attempts to make the ECC stack clearer is positive! >> >> One functional reason that might justify keeping it as-is, is fuzz-testing; with the fallback available, I am able to write the included Fuzz tests and have them check the values against the existing implementation. While I also included a few KAT tests using openssl-generated values, the fuzz tests check millions of values and it does add a lot more certainty about correctness of this code. >> >> Can it be removed? For the operations that do not involve multiplication (i.e. `setSum(*)`), montgomery is expensive. I think I did go through the uses of this code some time back (i.e. ECDHE, ECDSA and KeyGeneration) and existing IntegerPolynomialP256 is no longer used (I should verify that again) and only P256OrderField remains non-montgomery. So removing references to IntegerPolynomialP256 in ECOperations should be possible and cleaner. Removing IntegerPolynomialP256 from MontgomeryIntegerPolynomialP256 is harder (fromMontgomery() uses IntegerPolynomialP256) but perhaps also worth some thought.. >> >> I tend to like `ECOperationsFuzzTest.java` and would prefer to keep it, but it could also be chucked up as part of 'scaffolding' and removed in name of code quality? >> >> Thanks @ascarpino >> >> PS: Perhaps there is some middle ground, remove the `ECOperations montgomeryOps` nesting, and construct (somehow?? singleton makes most things inaccessible..) the reference ECOperations in the fuzz test instead.. not sure how yet, but perhaps worth a further thought.. > >> > In `ECOperations.java`, if I understand this correctly, it is to replace the existing `PointMultiplier` with montgomery-based PointMuliplier. But when I look at the code, I see both are still options. If I read this correctly, it checks for the old `IntegerFieldModuloP`, then looks for the new `IntegerMontgomeryFieldModuloP`. It appears to use the new one always. 
Why doesn't it just replace the old implementation entry in the `fields` Map? Is there a reason to keep it around? >> >> Hmm, thats a good point I haven't fully considered; i.e. (if I read correctly) "for `CurveDB.P_256` remove the fallback path to non-montgomery entirely".. that might also help in cleaning a few things up in the construction. Maybe even get rid of this nested ECOperations inside ECOperations.. Perhaps nesting isnt a big deal, but all attempts to make the ECC stack clearer is positive! >> >> One functional reason that might justify keeping it as-is, is fuzz-testing; with the fallback available, I am able to write the included Fuzz tests and have them check the values against the existing implementation. While I also included a few KAT tests using openssl-generated values, the fuzz tests check millions of values and it does add a lot more certainty about correctness of this code. > > I hadn't looked at your fuzz test until you mentioned it. I see you are using reflection to change the values. Is that what you mean by "fallback"? I'm assuming there is no to access the older implementation without reflection. > >> >> Can it be removed? For the operations that do not involve multiplication (i.e. `setSum(*)`), montgomery is expensive. I think I did go through the uses of this code some time back (i.e. ECDHE, ECDSA and KeyGeneration) and existing IntegerPolynomialP256 is no longer used (I should verify that again) and only P256OrderField remains non-montgomery. So removing references to IntegerPolynomialP256 in ECOperations should be possible and cleaner. Removing IntegerPolynomialP256 from MontgomeryIntegerPolynomialP256 is harder (fromMontgomery() uses IntegerPolynomialP256) but perhaps also worth some thought.. >> >> I tend to like `ECOperationsFuzzTest.java` and would prefer to keep it, but it could also be chucked up as part of 'scaffolding' and removed in name of code quality? > > I wouldn't rip out the old implementation. 
I have been wondering if we should make the older implementation available, maybe by security property. I was looking at the static Maps at the top of `ECOperatio... @ascarpino Fixed as suggested... actually.. that was _waaay_ easier then I thought it would be (I saw singleton and assumed private constructor.. nope, ECOperations() is public, no reflection required!! Ended up with cleaner implementation _and_ cleaner tests! Thanks!) ------------- PR Comment: https://git.openjdk.org/jdk/pull/18583#issuecomment-2057895950 From duke at openjdk.org Mon Apr 15 22:12:31 2024 From: duke at openjdk.org (Volodymyr Paprotski) Date: Mon, 15 Apr 2024 22:12:31 GMT Subject: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v2] In-Reply-To: References: Message-ID: On Fri, 5 Apr 2024 07:19:28 GMT, Jatin Bhateja wrote: >> Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision: >> >> remove use of jdk.crypto.ec > > src/hotspot/cpu/x86/stubGenerator_x86_64_poly_mont.cpp line 39: > >> 37: }; >> 38: static address modulus_p256() { >> 39: return (address)MODULUS_P256; > > Long constants should have UL suffix. Properly ULL, but good point, fixed > src/hotspot/cpu/x86/stubGenerator_x86_64_poly_mont.cpp line 386: > >> 384: __ jcc(Assembler::equal, L_Length19); >> 385: >> 386: // Default copy loop > > Please add appropriate loop entry alignment. This is actually a 'switch statement default'. The default should never happen (See "Known Length comment on line 335"), but added because java code has that behavior. (i.e. in the unlikely case NIST adds a new elliptic curve to the existing standard?) > src/hotspot/cpu/x86/stubGenerator_x86_64_poly_mont.cpp line 394: > >> 392: __ lea(aLimbs, Address(aLimbs,8)); >> 393: __ lea(bLimbs, Address(bLimbs,8)); >> 394: __ jmp(L_DefaultLoop); > > Both sub and cmp are flag affecting instructions and are macro-fusible. > By doing a loop rotation i.e. 
moving the length <= 0 check outside the loop and pushing the loop exit check at bottom you can save additional compare checks. Per-above, this is a switch statement (`UNLIKELY`) fallback. I can still add alignment and loop rotation, but being a fallback figured its more important to keep it small&readable... ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18583#discussion_r1566486768 PR Review Comment: https://git.openjdk.org/jdk/pull/18583#discussion_r1566486717 PR Review Comment: https://git.openjdk.org/jdk/pull/18583#discussion_r1566486848 From sviswanathan at openjdk.org Mon Apr 15 22:35:02 2024 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Mon, 15 Apr 2024 22:35:02 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v20] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: On Mon, 15 Apr 2024 18:43:24 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. >> >> Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). >> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Fix memory mark after sync to upstream These are my last set of comments. Rest looks good to me. src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2588: > 2586: StubCodeMark mark(this, "StubRoutines", name); > 2587: address start = __ pc(); > 2588: We are missing the __ enter() here? 
src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2680: > 2678: __ BIND(L_fillBytes); > 2679: { > 2680: const Register byteVal = rdx; This could be removed. src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2686: > 2684: __ movq(rdx, rsi); > 2685: restore_arg_regs(); > 2686: #endif This is stubGenerator_x86_64.cpp 64bit specific, so WIN32 portion could be removed? ------------- PR Review: https://git.openjdk.org/jdk/pull/18555#pullrequestreview-1998540097 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1566504104 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1566498612 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1566498342 From sviswanathan at openjdk.org Mon Apr 15 22:35:02 2024 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Mon, 15 Apr 2024 22:35:02 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v14] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: On Fri, 12 Apr 2024 16:47:58 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. >> >> Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). 
>> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Even more review comments src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2692: > 2690: __ BIND(L_fillBytes); > 2691: } > 2692: #ifdef MUSL_LIBC The code in #Ifdef MUSL_LIBC can be removed as we are not using the libc call in the #else path. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1563287878 From jvernee at openjdk.org Mon Apr 15 23:04:02 2024 From: jvernee at openjdk.org (Jorn Vernee) Date: Mon, 15 Apr 2024 23:04:02 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v20] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: On Mon, 15 Apr 2024 22:22:38 GMT, Sandhya Viswanathan wrote: >> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix memory mark after sync to upstream > > src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2686: > >> 2684: __ movq(rdx, rsi); >> 2685: restore_arg_regs(); >> 2686: #endif > > This is stubGenerator_x86_64.cpp 64bit specific, so WIN32 portion could be removed? 
`_WIN32` is also defined for 64 bit Windows

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1566523189

From sviswanathan at openjdk.org  Mon Apr 15 23:09:01 2024
From: sviswanathan at openjdk.org (Sandhya Viswanathan)
Date: Mon, 15 Apr 2024 23:09:01 GMT
Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v20]
In-Reply-To:
References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com>
Message-ID:

On Mon, 15 Apr 2024 23:01:21 GMT, Jorn Vernee wrote:

>> src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2686:
>>
>>> 2684: __ movq(rdx, rsi);
>>> 2685: restore_arg_regs();
>>> 2686: #endif
>>
>> This is stubGenerator_x86_64.cpp 64bit specific, so WIN32 portion could be removed?
>
> `_WIN32` is also defined for 64 bit Windows

Thanks, I missed that.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1566525698

From sgibbons at openjdk.org  Mon Apr 15 23:16:01 2024
From: sgibbons at openjdk.org (Scott Gibbons)
Date: Mon, 15 Apr 2024 23:16:01 GMT
Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v20]
In-Reply-To:
References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com>
Message-ID:

On Mon, 15 Apr 2024 23:06:11 GMT, Sandhya Viswanathan wrote:

>> `_WIN32` is also defined for 64 bit Windows
>
> Thanks, I missed that.

I'm changing the scheme here to c_rargX, so no need for any windows-specific stuff. I added it when I needed 4 scratch registers.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1566529530 From sgibbons at openjdk.org Tue Apr 16 00:04:15 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Tue, 16 Apr 2024 00:04:15 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v21] In-Reply-To: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: > This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. > > Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. > > Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). > > [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Add enter() and leave(); remove Windows-specific register stuff ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18555/files - new: https://git.openjdk.org/jdk/pull/18555/files/113aa90f..7a1d67e5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=20 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=19-20 Stats: 38 lines in 1 file changed: 1 ins; 20 del; 17 mod Patch: https://git.openjdk.org/jdk/pull/18555.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18555/head:pull/18555 PR: https://git.openjdk.org/jdk/pull/18555 From dholmes at openjdk.org Tue Apr 16 00:50:53 2024 From: dholmes at openjdk.org (David Holmes) Date: Tue, 16 Apr 2024 00:50:53 GMT Subject: RFR: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 [v6] In-Reply-To: 
References: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> Message-ID: <2JPEjqINrT3dq6510a2UKRl3FifyTvO7JmKOaewZ_20=.03a2bc1e-9c67-498a-87ae-c5e41e6f6e78@github.com> On Thu, 11 Apr 2024 01:27:15 GMT, David Holmes wrote: >> The crux of the problem here is that the virtual thread code was not keeping the held-monitor-count and jni-monitor-count in sync under all conditions. So if a vthread acquired a monitor via JNI but failed to unlock it before terminating, the underlying platform thread's counts were out of sync and if it terminated we would trigger the assertion that checks for such things. >> >> The actual fix is very simple: we zero the platform thread's jni-monitor-count in `continuation_enter_cleanup` the same way we zero the held-monitor-count. In addition we apply the same `CheckJNICalls` check for this unbalanced locking and issue a warning in the virtual thread case. That fact this happens in asm code complicates matters. >> >> The existing `JNIMonitor.java` test is greatly expanded to test these scenarios and check the unified logging output. >> >> Other minor changes involve expanding some of the other assertions relating to the two counts so we can detected a mismatch earlier without a need for the thread to terminate. And the test that original uncovered the problem (`GetOwnedMonitorInfoTest.java`) has some minor adjustments to enhance diagnostics. >> >> I've provided the fix for all architectures that support continuations: x64, aarch64, riscv and ppc. The latter both build okay in GHA but I can't actually test them with the updated test. So some assistance from RISCV folk (@robehn ?) and PPC folk (??) would be appreciated (otherwise any issues will have to be handled as follow up fixes >> >> The changes are structured so that there is no extra code executed in product builds unless `CheckJNICalls` is set. 
This means that product builds will not keep the JNI count in sync with the held count, unless `CheckJNICalls` is set. This could trip up a future logging entry or explicit check of the JNI count, but it is expected that these counts will be removed once ObjectMonitor usage will not force virtual thread pinning. >> >> Testing: >> - regression test 10x on all x64 and aarch64 platforms >> - tiers 1-4 >> - GHA >> >> >> Thanks to @pchilano for help working out the best form of the fix and the initial asm for x64. >> >> Thanks to @fbredber for the Aarch64 and RISCV asm code. >> >> Thanks > > David Holmes has updated the pull request incrementally with one additional commit since the last revision: > > Fix typos, copyrights and add comment Thanks for all the reviews. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18445#issuecomment-2058043987 From dholmes at openjdk.org Tue Apr 16 00:50:54 2024 From: dholmes at openjdk.org (David Holmes) Date: Tue, 16 Apr 2024 00:50:54 GMT Subject: Integrated: 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 In-Reply-To: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> References: <99VqHk5cN-SmHeKf744rTx6shwpp0IqVZHxJpcCFnL8=.6e13979d-b35a-42f6-955b-6fd391c343a1@github.com> Message-ID: On Fri, 22 Mar 2024 06:26:03 GMT, David Holmes wrote: > The crux of the problem here is that the virtual thread code was not keeping the held-monitor-count and jni-monitor-count in sync under all conditions. So if a vthread acquired a monitor via JNI but failed to unlock it before terminating, the underlying platform thread's counts were out of sync and if it terminated we would trigger the assertion that checks for such things. > > The actual fix is very simple: we zero the platform thread's jni-monitor-count in `continuation_enter_cleanup` the same way we zero the held-monitor-count. 
In addition we apply the same `CheckJNICalls` check for this unbalanced locking and issue a warning in the virtual thread case. That fact this happens in asm code complicates matters. > > The existing `JNIMonitor.java` test is greatly expanded to test these scenarios and check the unified logging output. > > Other minor changes involve expanding some of the other assertions relating to the two counts so we can detected a mismatch earlier without a need for the thread to terminate. And the test that original uncovered the problem (`GetOwnedMonitorInfoTest.java`) has some minor adjustments to enhance diagnostics. > > I've provided the fix for all architectures that support continuations: x64, aarch64, riscv and ppc. The latter both build okay in GHA but I can't actually test them with the updated test. So some assistance from RISCV folk (@robehn ?) and PPC folk (??) would be appreciated (otherwise any issues will have to be handled as follow up fixes > > The changes are structured so that there is no extra code executed in product builds unless `CheckJNICalls` is set. This means that product builds will not keep the JNI count in sync with the held count, unless `CheckJNICalls` is set. This could trip up a future logging entry or explicit check of the JNI count, but it is expected that these counts will be removed once ObjectMonitor usage will not force virtual thread pinning. > > Testing: > - regression test 10x on all x64 and aarch64 platforms > - tiers 1-4 > - GHA > > > Thanks to @pchilano for help working out the best form of the fix and the initial asm for x64. > > Thanks to @fbredber for the Aarch64 and RISCV asm code. > > Thanks This pull request has now been integrated. 
Changeset: 274c805c Author: David Holmes URL: https://git.openjdk.org/jdk/commit/274c805c5137d9080a7670d864ecca8a0befc3f6 Stats: 504 lines in 10 files changed: 460 ins; 13 del; 31 mod 8327743: JVM crash in hotspot/share/runtime/javaThread.cpp - failed: held monitor count should be equal to jni: 0 != 1 Co-authored-by: Fredrik Bredberg Reviewed-by: rehn, fbredberg, pchilanomate, rrich ------------- PR: https://git.openjdk.org/jdk/pull/18445 From dlong at openjdk.org Tue Apr 16 01:34:03 2024 From: dlong at openjdk.org (Dean Long) Date: Tue, 16 Apr 2024 01:34:03 GMT Subject: RFR: 8329433: Reduce nmethod header size [v3] In-Reply-To: References: Message-ID: On Mon, 15 Apr 2024 03:24:07 GMT, Vladimir Kozlov wrote: >> This is part of changes which try to reduce size of `nmethod` and `codeblob` data vs code in CodeCache. >> These changes reduced size of `nmethod` header from 288 to 232 bytes. From 304 to 248 in optimized VM: >> >> Statistics for 1282 bytecoded nmethods for C2: >> total in heap = 5560352 (100%) >> header = 389728 (7.009053%) >> >> vs >> >> Statistics for 1322 bytecoded nmethods for C2: >> total in heap = 8307120 (100%) >> header = 327856 (3.946687%) >> >> >> Several unneeded fields in `nmethod` and `CodeBlob` were removed. Some fields were changed from `int` to `int16_t` with added corresponding asserts to make sure their values are fit into 16 bits. >> >> I did additional cleanup after recent `CompiledMethod` removal. >> >> Tested tier1-7,stress,xcomp and performance testing. 
> > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Union fields which usages do not overlap src/hotspot/share/code/codeBlob.cpp line 88: > 86: S390_ONLY(_ctable_offset(0) COMMA) > 87: _header_size((uint16_t)header_size), > 88: _frame_complete_offset((int16_t)frame_complete_offset), Rather than a raw cast, it would be better to use checked_cast here, or better yet, change the incoming parameter types to match the field type. That way, if the caller is passing a constant, the compiler can check it at compile time. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1566601934 From dlong at openjdk.org Tue Apr 16 01:37:45 2024 From: dlong at openjdk.org (Dean Long) Date: Tue, 16 Apr 2024 01:37:45 GMT Subject: RFR: 8329433: Reduce nmethod header size [v3] In-Reply-To: References: Message-ID: On Mon, 15 Apr 2024 03:24:07 GMT, Vladimir Kozlov wrote: >> This is part of changes which try to reduce size of `nmethod` and `codeblob` data vs code in CodeCache. >> These changes reduced size of `nmethod` header from 288 to 232 bytes. From 304 to 248 in optimized VM: >> >> Statistics for 1282 bytecoded nmethods for C2: >> total in heap = 5560352 (100%) >> header = 389728 (7.009053%) >> >> vs >> >> Statistics for 1322 bytecoded nmethods for C2: >> total in heap = 8307120 (100%) >> header = 327856 (3.946687%) >> >> >> Several unneeded fields in `nmethod` and `CodeBlob` were removed. Some fields were changed from `int` to `int16_t` with added corresponding asserts to make sure their values are fit into 16 bits. >> >> I did additional cleanup after recent `CompiledMethod` removal. >> >> Tested tier1-7,stress,xcomp and performance testing. 
>
> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision:
>
>   Union fields which usages do not overlap

src/hotspot/share/code/codeBlob.cpp line 120:

> 118:
> 119: _header_size((uint16_t)header_size),
> 120: _frame_complete_offset((int16_t)CodeOffsets::frame_never_safe),

See above.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1566604310

From duke at openjdk.org  Tue Apr 16 02:04:06 2024
From: duke at openjdk.org (kuaiwei)
Date: Tue, 16 Apr 2024 02:04:06 GMT
Subject: RFR: 8325821: [REDO] use "dmb.ishst+dmb.ishld" for release barrier [v7]
In-Reply-To:
References:
Message-ID:

On Mon, 15 Apr 2024 17:16:24 GMT, Andrew Haley wrote:

> Hi, I guess this isn't quite ready for review yet. I'll have another look when it is.

Is there any other gap I'm not aware of?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18467#issuecomment-2058099931

From kvn at openjdk.org  Tue Apr 16 02:09:01 2024
From: kvn at openjdk.org (Vladimir Kozlov)
Date: Tue, 16 Apr 2024 02:09:01 GMT
Subject: RFR: 8329433: Reduce nmethod header size [v3]
In-Reply-To:
References:
Message-ID:

On Tue, 16 Apr 2024 01:30:50 GMT, Dean Long wrote:

>> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Union fields which usages do not overlap
>
> src/hotspot/share/code/codeBlob.cpp line 88:
>
>> 86: S390_ONLY(_ctable_offset(0) COMMA)
>> 87: _header_size((uint16_t)header_size),
>> 88: _frame_complete_offset((int16_t)frame_complete_offset),
>
> Rather than a raw cast, it would be better to use checked_cast here, or better yet, change the incoming parameter types to match the field type. That way, if the caller is passing a constant, the compiler can check it at compile time.

Agree and will do. In all cases `sizeof(_Class_)` is used for `header_size`.
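
To make the difference between a raw cast and a checked one concrete, here is a minimal sketch. `checked_narrow` and `BlobHeaderSketch` are hypothetical names used only for illustration; this is not HotSpot's own helper or the real `CodeBlob` layout. The idea is simply to narrow, then assert that converting back yields the original value, so an out-of-range argument fails loudly in debug builds instead of being silently truncated.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a checked narrowing cast: convert, then assert
// that the round trip reproduces the input.
template <typename To, typename From>
inline To checked_narrow(From value) {
  To result = static_cast<To>(value);
  assert(static_cast<From>(result) == value && "value does not fit in target type");
  return result;
}

// Illustrative header with 16-bit fields initialised from int arguments.
struct BlobHeaderSketch {
  uint16_t header_size;
  int16_t  frame_complete_offset;

  BlobHeaderSketch(int header_size_in, int frame_complete_offset_in)
    : header_size(checked_narrow<uint16_t>(header_size_in)),
      frame_complete_offset(checked_narrow<int16_t>(frame_complete_offset_in)) {}
};
```

The alternative mentioned in the review, declaring the constructor parameters as `uint16_t` and `int16_t` directly, moves the check to the call site, where a compiler can diagnose an out-of-range constant without any runtime assert.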
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1566620168 From dlong at openjdk.org Tue Apr 16 02:31:00 2024 From: dlong at openjdk.org (Dean Long) Date: Tue, 16 Apr 2024 02:31:00 GMT Subject: RFR: 8329433: Reduce nmethod header size [v3] In-Reply-To: References: Message-ID: On Mon, 15 Apr 2024 03:24:07 GMT, Vladimir Kozlov wrote: >> This is part of changes which try to reduce size of `nmethod` and `codeblob` data vs code in CodeCache. >> These changes reduced size of `nmethod` header from 288 to 232 bytes. From 304 to 248 in optimized VM: >> >> Statistics for 1282 bytecoded nmethods for C2: >> total in heap = 5560352 (100%) >> header = 389728 (7.009053%) >> >> vs >> >> Statistics for 1322 bytecoded nmethods for C2: >> total in heap = 8307120 (100%) >> header = 327856 (3.946687%) >> >> >> Several unneeded fields in `nmethod` and `CodeBlob` were removed. Some fields were changed from `int` to `int16_t` with added corresponding asserts to make sure their values are fit into 16 bits. >> >> I did additional cleanup after recent `CompiledMethod` removal. >> >> Tested tier1-7,stress,xcomp and performance testing. > > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Union fields which usages do not overlap src/hotspot/share/code/nmethod.hpp line 282: > 280: _has_flushed_dependencies:1, // Used for maintenance of dependencies (under CodeCache_lock) > 281: _is_unlinked:1, // mark during class unloading > 282: _load_reported:1; // used by jvmti to track if an event has been posted for this nmethod It seems like the type could be changed from uint8_t to bool. 
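
As a rough illustration of what such a type change would mean (a sketch only; the structs below borrow the field names from the quoted snippet but are not the real `nmethod` layout), one-bit bit-fields declared as `bool` pack the same way as `uint8_t` ones on mainstream compilers. What differs is the conversion on assignment: a `bool` conversion maps any nonzero value to `true`, while a one-bit unsigned field keeps only the low bit.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag groups, for illustration only.
struct FlagsAsUint8 {
  uint8_t has_flushed_dependencies : 1;
  uint8_t is_unlinked : 1;
  uint8_t load_reported : 1;
};

struct FlagsAsBool {
  bool has_flushed_dependencies : 1;
  bool is_unlinked : 1;
  bool load_reported : 1;
};

// Stores v in a 1-bit uint8_t field: only the low bit survives
// (guaranteed since C++20; mainstream compilers behave this way earlier too).
inline unsigned store_in_uint8_bit(int v) {
  FlagsAsUint8 f{};
  f.is_unlinked = static_cast<uint8_t>(v);
  return f.is_unlinked;
}

// Stores v in a 1-bit bool field: any nonzero value becomes true.
inline bool store_in_bool_bit(int v) {
  FlagsAsBool f{};
  f.is_unlinked = static_cast<bool>(v);
  return f.is_unlinked;
}
```

With flags that are only ever set to 0 or 1 the two declarations behave identically, so the choice is mostly about expressing intent.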
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1566631312 From jbhateja at openjdk.org Tue Apr 16 02:31:02 2024 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 16 Apr 2024 02:31:02 GMT Subject: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v2] In-Reply-To: References: Message-ID: On Mon, 15 Apr 2024 22:04:14 GMT, Volodymyr Paprotski wrote: >> src/hotspot/cpu/x86/stubGenerator_x86_64_poly_mont.cpp line 394: >> >>> 392: __ lea(aLimbs, Address(aLimbs,8)); >>> 393: __ lea(bLimbs, Address(bLimbs,8)); >>> 394: __ jmp(L_DefaultLoop); >> >> Both sub and cmp are flag affecting instructions and are macro-fusible. >> By doing a loop rotation i.e. moving the length <= 0 check outside the loop and pushing the loop exit check at bottom you can save additional compare checks. > > Per-above, this is a switch statement (`UNLIKELY`) fallback. I can still add alignment and loop rotation, but being a fallback figured its more important to keep it small&readable... It's all part of intrinsic, no harm in polishing it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18583#discussion_r1566630667 From dlong at openjdk.org Tue Apr 16 02:36:59 2024 From: dlong at openjdk.org (Dean Long) Date: Tue, 16 Apr 2024 02:36:59 GMT Subject: RFR: 8329433: Reduce nmethod header size [v3] In-Reply-To: References: Message-ID: On Mon, 15 Apr 2024 03:24:07 GMT, Vladimir Kozlov wrote: >> This is part of changes which try to reduce size of `nmethod` and `codeblob` data vs code in CodeCache. >> These changes reduced size of `nmethod` header from 288 to 232 bytes. From 304 to 248 in optimized VM: >> >> Statistics for 1282 bytecoded nmethods for C2: >> total in heap = 5560352 (100%) >> header = 389728 (7.009053%) >> >> vs >> >> Statistics for 1322 bytecoded nmethods for C2: >> total in heap = 8307120 (100%) >> header = 327856 (3.946687%) >> >> >> Several unneeded fields in `nmethod` and `CodeBlob` were removed. 
Some fields were changed from `int` to `int16_t` with added corresponding asserts to make sure their values are fit into 16 bits. >> >> I did additional cleanup after recent `CompiledMethod` removal. >> >> Tested tier1-7,stress,xcomp and performance testing. > > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Union fields which usages do not overlap src/hotspot/share/code/nmethod.hpp line 205: > 203: // offsets to find the receiver for non-static native wrapper frames. > 204: ByteSize _native_receiver_sp_offset; > 205: ByteSize _native_basic_lock_sp_offset; Don't we need an assert in the accessor functions to make sure nmethod is native or not? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1566634384 From dlong at openjdk.org Tue Apr 16 02:53:59 2024 From: dlong at openjdk.org (Dean Long) Date: Tue, 16 Apr 2024 02:53:59 GMT Subject: RFR: 8330253: Skip verify_consistent_lock_order when deoptimizing from monitorenter bytecode. In-Reply-To: References: Message-ID: On Mon, 15 Apr 2024 09:15:44 GMT, Axel Boldt-Christmas wrote: > The verification added in [JDK-8329757](https://bugs.openjdk.org/browse/JDK-8329757) will not work then deoptimization occurs on a monitorenter bytecode. The locking may be in a transitional state. This patch will skip the verification when this occurs. > > Currently have only seen this reproduce with JVMTI when deoptimization occurs while a java thread is waiting on a contended monitor. However this could potentially be triggered from a VM entry slow path, so simply checking `current_pending_monitor` could be flaky as well. So instead simply avoid verification. > > Running JVMTI reproducer. Starting full testing soon. Can you describe the transitional state this fix avoids, and why it only needs to deal with monitorenter from a synchronized method prologue, and not also monitorenter from synchronized blocks. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18782#issuecomment-2058136922 From dlong at openjdk.org Tue Apr 16 02:57:03 2024 From: dlong at openjdk.org (Dean Long) Date: Tue, 16 Apr 2024 02:57:03 GMT Subject: RFR: 8330253: Skip verify_consistent_lock_order when deoptimizing from monitorenter bytecode. In-Reply-To: References: Message-ID: <4JWMKweoz_X2evkhBxecES_zBQuPs2vUkdSCbbzwAtc=.08d35c24-2f8e-4536-a251-3088fa47f25c@github.com> On Tue, 16 Apr 2024 02:50:41 GMT, Dean Long wrote: > Can you describe the transitional state this fix avoids, and why it only needs to deal with monitorenter from a synchronized method prologue, and not also monitorenter from synchronized blocks. Nevermind, I see that it does handle monitorenter for synchronized methods. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18782#issuecomment-2058139613 From dlong at openjdk.org Tue Apr 16 03:07:59 2024 From: dlong at openjdk.org (Dean Long) Date: Tue, 16 Apr 2024 03:07:59 GMT Subject: RFR: 8330253: Skip verify_consistent_lock_order when deoptimizing from monitorenter bytecode. In-Reply-To: References: Message-ID: On Mon, 15 Apr 2024 09:15:44 GMT, Axel Boldt-Christmas wrote: > The verification added in [JDK-8329757](https://bugs.openjdk.org/browse/JDK-8329757) will not work then deoptimization occurs on a monitorenter bytecode. The locking may be in a transitional state. This patch will skip the verification when this occurs. > > Currently have only seen this reproduce with JVMTI when deoptimization occurs while a java thread is waiting on a contended monitor. However this could potentially be triggered from a VM entry slow path, so simply checking `current_pending_monitor` could be flaky as well. So instead simply avoid verification. > > Running JVMTI reproducer. Starting full testing soon. 
src/hotspot/share/runtime/deoptimization.cpp line 451: > 449: if (!is_syncronized_entry && bc != Bytecodes::Code::_monitorenter) { > 450: deoptee_thread->lock_stack().verify_consistent_lock_order(lock_order, exec_mode != Deoptimization::Unpack_none); > 451: } The above checks would also hit the follow false positives: 1. deopt in counter overflow in prologue, not in monitorenter 2. monitorenter at bci 0 when raw_bci is -1 (assuming it got past the verifier) but seems mostly harmless to skip checks in those cases. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18782#discussion_r1566649663 From kvn at openjdk.org Tue Apr 16 03:09:00 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 16 Apr 2024 03:09:00 GMT Subject: RFR: 8329433: Reduce nmethod header size [v3] In-Reply-To: References: Message-ID: <64kGNHR9SmKW6rkPphO1my45Rte6w07v9V7Nf04GNN4=.0ac11f40-5e92-4367-82be-95410dca6ee5@github.com> On Tue, 16 Apr 2024 02:34:29 GMT, Dean Long wrote: >> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Union fields which usages do not overlap > > src/hotspot/share/code/nmethod.hpp line 205: > >> 203: // offsets to find the receiver for non-static native wrapper frames. >> 204: ByteSize _native_receiver_sp_offset; >> 205: ByteSize _native_basic_lock_sp_offset; > > Don't we need an assert in the accessor functions to make sure nmethod is native or not? 
I thought about that but in both places where these accessors are called (`frame::get_native_monitor()` and `frame::get_native_receiver()`) there are such asserts already: https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/frame.cpp#L1085 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1566650267 From kvn at openjdk.org Tue Apr 16 03:15:45 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 16 Apr 2024 03:15:45 GMT Subject: RFR: 8329433: Reduce nmethod header size [v3] In-Reply-To: References: Message-ID: On Tue, 16 Apr 2024 02:28:14 GMT, Dean Long wrote: >> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Union fields which usages do not overlap > > src/hotspot/share/code/nmethod.hpp line 282: > >> 280: _has_flushed_dependencies:1, // Used for maintenance of dependencies (under CodeCache_lock) >> 281: _is_unlinked:1, // mark during class unloading >> 282: _load_reported:1; // used by jvmti to track if an event has been posted for this nmethod > > It seems like the type could be changed from uint8_t to bool. Is there difference in generated code when you use bool instead of uint8_t? I used uint8_t to easy change to uint16_t in a future if needed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1566653582 From kvn at openjdk.org Tue Apr 16 03:31:25 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 16 Apr 2024 03:31:25 GMT Subject: RFR: 8329433: Reduce nmethod header size [v4] In-Reply-To: References: Message-ID: > This is part of changes which try to reduce size of `nmethod` and `codeblob` data vs code in CodeCache. > These changes reduced size of `nmethod` header from 288 to 232 bytes. 
From 304 to 248 in optimized VM: > > Statistics for 1282 bytecoded nmethods for C2: > total in heap = 5560352 (100%) > header = 389728 (7.009053%) > > vs > > Statistics for 1322 bytecoded nmethods for C2: > total in heap = 8307120 (100%) > header = 327856 (3.946687%) > > > Several unneeded fields in `nmethod` and `CodeBlob` were removed. Some fields were changed from `int` to `int16_t` with added corresponding asserts to make sure their values are fit into 16 bits. > > I did additional cleanup after recent `CompiledMethod` removal. > > Tested tier1-7,stress,xcomp and performance testing. Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: Use 16-bits types for header_size and frame_complete_offset arguments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18768/files - new: https://git.openjdk.org/jdk/pull/18768/files/13744e78..6cb22e81 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18768&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18768&range=02-03 Stats: 22 lines in 2 files changed: 0 ins; 2 del; 20 mod Patch: https://git.openjdk.org/jdk/pull/18768.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18768/head:pull/18768 PR: https://git.openjdk.org/jdk/pull/18768 From sviswanathan at openjdk.org Tue Apr 16 04:30:02 2024 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Tue, 16 Apr 2024 04:30:02 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v21] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: On Tue, 16 Apr 2024 00:04:15 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. >> >> Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. 
>> >> Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). >> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Add enter() and leave(); remove Windows-specific register stuff PR looks good to me now. Thanks a lot for considering all the inputs. ------------- Marked as reviewed by sviswanathan (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18555#pullrequestreview-2002553946 From ccheung at openjdk.org Tue Apr 16 04:34:59 2024 From: ccheung at openjdk.org (Calvin Cheung) Date: Tue, 16 Apr 2024 04:34:59 GMT Subject: RFR: 8323900: Avoid calling os::init_random() in CDS static dump In-Reply-To: References: Message-ID: On Wed, 10 Apr 2024 16:31:08 GMT, Ioi Lam wrote: > The purpose of the PR is to avoid modifying the global JVM state while dumping the CDS archive. > > When updating the identity hashcode for archived Symbols, call `ArchiveBuilder::current()->entropy()` instead of `os::random()`. As a result, CDS no longer needs to call `os::init_random()` with a deterministic seed. Looks good. ------------- Marked as reviewed by ccheung (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18728#pullrequestreview-2002558006 From dlong at openjdk.org Tue Apr 16 06:17:00 2024 From: dlong at openjdk.org (Dean Long) Date: Tue, 16 Apr 2024 06:17:00 GMT Subject: RFR: 8329433: Reduce nmethod header size [v3] In-Reply-To: References: Message-ID: On Mon, 15 Apr 2024 03:24:07 GMT, Vladimir Kozlov wrote: >> This is part of changes which try to reduce size of `nmethod` and `codeblob` data vs code in CodeCache. >> These changes reduced size of `nmethod` header from 288 to 232 bytes. 
From 304 to 248 in optimized VM: >> >> Statistics for 1282 bytecoded nmethods for C2: >> total in heap = 5560352 (100%) >> header = 389728 (7.009053%) >> >> vs >> >> Statistics for 1322 bytecoded nmethods for C2: >> total in heap = 8307120 (100%) >> header = 327856 (3.946687%) >> >> >> Several unneeded fields in `nmethod` and `CodeBlob` were removed. Some fields were changed from `int` to `int16_t` with added corresponding asserts to make sure their values are fit into 16 bits. >> >> I did additional cleanup after recent `CompiledMethod` removal. >> >> Tested tier1-7,stress,xcomp and performance testing. > > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Union fields which usages do not overlap src/hotspot/share/code/nmethod.cpp line 1235: > 1233: int skipped_insts_size = code_buffer->total_skipped_instructions_size(); > 1234: #ifdef ASSERT > 1235: assert(((skipped_insts_size >> 16) == 0), "size is bigger than 64Kb: %d", skipped_insts_size); Suggestion: I think it's simpler just to use checked_cast below. 
src/hotspot/share/code/nmethod.cpp line 1240: > 1238: int consts_offset = code_buffer->total_offset_of(code_buffer->consts()); > 1239: assert(consts_offset == 0, "const_offset: %d", consts_offset); > 1240: #endif Suggestion: src/hotspot/share/code/nmethod.cpp line 1241: > 1239: assert(consts_offset == 0, "const_offset: %d", consts_offset); > 1240: #endif > 1241: _skipped_instructions_size = (uint16_t)skipped_insts_size; Suggestion: _skipped_instructions_size = checked_cast<uint16_t>(code_buffer->total_skipped_instructions_size()); ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1566764300 PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1566765068 PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1566759786 From dlong at openjdk.org Tue Apr 16 06:23:02 2024 From: dlong at openjdk.org (Dean Long) Date: Tue, 16 Apr 2024 06:23:02 GMT Subject: RFR: 8329433: Reduce nmethod header size [v3] In-Reply-To: References: Message-ID: On Mon, 15 Apr 2024 03:24:07 GMT, Vladimir Kozlov wrote: >> This is part of changes which try to reduce size of `nmethod` and `codeblob` data vs code in CodeCache. >> These changes reduced size of `nmethod` header from 288 to 232 bytes. From 304 to 248 in optimized VM: >> >> Statistics for 1282 bytecoded nmethods for C2: >> total in heap = 5560352 (100%) >> header = 389728 (7.009053%) >> >> vs >> >> Statistics for 1322 bytecoded nmethods for C2: >> total in heap = 8307120 (100%) >> header = 327856 (3.946687%) >> >> >> Several unneeded fields in `nmethod` and `CodeBlob` were removed. Some fields were changed from `int` to `int16_t` with added corresponding asserts to make sure their values are fit into 16 bits. >> >> I did additional cleanup after recent `CompiledMethod` removal. >> >> Tested tier1-7,stress,xcomp and performance testing.
> > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Union fields which usages do not overlap src/hotspot/share/code/nmethod.cpp line 1441: > 1439: int deps_size = align_up((int)dependencies->size_in_bytes(), oopSize); > 1440: int sum_size = oops_size + metadata_size + deps_size; > 1441: assert((sum_size >> 16) == 0, "data size is bigger than 64Kb: %d", sum_size); I suggest using checked_cast for the assignment below, rather than special-purpose checks here. src/hotspot/share/code/nmethod.cpp line 1445: > 1443: _metadata_offset = (uint16_t)oops_size; > 1444: _dependencies_offset = _metadata_offset + (uint16_t)metadata_size; > 1445: _scopes_pcs_offset = _dependencies_offset + (uint16_t)deps_size; Use checked_cast instead of raw casts. src/hotspot/share/code/nmethod.cpp line 1459: > 1457: assert((data_offset() + data_end_offset) <= nmethod_size, "wrong nmethod's size: %d < %d", nmethod_size, (data_offset() + data_end_offset)); > 1458: > 1459: _entry_offset = (uint16_t)offsets->value(CodeOffsets::Entry); Use checked_cast. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1566771026 PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1566772567 PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1566773477 From dlong at openjdk.org Tue Apr 16 06:35:02 2024 From: dlong at openjdk.org (Dean Long) Date: Tue, 16 Apr 2024 06:35:02 GMT Subject: RFR: 8329433: Reduce nmethod header size [v3] In-Reply-To: References: Message-ID: On Mon, 15 Apr 2024 03:24:07 GMT, Vladimir Kozlov wrote: >> This is part of changes which try to reduce size of `nmethod` and `codeblob` data vs code in CodeCache. >> These changes reduced size of `nmethod` header from 288 to 232 bytes. 
From 304 to 248 in optimized VM: >> >> Statistics for 1282 bytecoded nmethods for C2: >> total in heap = 5560352 (100%) >> header = 389728 (7.009053%) >> >> vs >> >> Statistics for 1322 bytecoded nmethods for C2: >> total in heap = 8307120 (100%) >> header = 327856 (3.946687%) >> >> >> Several unneeded fields in `nmethod` and `CodeBlob` were removed. Some fields were changed from `int` to `int16_t` with added corresponding asserts to make sure their values are fit into 16 bits. >> >> I did additional cleanup after recent `CompiledMethod` removal. >> >> Tested tier1-7,stress,xcomp and performance testing. > > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Union fields which usages do not overlap src/hotspot/share/memory/heap.hpp line 58: > 56: void set_length(size_t length) { > 57: LP64_ONLY( assert(((length >> 32) == 0), "sanity"); ) > 58: _header._length = (uint32_t)length; Suggestion: _header._length = checked_cast<uint32_t>(length); src/hotspot/share/memory/heap.hpp line 63: > 61: // Accessors > 62: void* allocated_space() const { return (void*)(this + 1); } > 63: size_t length() const { return (size_t)_header._length; } This cast looks unnecessary.
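The `checked_cast` suggestions above replace a hand-written range assert followed by a raw cast with one call that does both. The following is a minimal stand-alone sketch of that idea. The names `fits_in` and `narrow_checked` are hypothetical, and HotSpot's own `checked_cast` may be implemented differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Illustrative helper only; not HotSpot's checked_cast.
// A value fits if converting to the target type and back reproduces it
// and the sign is unchanged by the conversion.
template <typename To, typename From>
bool fits_in(From value) {
  static_assert(std::is_integral<To>::value && std::is_integral<From>::value,
                "integral types only");
  const To narrowed = static_cast<To>(value);
  return static_cast<From>(narrowed) == value &&
         ((value < From(0)) == (narrowed < To(0)));
}

// Narrowing conversion that asserts (in debug builds) that no bits are lost.
template <typename To, typename From>
To narrow_checked(From value) {
  assert(fits_in<To>(value) && "value does not fit in the target type");
  return static_cast<To>(value);
}
```

With such a helper, an assignment like `_entry_offset = narrow_checked<uint16_t>(offset);` carries its own range check, so separate asserts of the form `(x >> 16) == 0` become redundant.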
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1566784458 PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1566784587 From dlong at openjdk.org Tue Apr 16 06:50:43 2024 From: dlong at openjdk.org (Dean Long) Date: Tue, 16 Apr 2024 06:50:43 GMT Subject: RFR: 8329433: Reduce nmethod header size [v3] In-Reply-To: <64kGNHR9SmKW6rkPphO1my45Rte6w07v9V7Nf04GNN4=.0ac11f40-5e92-4367-82be-95410dca6ee5@github.com> References: <64kGNHR9SmKW6rkPphO1my45Rte6w07v9V7Nf04GNN4=.0ac11f40-5e92-4367-82be-95410dca6ee5@github.com> Message-ID: <9wT-mL_BWh583PJEdw5DjgkbvqZB5abgPYsAUJMzTHA=.f62b51c8-b8c2-47b8-bcb5-57265523c75f@github.com> On Tue, 16 Apr 2024 03:06:13 GMT, Vladimir Kozlov wrote: >> src/hotspot/share/code/nmethod.hpp line 205: >> >>> 203: // offsets to find the receiver for non-static native wrapper frames. >>> 204: ByteSize _native_receiver_sp_offset; >>> 205: ByteSize _native_basic_lock_sp_offset; >> >> Don't we need an assert in the accessor functions to make sure nmethod is native or not? > > I thought about that but in both places where these accessors are called (`frame::get_native_monitor()` and `frame::get_native_receiver()`) there are such asserts already: > https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/frame.cpp#L1085 OK, but I'd rather see it in the accessors too. Some users are checking for method()->is_native() and others are checking for is_osr_method(), so we need to make sure those are always mutually exclusive: method()->is_native() != is_osr_method(). 
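To illustrate the request for asserts in the accessors themselves, here is a minimal sketch of a union whose members are only valid for one kind of method. `MiniMethod` and all of its members are hypothetical; the real `nmethod` fields and predicates differ.

```cpp
#include <cassert>

// Hypothetical miniature of a method whose payload depends on its kind.
class MiniMethod {
 public:
  enum class Kind { Normal, Native, Osr };

  static MiniMethod make_native(int receiver_sp_offset, int basic_lock_sp_offset) {
    MiniMethod m(Kind::Native);
    m._native.receiver_sp_offset = receiver_sp_offset;
    m._native.basic_lock_sp_offset = basic_lock_sp_offset;
    return m;
  }

  static MiniMethod make_osr(int entry_bci) {
    MiniMethod m(Kind::Osr);
    m._osr.entry_bci = entry_bci;
    return m;
  }

  bool is_native() const { return _kind == Kind::Native; }
  bool is_osr() const { return _kind == Kind::Osr; }

  // Each accessor checks that the union member it reads is the active one,
  // so a caller that forgets its own check still fails fast in debug builds.
  int native_receiver_sp_offset() const {
    assert(is_native() && "only valid for native wrappers");
    return _native.receiver_sp_offset;
  }
  int native_basic_lock_sp_offset() const {
    assert(is_native() && "only valid for native wrappers");
    return _native.basic_lock_sp_offset;
  }
  int osr_entry_bci() const {
    assert(is_osr() && "only valid for OSR methods");
    return _osr.entry_bci;
  }

 private:
  struct NativeOffsets { int receiver_sp_offset; int basic_lock_sp_offset; };
  struct OsrInfo { int entry_bci; };

  explicit MiniMethod(Kind k) : _kind(k), _native{0, 0} {}

  Kind _kind;
  union {               // the two payloads share storage
    NativeOffsets _native;
    OsrInfo _osr;
  };
};
```

Because a single `Kind` tag drives both predicates in this sketch, `is_native()` and `is_osr()` cannot both be true, which is the invariant the shared storage relies on.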
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1566806754 From dlong at openjdk.org Tue Apr 16 06:57:01 2024 From: dlong at openjdk.org (Dean Long) Date: Tue, 16 Apr 2024 06:57:01 GMT Subject: RFR: 8329433: Reduce nmethod header size [v3] In-Reply-To: References: Message-ID: On Tue, 16 Apr 2024 03:12:48 GMT, Vladimir Kozlov wrote: >> src/hotspot/share/code/nmethod.hpp line 282: >> >>> 280: _has_flushed_dependencies:1, // Used for maintenance of dependencies (under CodeCache_lock) >>> 281: _is_unlinked:1, // mark during class unloading >>> 282: _load_reported:1; // used by jvmti to track if an event has been posted for this nmethod >> >> It seems like the type could be changed from uint8_t to bool. > > Is there difference in generated code when you use bool instead of uint8_t? > I used uint8_t to easy change to uint16_t in a future if needed. I don't think uint8_t vs uint16_t matters, only if it is signed, unsigned, or bool. So if we have more than 8 individual :1 fields, it will expand to a 2nd byte. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1566814258 From mli at openjdk.org Tue Apr 16 07:17:01 2024 From: mli at openjdk.org (Hamlin Li) Date: Tue, 16 Apr 2024 07:17:01 GMT Subject: RFR: 8330266: RISC-V: Restore frm to RoundingMode::rne after JNI [v2] In-Reply-To: References: <6e_QPv6LVVN19HkQrQ2DyB_sXxqGqgwnclI2StdkeaY=.0537b78c-c2c1-4341-a914-f40f7117d72e@github.com> <0Y-y04yv-vcYPW5lYbwQbCNEmuwva8vCqLLBEZw9bs8=.141dedbc-b2c7-40bf-ad08-744720dca8ec@github.com> Message-ID: <9LU3bHt5W2Fdr2dfnf8xJpPlgVN0yDTgI7Um3g8ymF4=.1363434f-190e-4615-ac59-bf2e7349f831@github.com> On Mon, 15 Apr 2024 20:33:32 GMT, Vladimir Kempik wrote: >> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: >> >> refine code > > src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 1177: > >> 1175: // don't want non-IEEE rounding modes. 
>> 1176: guarantee(RoundingMode::rne == 0, "must be"); >> 1177: beq(tmp, zr, skip_fsrmi); // Only reset FRM if it's wrong > > is it really better (performance wise) than doing it always, unconditionaly (so minus frrm, minus beq) ? It's done by following reasons: 1. by the optimization guide, https://riscv-optimization-guide.riseproject.dev/#_controlling_rounding_behavior_scalar. 2. aarch64 apply the similar optimization. Please also check discussion at: https://github.com/openjdk/jdk/pull/18758 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18785#discussion_r1566838329 From vkempik at openjdk.org Tue Apr 16 07:23:00 2024 From: vkempik at openjdk.org (Vladimir Kempik) Date: Tue, 16 Apr 2024 07:23:00 GMT Subject: RFR: 8330266: RISC-V: Restore frm to RoundingMode::rne after JNI [v2] In-Reply-To: <9LU3bHt5W2Fdr2dfnf8xJpPlgVN0yDTgI7Um3g8ymF4=.1363434f-190e-4615-ac59-bf2e7349f831@github.com> References: <6e_QPv6LVVN19HkQrQ2DyB_sXxqGqgwnclI2StdkeaY=.0537b78c-c2c1-4341-a914-f40f7117d72e@github.com> <0Y-y04yv-vcYPW5lYbwQbCNEmuwva8vCqLLBEZw9bs8=.141dedbc-b2c7-40bf-ad08-744720dca8ec@github.com> <9LU3bHt5W2Fdr2dfnf8xJpPlgVN0yDTgI7Um3g8ymF4=.1363434f-190e-4615-ac59-bf2e7349f831@github.com> Message-ID: On Tue, 16 Apr 2024 07:14:17 GMT, Hamlin Li wrote: >> src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 1177: >> >>> 1175: // don't want non-IEEE rounding modes. >>> 1176: guarantee(RoundingMode::rne == 0, "must be"); >>> 1177: beq(tmp, zr, skip_fsrmi); // Only reset FRM if it's wrong >> >> is it really better (performance wise) than doing it always, unconditionaly (so minus frrm, minus beq) ? > > It's done by following reasons: > 1. by the optimization guide, https://riscv-optimization-guide.riseproject.dev/#_controlling_rounding_behavior_scalar. > 2. aarch64 apply the similar optimization. 
> > Please also check discussion at: https://github.com/openjdk/jdk/pull/18758 Thanks for the links, however there are no performance claims in that discussion, only few "maybes" for "some hardware", can we check on existing h/w ( c910, u74) ? with jmh test doing dummy jni calls ? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18785#discussion_r1566844775 From dlong at openjdk.org Tue Apr 16 07:25:03 2024 From: dlong at openjdk.org (Dean Long) Date: Tue, 16 Apr 2024 07:25:03 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v16] In-Reply-To: References: Message-ID: <6jMVuGzeAz4eKC6_JTAcPtPKtT_awX7JmAUy9UWk18U=.298d0d4e-976d-4ed5-8274-5dc2cc9a0603@github.com> On Mon, 15 Apr 2024 15:08:31 GMT, Andrew Haley wrote: >> This PR is a redesign of subtype checking. >> >> The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. >> >> So what's changed, so that the old design should be replaced? >> >> Firstly, the computers of today aren't the computers of twenty years ago. It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. >> >> The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces). Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. >> >> Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch. 
>> >> However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions. >> >> The solution >> ------------ >> >> We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. >> >> We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bimap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. If the bit is set, we have to do a little arithmetic and then consult the hash table. >> >> It works like th... > > Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: > > JDK-8180450: secondary_super_cache does not scale well Marked as reviewed by dlong (Reviewer). 
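The quoted description of the hashed secondary supers lookup (a 64-slot table with linear probing, compressed by dropping empty slots, plus an occupancy bitmap) can be sketched as follows. This is a simplified model under stated assumptions: types are plain integer ids, the hash is `id & 63`, and the class name `PackedSupers` is hypothetical. It is not the HotSpot implementation, whose hash function and probing details differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of a compressed 64-way hash table with an occupancy bitmap.
class PackedSupers {
 public:
  explicit PackedSupers(const std::vector<uint32_t>& ids) {
    assert(ids.size() <= 64 && "this sketch holds at most 64 entries");
    uint32_t slots[64] = {0};
    for (uint32_t id : ids) {
      unsigned s = home_slot(id);
      // Linear probing: take the next free slot, wrapping around.
      while (_bitmap & (uint64_t(1) << s)) s = (s + 1) & 63;
      _bitmap |= uint64_t(1) << s;
      slots[s] = id;
    }
    // Compress: keep only the occupied slots, in slot order.
    for (unsigned s = 0; s < 64; s++) {
      if (_bitmap & (uint64_t(1) << s)) _packed.push_back(slots[s]);
    }
  }

  bool contains(uint32_t id) const {
    unsigned s = home_slot(id);
    for (unsigned probes = 0; probes < 64; probes++) {
      // A clear bit proves absence without touching the array.
      if ((_bitmap & (uint64_t(1) << s)) == 0) return false;
      if (_packed[rank(s)] == id) return true;
      s = (s + 1) & 63;
    }
    return false;
  }

  size_t size() const { return _packed.size(); }

 private:
  static unsigned home_slot(uint32_t id) { return id & 63; }

  // Index of slot s in the packed array: number of occupied slots below s.
  unsigned rank(unsigned s) const {
    return popcount(_bitmap & ((uint64_t(1) << s) - 1));
  }

  static unsigned popcount(uint64_t x) {
    unsigned n = 0;
    while (x) { x &= x - 1; n++; }
    return n;
  }

  uint64_t _bitmap = 0;
  std::vector<uint32_t> _packed;
};
```

In the common negative case the lookup is a single bit test; the packed array is read only when the bit is set, and it can still be scanned linearly by code that ignores the bitmap.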
------------- PR Review: https://git.openjdk.org/jdk/pull/18309#pullrequestreview-2002808835 From aph at openjdk.org Tue Apr 16 07:38:44 2024 From: aph at openjdk.org (Andrew Haley) Date: Tue, 16 Apr 2024 07:38:44 GMT Subject: RFR: 8322770: Implement C2 VectorizedHashCode on AArch64 In-Reply-To: <2VKOC-rT0vOyMcXUX2gs3sOrbZ5H79KBIo50sOOVmyI=.1936f78e-794c-4f54-af3c-b1b97e5fafa8@github.com> References: <2VKOC-rT0vOyMcXUX2gs3sOrbZ5H79KBIo50sOOVmyI=.1936f78e-794c-4f54-af3c-b1b97e5fafa8@github.com> Message-ID: <5A8NNOUFaH-momsXpI1F_idEp8wXdzWwMcsaU6xzMbw=.5e251738-3845-45c7-be26-003c823f7a7d@github.com> On Tue, 26 Mar 2024 13:59:12 GMT, Mikhail Ablakatov wrote: > Hello, > > Please review the following PR for [JDK-8322770 Implement C2 VectorizedHashCode on AArch64](https://bugs.openjdk.org/browse/JDK-8322770). It follows previous work done in https://github.com/openjdk/jdk/pull/16629 and https://github.com/openjdk/jdk/pull/10847 for RISC-V and x86 respectively. > > The code to calculate a hash code consists of two parts: a vectorized loop of Neon instruction that process 4 or 8 elements per iteration depending on the data type and a fully unrolled scalar "loop" that processes up to 7 tail elements. > > At the time of writing this I don't see potential benefits from providing SVE/SVE2 implementation, but it could be added as a follow-up or independently later if required. > > # Performance > > ## Neoverse N1 >
> --------------------------------------------------------------------------------------------
> Version                                            Baseline              This patch
> --------------------------------------------------------------------------------------------
> Benchmark                  (size)  Mode  Cnt      Score    Error      Score     Error  Units
> --------------------------------------------------------------------------------------------
> ArraysHashCode.bytes            1  avgt   15      1.249 ±  0.060      1.247 ±   0.062  ns/op
> ArraysHashCode.bytes           10  avgt   15      8.754 ±  0.028      4.387 ±   0.015  ns/op
> ArraysHashCode.bytes          100  avgt   15     98.596 ±  0.051     26.655 ±   0.097  ns/op
> ArraysHashCode.bytes        10000  avgt   15  10150.578 ±  1.352   2649.962 ± 216.744  ns/op
> ArraysHashCode.chars            1  avgt   15      1.286 ±  0.062      1.246 ±   0.054  ns/op
> ArraysHashCode.chars           10  avgt   15      8.731 ±  0.002      5.344 ±   0.003  ns/op
> ArraysHashCode.chars          100  avgt   15     98.632 ±  0.048     23.023 ±   0.142  ns/op
> ArraysHashCode.chars        10000  avgt   15  10150.658 ±  3.374   2410.504 ±   8.872  ns/op
> ArraysHashCode.ints             1  avgt   15      1.189 ±  0.005      1.187 ±   0.001  ns/op
> ArraysHashCode.ints            10  avgt   15      8.730 ±  0.002      5.676 ±   0.001  ns/op
> ArraysHashCode.ints           100  avgt   15     98.559 ±  0.016     24.378 ±   0.006  ns/op
> ArraysHashCode.ints         10000  avgt   15  10148.752 ±  1.336   2419.015 ±   0.492  ns/op
> ArraysHashCode.multibytes       1  avgt   15      1.037 ±  0.001      1.037 ±   0.001  ns/op
> ArraysHashCode.multibytes      10  avgt   15      5.4...
src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.cpp line 76: > 74: } > 75: > 76: /** `//` comments here. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18487#discussion_r1566867723 From mli at openjdk.org Tue Apr 16 07:58:59 2024 From: mli at openjdk.org (Hamlin Li) Date: Tue, 16 Apr 2024 07:58:59 GMT Subject: RFR: 8330266: RISC-V: Restore frm to RoundingMode::rne after JNI [v2] In-Reply-To: References: <6e_QPv6LVVN19HkQrQ2DyB_sXxqGqgwnclI2StdkeaY=.0537b78c-c2c1-4341-a914-f40f7117d72e@github.com> <0Y-y04yv-vcYPW5lYbwQbCNEmuwva8vCqLLBEZw9bs8=.141dedbc-b2c7-40bf-ad08-744720dca8ec@github.com> <9LU3bHt5W2Fdr2dfnf8xJpPlgVN0yDTgI7Um3g8ymF4=.1363434f-190e-4615-ac59-bf2e7349f831@github.com> Message-ID: On Tue, 16 Apr 2024 07:19:36 GMT, Vladimir Kempik wrote: >> It's done by following reasons: >> 1. by the optimization guide, https://riscv-optimization-guide.riseproject.dev/#_controlling_rounding_behavior_scalar. >> 2. aarch64 apply the similar optimization.
>> >> Please also check discussion at: https://github.com/openjdk/jdk/pull/18758 > > Thanks for the links, however there are no performance claims in that discussion, only few "maybes" for "some hardware", can we check on existing h/w ( c910, u74) ? with jmh test doing dummy jni calls ? Thanks for the suggesion. c910 might be suitable for this kind test (u74 is in order, so not?), but I don't have the hardware now. If anyone happens to have it and can help test it that would be helpful. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18785#discussion_r1566897616 From ihse at openjdk.org Tue Apr 16 08:37:02 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Tue, 16 Apr 2024 08:37:02 GMT Subject: RFR: 8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: References: <-XeYeJ0OEmauTYsEoSXxzRmQXSKMOLw87GSpqDnEmug=.5cb7e71f-fea6-4a84-8260-5f515d3d3810@github.com> Message-ID: On Mon, 15 Apr 2024 13:10:22 GMT, Martin Doerr wrote: >> It was too bad that I did not see and review this change in the makefiles. :-( >> >> While you guys could have gone either way, I strongly dislike the choice to include a redefinition in the makefiles. If this really should be done, we should introduced a new variable to carry such changes, instead of piggybacking it with the OS defines. :-( But, I don't think it should be done at all. >> >> There are several reasons why this is a inferior solution: >> >> 1) It does not follow prior examples. We have tried hard before not do things like this, but rather pass flags as defines (e.g. `-DREDEFINE_ALLOCA` had been better) >> 2) It does not scale. If we start in effect allowing code in the command line, there is no clear limit anymore what should be placed in the source code files and what should be placed on the command line. >> 3) It messes up command lines. Keeping command lines as short as reasonable possible is a goal we try to strive for. 
In this case, there is also the `'` inside them (which I don't understand why), which is just begging for quoting/escaping problems, making command lines hard to copy/paste, send to different systems (like logging) etc. >> >> I'd really like to see a follow-up PR that moves this away from the command line define and into a source code file instead. > > Can we unconditionally `#include <alloca.h>` in all files which use `alloca`? Or does that disturb any platform? I don't think it does. From the documentation I've checked, `#include <alloca.h>` is valid everywhere. I'm not sure why the fact that it is a compiler-builtin is relevant here. The man page for alloca on Linux says: SYNOPSIS #include <alloca.h> void *alloca(size_t size); which I take it as this is the formally correct way to use alloca. If it happens to work without the include file, that's just coincidence, and not a sign that we should remove `#include <alloca.h>` everywhere. @kimbarrett What do you say? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1566964500 From david.holmes at oracle.com Tue Apr 16 08:38:33 2024 From: david.holmes at oracle.com (David Holmes) Date: Tue, 16 Apr 2024 18:38:33 +1000 Subject: CFV: New HotSpot Group Member: Afshin Zafari In-Reply-To: <5088FFE6-F5E5-4B57-8FB9-B5F6672C7D7F@oracle.com> References: <5088FFE6-F5E5-4B57-8FB9-B5F6672C7D7F@oracle.com> Message-ID: <1f472b33-89f2-4947-91d8-14f8fdbe0aa9@oracle.com> Vote: yes David On 10/04/2024 10:24 pm, Jesper Wilhelmsson wrote: > I hereby nominate Afshin Zafari (azafari) to Membership in the HotSpot Group. > > Afshin is a Committer in the JDK project, and a member of the Oracle JVM Runtime team. He has fixed 42 issues including several significant changes in various parts of the JVM runtime and has lately focused on NMT improvements. > > Votes are due by April 24, 2024. > > Only current Members of the HotSpot Group [1] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list.
> > For Lazy Consensus voting instructions, see [2]. > > Thanks, > /Jesper > > [1] https://openjdk.org/census > [2] https://openjdk.org/groups/#member-vote From david.holmes at oracle.com Tue Apr 16 08:38:59 2024 From: david.holmes at oracle.com (David Holmes) Date: Tue, 16 Apr 2024 18:38:59 +1000 Subject: CFV: New HotSpot Group Member: Andrew Dinn In-Reply-To: References: Message-ID: Vote: yes David On 11/04/2024 11:24 pm, Thomas Stuefe wrote: > Hi, > > I hereby nominate Andrew Dinn (adinn) to Membership in the HotSpot Group. > > Andrew is a well-known and respected member of the OpenJDK community. He > has been a contributor since the early days of OpenJDK. > > The history of his contributions has been mangled by various SCM moves > and repo consolidations over the years [1], but he was one of the > original authors of the arm64 port ([2] shows 359 changes in the > mercurial hotspot sub repository alone), contributed JEP 352 (support > for NVM devices under byte buffers), and more recently has been active > in the Graal and the Leyden projects. > > Votes are due by April 25, 2024. > > Only current Members of the HotSpot Group [3] are eligible to vote on > this nomination. Votes must be cast in the open by replying to this > mailing list. > > For Lazy Consensus voting instructions, see [4].
> > Cheers, Thomas > > [1]https://github.com/openjdk/jdk/commits/master/?author=adinn > > [2]https://hg.openjdk.org/aarch64-port/jdk7u/hotspot > > [3]https://openjdk.org/census#members > [4]https://openjdk.org/groups/#member-vote > > From david.holmes at oracle.com Tue Apr 16 08:39:34 2024 From: david.holmes at oracle.com (David Holmes) Date: Tue, 16 Apr 2024 18:39:34 +1000 Subject: CFV: New HotSpot Group Member: Fredrik Bredberg In-Reply-To: <0291F74B-D724-4B97-B9D0-5FC57FA0F302@oracle.com> References: <0291F74B-D724-4B97-B9D0-5FC57FA0F302@oracle.com> Message-ID: <7fd142af-2c82-436b-8e83-44cba809b0a5@oracle.com> Vote: yes David On 10/04/2024 10:24 pm, Jesper Wilhelmsson wrote: > I hereby nominate Fredrik Bredberg (fbredberg) to Membership in the HotSpot Group. > > Fredrik is a Committer in the JDK project, and a member of the Oracle JVM Runtime team. Fredrik has mainly focused his efforts in the Loom area and is frequently helping out with platform specific (including assembler) code for other areas as well. > > Votes are due by April 24, 2024. > > Only current Members of the HotSpot Group [1] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [2]. > > Thanks, > /Jesper > > [1] https://openjdk.org/census > [2] https://openjdk.org/groups/#member-vote From jkern at openjdk.org Tue Apr 16 08:51:47 2024 From: jkern at openjdk.org (Joachim Kern) Date: Tue, 16 Apr 2024 08:51:47 GMT Subject: RFR: 8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: References: <-XeYeJ0OEmauTYsEoSXxzRmQXSKMOLw87GSpqDnEmug=.5cb7e71f-fea6-4a84-8260-5f515d3d3810@github.com> Message-ID: On Wed, 10 Apr 2024 22:14:33 GMT, Kim Barrett wrote: >> That build failure in shared code does not happen with Xcode clang, gcc, or >> Visual Studio, even though none of them appear to have a relevant define or >> include. 
So the clang variant being used for AIX is different from the Xcode >> clang variant (and maybe others) in its treatment of alloca. Weird! >> >> I can also live with either the macro or the includes where needed. I dislike >> conditionally adding the include in globalDefinitions_gcc.hpp. > > Should also remove the `#pragma alloca` in os_aix.cpp. We can not use `#include <alloca.h>` in all files which use `alloca`, because windows does not know this header. Maybe we can use `#include <alloca.h>` unconditionally in globalDefinitions_gcc.hpp, if windows will never use this file. @kimbarrett What do you say? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1566988309 From aph at openjdk.org Tue Apr 16 09:13:42 2024 From: aph at openjdk.org (Andrew Haley) Date: Tue, 16 Apr 2024 09:13:42 GMT Subject: RFR: 8325821: [REDO] use "dmb.ishst+dmb.ishld" for release barrier [v7] In-Reply-To: References: Message-ID: On Tue, 16 Apr 2024 02:00:43 GMT, kuaiwei wrote: > > Hi, I guess this isn't quite ready for review yet. I'll have another look when it is. > > Is there any other gap I'm not aware? Well, you're asking me to speculate on what you're aware of, but the very first thing I see when I run "java -version" with this patch is this, so I assume you're not finished.
0x0000ffffe8ad2750: str w11, [x0, #0x14] ;*invokespecial {reexecute=0 rethrow=0 return_oop=0}
 ; - java.lang.StringLatin1::replace@123 (line 427)
 ;; membar_release
0x0000ffffe8ad2754: dmb ishld
0x0000ffffe8ad2758: dmb ishst ;*new {reexecute=0 rethrow=0 return_oop=0}
 ; - java.lang.StringLatin1::replace@116 (line 427)
 ;; membar_release
0x0000ffffe8ad275c: dmb ishld
0x0000ffffe8ad2760: dmb ishst ;*synchronization entry
 ; - java.lang.StringLatin1::replace@-1 (line 408)
------------- PR Comment: https://git.openjdk.org/jdk/pull/18467#issuecomment-2058612717 From eosterlund at openjdk.org Tue Apr 16 09:16:42 2024 From: eosterlund at openjdk.org (Erik Österlund) Date: Tue, 16 Apr 2024 09:16:42 GMT Subject: RFR: 8329088: Stack chunk thawing races with concurrent GC stack iteration [v2] In-Reply-To: References: Message-ID: <4fbVu6vtnp9zIlOmO_Vmzgfhj8jpvolEY4m8EKFR3a4=.4a28ed73-9187-48ed-bf7b-cbd7988a3150@github.com> On Fri, 12 Apr 2024 14:40:09 GMT, Patricio Chilano Mateo wrote: >> Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: >> >> Nits > > src/hotspot/share/runtime/continuationFreezeThaw.cpp line 631: > >> 629: chunk->set_max_thawing_size(cont_size()); >> 630: chunk->set_bottom(chunk_start_sp - _cont.argsize()); >> 631: chunk->set_sp(chunk->bottom()); > > Do we need to set sp? We didn't do it before. We used to set sp to stack_size() before to say the stack is_empty during thaw. It had an imprecise and fuzzy notion of being empty that accepted various different bottoms depending on argsize. The new definition of is_empty is that bottom == sp, precisely. I don't want to update bottom during thaw, only when the chunk is initialized. Now that we are re-initializing the chunk with a new bottom (due to different arg size), we have to set the sp to bottom to signify that the chunk is_empty() right now.
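For readers following the is_empty() discussion, here is a minimal sketch of the "store bottom, derive argsize" model described in this thread. It is a simplified stand-alone model, not the HotSpot `stackChunkOopDesc`: the struct `ChunkModel`, its method names, and the formula `argsize = stack_size - bottom` (which ignores any frame metadata words) are assumptions made for illustration only.

```cpp
#include <cassert>

// Hypothetical, simplified model of a stack chunk. Offsets are in words and
// the stack grows towards lower offsets. This is NOT the HotSpot class.
struct ChunkModel {
  int stack_size;  // total number of words in the chunk
  int bottom;      // stored; written only when freeze (re)initializes the chunk
  int sp;          // the only field that thawing updates

  // Re-initialize an empty chunk for a bottom frame with the given argsize.
  // Setting sp to bottom is what marks the chunk as empty.
  void init_for_freeze(int argsize) {
    bottom = stack_size - argsize;
    sp = bottom;
  }

  // Empty means exactly "sp == bottom"; no dependence on argsize.
  bool is_empty() const {
    assert(sp <= bottom);
    return sp == bottom;
  }

  // argsize is derived from bottom. Asking for it on an empty chunk is
  // treated as a caller error, mirroring the assert mentioned in the thread.
  int argsize() const {
    assert(!is_empty());
    return stack_size - bottom;
  }

  // Freezing pushes frames by lowering sp.
  void freeze_frame(int words) {
    assert(words <= sp);
    sp -= words;
  }

  // Thawing the last frame touches sp only; bottom is left alone, so a
  // concurrent reader cannot see a mismatched (sp, argsize) pair.
  void thaw_all() { sp = bottom; }
};
```

In this model a concurrent reader that sees either the old or the new `sp` still computes a consistent answer, because `bottom` does not change during thaw.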
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18643#discussion_r1567023174 From ihse at openjdk.org Tue Apr 16 09:18:03 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Tue, 16 Apr 2024 09:18:03 GMT Subject: RFR: 8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: References: <-XeYeJ0OEmauTYsEoSXxzRmQXSKMOLw87GSpqDnEmug=.5cb7e71f-fea6-4a84-8260-5f515d3d3810@github.com> Message-ID: On Tue, 16 Apr 2024 08:49:02 GMT, Joachim Kern wrote: >> Should also remove the `#pragma alloca` in os_aix.cpp. > > We can not use `#include <alloca.h>` in all files which use `alloca`, because windows does not know this header. Maybe we can use `#include <alloca.h>` unconditionally in globalDefinitions_gcc.hpp, if windows will never use this file. @kimbarrett What do you say? That was kind of where the discussion started, and which Kim did not like. If I read him correctly, his suggestion was instead to place:
#if defined(_AIX)
#include <alloca.h>
#endif
in the files where `alloca` is needed on AIX. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1567026294 From ihse at openjdk.org Tue Apr 16 09:18:03 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Tue, 16 Apr 2024 09:18:03 GMT Subject: RFR: 8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: References: <-XeYeJ0OEmauTYsEoSXxzRmQXSKMOLw87GSpqDnEmug=.5cb7e71f-fea6-4a84-8260-5f515d3d3810@github.com> Message-ID: <18WjPZeDIWkxGIB0BJgyDg5VipCtY4EOlWmIGPWZGCw=.b50cf4a9-61a4-421e-97eb-3dbac94c14f9@github.com> On Tue, 16 Apr 2024 09:14:25 GMT, Magnus Ihse Bursie wrote: >> We can not use `#include <alloca.h>` in all files which use `alloca`, because windows does not know this header. Maybe we can use `#include <alloca.h>` unconditionally in globalDefinitions_gcc.hpp, if windows will never use this file. @kimbarrett What do you say? > > That was kind of where the discussion started, and which Kim did not like.
If I read him correctly, his suggestion was instead to place: > > #if defined(_AIX) > #include <alloca.h> > #endif > > in the files where `alloca` is needed on AIX. (If some of these files happen to be files which are not compiled on Windows, I assume it will not hurt to drop the ifdef guard, but then again, it can certainly be kept as well for consistency.) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1567027594 From johan.sjolen at oracle.com Tue Apr 16 09:18:56 2024 From: johan.sjolen at oracle.com (Johan Sjolen) Date: Tue, 16 Apr 2024 09:18:56 +0000 Subject: CFV: New HotSpot Group Member: Afshin Zafari In-Reply-To: <5088FFE6-F5E5-4B57-8FB9-B5F6672C7D7F@oracle.com> References: <5088FFE6-F5E5-4B57-8FB9-B5F6672C7D7F@oracle.com> Message-ID: Vote: yes Johan From eosterlund at openjdk.org Tue Apr 16 09:21:44 2024 From: eosterlund at openjdk.org (Erik Österlund) Date: Tue, 16 Apr 2024 09:21:44 GMT Subject: RFR: 8329088: Stack chunk thawing races with concurrent GC stack iteration [v2] In-Reply-To: References: Message-ID: On Fri, 12 Apr 2024 15:07:03 GMT, Patricio Chilano Mateo wrote: >> Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: >> >> Nits > > src/hotspot/share/runtime/continuationFreezeThaw.cpp line 567: > >> 565: // Consider leaving the chunk's argsize set when emptying it and removing the following branch, >> 566: // although that would require changing stackChunkOopDesc::is_empty >> 567: if (!chunk->is_empty()) { > > Seems you have implemented the suggestion in the comment so we can remove this branch and unconditionally decrement total_size_needed. I currently have an assert that checks that you shouldn't be asking for the argsize() if the chunk is empty, because it is so error prone. I think I'd like to keep the assert though - it was quite useful.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18643#discussion_r1567032762 From aph at openjdk.org Tue Apr 16 09:26:01 2024 From: aph at openjdk.org (Andrew Haley) Date: Tue, 16 Apr 2024 09:26:01 GMT Subject: RFR: 8322770: Implement C2 VectorizedHashCode on AArch64 In-Reply-To: <2VKOC-rT0vOyMcXUX2gs3sOrbZ5H79KBIo50sOOVmyI=.1936f78e-794c-4f54-af3c-b1b97e5fafa8@github.com> References: <2VKOC-rT0vOyMcXUX2gs3sOrbZ5H79KBIo50sOOVmyI=.1936f78e-794c-4f54-af3c-b1b97e5fafa8@github.com> Message-ID: On Tue, 26 Mar 2024 13:59:12 GMT, Mikhail Ablakatov wrote: > Hello, > > Please review the following PR for [JDK-8322770 Implement C2 VectorizedHashCode on AArch64](https://bugs.openjdk.org/browse/JDK-8322770). It follows previous work done in https://github.com/openjdk/jdk/pull/16629 and https://github.com/openjdk/jdk/pull/10847 for RISC-V and x86 respectively. > > The code to calculate a hash code consists of two parts: a vectorized loop of Neon instruction that process 4 or 8 elements per iteration depending on the data type and a fully unrolled scalar "loop" that processes up to 7 tail elements. > > At the time of writing this I don't see potential benefits from providing SVE/SVE2 implementation, but it could be added as a follow-up or independently later if required. > > # Performance > > ## Neoverse N1 > >
> --------------------------------------------------------------------------------------------
> Version                                          Baseline                This patch
> --------------------------------------------------------------------------------------------
> Benchmark                  (size)  Mode  Cnt        Score     Error        Score     Error  Units
> --------------------------------------------------------------------------------------------
> ArraysHashCode.bytes            1  avgt   15        1.249 ±   0.060        1.247 ±   0.062  ns/op
> ArraysHashCode.bytes           10  avgt   15        8.754 ±   0.028        4.387 ±   0.015  ns/op
> ArraysHashCode.bytes          100  avgt   15       98.596 ±   0.051       26.655 ±   0.097  ns/op
> ArraysHashCode.bytes        10000  avgt   15    10150.578 ±   1.352     2649.962 ± 216.744  ns/op
> ArraysHashCode.chars            1  avgt   15        1.286 ±   0.062        1.246 ±   0.054  ns/op
> ArraysHashCode.chars           10  avgt   15        8.731 ±   0.002        5.344 ±   0.003  ns/op
> ArraysHashCode.chars          100  avgt   15       98.632 ±   0.048       23.023 ±   0.142  ns/op
> ArraysHashCode.chars        10000  avgt   15    10150.658 ±   3.374     2410.504 ±   8.872  ns/op
> ArraysHashCode.ints             1  avgt   15        1.189 ±   0.005        1.187 ±   0.001  ns/op
> ArraysHashCode.ints            10  avgt   15        8.730 ±   0.002        5.676 ±   0.001  ns/op
> ArraysHashCode.ints           100  avgt   15       98.559 ±   0.016       24.378 ±   0.006  ns/op
> ArraysHashCode.ints         10000  avgt   15    10148.752 ±   1.336     2419.015 ±   0.492  ns/op
> ArraysHashCode.multibytes       1  avgt   15        1.037 ±   0.001        1.037 ±   0.001  ns/op
> ArraysHashCode.multibytes      10  avgt   15        5.4...
Why are you adding across lanes every time around the loop? You could maintain all of the lanes and then merge the lanes in the tail. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18487#issuecomment-2058637287 From eosterlund at openjdk.org Tue Apr 16 09:36:13 2024 From: eosterlund at openjdk.org (Erik Österlund) Date: Tue, 16 Apr 2024 09:36:13 GMT Subject: RFR: 8329088: Stack chunk thawing races with concurrent GC stack iteration [v3] In-Reply-To: References: Message-ID: > When we thaw the last frame from a stack chunk, we non-atomically set the stack pointer (sp), and set its argsize to 0. Unfortunately, GC threads may iterate over the frames of the stack chunk concurrently. When initializing their stack frame iterator, they read the sp and argsize racingly. Since there is no synchronization between the threads, we may observe inconsistent pairs of sp and argsize, for example the updated sp with a stale argsize, or the updated argsize with a stale sp. > > At the core of the problem, the stack chunks define sp and argsize. The argsize is used to calculate where the bottom of the stack chunk is, which is required to determine if it is empty or not.
This patch proposes to switch things around and store the bottom directly in the chunk, instead of argsize. Instead, argsize is calculated from the bottom. By changing the relationship of which property is stored and which property is calculated, we can simplify this code quite a bit. > > In the new model, is_empty() is true iff sp and bottom are exactly the same. Bottom is only set during freezing, never during thawing. The bottom is initialized whenever the bottom frame is frozen, and left untouched during thawing. Unlike thawing, the freeze operation does not race with the GC by design. Hence we have moved one of the racy mutations to the operation that doesn't race with the GC. The GC is now only exposed to changing sp(). It doesn't matter if it observes the old or new sp(), now that we have removed the only source of inconsistency describing said frame (racing argsize). > > Testing: tier1-5, manual testing of test/jdk/jdk/internal/vm/Continuation Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: Partricio fixes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18643/files - new: https://git.openjdk.org/jdk/pull/18643/files/40ea7943..170b3184 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18643&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18643&range=01-02 Stats: 2 lines in 2 files changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18643.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18643/head:pull/18643 PR: https://git.openjdk.org/jdk/pull/18643 From eosterlund at openjdk.org Tue Apr 16 09:36:13 2024 From: eosterlund at openjdk.org (Erik Österlund) Date: Tue, 16 Apr 2024 09:36:13 GMT Subject: RFR: 8329088: Stack chunk thawing races with concurrent GC stack iteration [v3] In-Reply-To: References: Message-ID: <3z6Uj5UfhD4NBWVasLhz3vk3NcU9LAqeTQPgZFcrjig=.1048230b-4275-4a95-8b26-0934d55ad5d5@github.com> On Fri, 12 Apr
2024 15:16:33 GMT, Patricio Chilano Mateo wrote: >> Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: >> >> Partricio fixes > > src/hotspot/share/oops/stackChunkOop.inline.hpp line 135: > >> 133: >> 134: inline bool stackChunkOopDesc::is_empty() const { >> 135: assert(sp() <= stack_size(), ""); > > Maybe keep assert(sp() <= bottom(), "")? Fixed! > src/hotspot/share/runtime/continuationFreezeThaw.cpp line 662: > >> 660: >> 661: void FreezeBase::freeze_fast_copy(stackChunkOop chunk, int chunk_start_sp CONT_JFR_ONLY(COMMA bool chunk_is_allocated)) { >> 662: assert(chunk != nullptr, ""); > > Isn't this assert still valid? It is. Fixed! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18643#discussion_r1567051769 PR Review Comment: https://git.openjdk.org/jdk/pull/18643#discussion_r1567051537 From aph at openjdk.org Tue Apr 16 09:42:22 2024 From: aph at openjdk.org (Andrew Haley) Date: Tue, 16 Apr 2024 09:42:22 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v17] In-Reply-To: References: Message-ID: > This PR is a redesign of subtype checking. > > The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. > > So what's changed, so that the old design should be replaced? > > Firstly, the computers of today aren't the computers of twenty years ago. It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. > > The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces).
Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. > > Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch. > > However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions. > > The solution > ------------ > > We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. > > We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bitmap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. If the bit is set, we have to do a little arithmetic and then consult the hash table. > > It works like this: > > > mov sub_klass, [& sub_klass-...
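As a reading aid for the quoted description (whose assembly listing is truncated in this archive), the following is a small stand-alone sketch of a 64-slot table with linear probing, compressed into a packed array plus an occupancy bitmap. It is not the HotSpot implementation: `KlassId`, `SecondarySupers`, `home_slot`, `build`, and `is_secondary_super` are hypothetical names, the multiplicative hash is an arbitrary choice, and the sketch assumes at most 64 distinct supers.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a class pointer; a real VM would hash the class itself.
using KlassId = uint32_t;

struct SecondarySupers {
  uint64_t bitmap = 0;          // bit s set <=> slot s of the 64-slot table is occupied
  std::vector<KlassId> packed;  // occupied slots in slot order, empty slots removed
};

// Illustrative hash only (an assumption): multiplicative hash, top 6 bits.
inline unsigned home_slot(KlassId k) {
  return (k * 2654435761u) >> 26;  // always in 0..63
}

// The "little arithmetic": count occupied slots below s to find its packed index.
inline unsigned occupied_below(uint64_t bitmap, unsigned s) {
  uint64_t mask = (uint64_t{1} << s) - 1;  // s is in 0..63, so the shift is defined
  return static_cast<unsigned>(std::bitset<64>(bitmap & mask).count());
}

// Build the uncompressed table with linear probing, then compress it.
inline SecondarySupers build(const std::vector<KlassId>& supers) {
  assert(supers.size() <= 64);  // limitation of this sketch
  KlassId table[64] = {};
  SecondarySupers out;
  for (KlassId k : supers) {
    unsigned s = home_slot(k);
    while ((out.bitmap >> s) & 1) s = (s + 1) & 63;  // probe to the next free slot
    table[s] = k;
    out.bitmap |= uint64_t{1} << s;
  }
  for (unsigned s = 0; s < 64; ++s) {
    if ((out.bitmap >> s) & 1) out.packed.push_back(table[s]);
  }
  return out;
}

inline bool is_secondary_super(const SecondarySupers& t, KlassId k) {
  unsigned s = home_slot(k);
  for (unsigned probes = 0; probes < 64; ++probes, s = (s + 1) & 63) {
    if (((t.bitmap >> s) & 1) == 0) return false;  // clear bit: definitely absent
    if (t.packed[occupied_below(t.bitmap, s)] == k) return true;
  }
  return false;  // table completely full and no match
}
```

The common negative case costs one hash and one bit test, which is the Bloom-filter-like behaviour the description refers to; only when the bit is set does the lookup touch the packed array.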
Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: JDK-8180450: secondary_super_cache does not scale well ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18309/files - new: https://git.openjdk.org/jdk/pull/18309/files/c43e9c6a..a622c7dd Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18309&range=16 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18309&range=15-16 Stats: 3 lines in 3 files changed: 0 ins; 1 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/18309.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18309/head:pull/18309 PR: https://git.openjdk.org/jdk/pull/18309 From duke at openjdk.org Tue Apr 16 09:52:02 2024 From: duke at openjdk.org (Yuri Gaevsky) Date: Tue, 16 Apr 2024 09:52:02 GMT Subject: RFR: 8324124: RISC-V: implement _vectorizedMismatch intrinsic In-Reply-To: References: Message-ID: On Wed, 7 Feb 2024 14:35:55 GMT, Yuri Gaevsky wrote: > Hello All, > > Please review these changes to enable the __vectorizedMismatch_ intrinsic on RISC-V platform with RVV instructions supported. > > Thank you, > -Yuri Gaevsky > > **Correctness checks:** > hotspot/jtreg/compiler/{intrinsic/c1/c2}/ under QEMU-8.1 with RVV v1.0.0 and -XX:TieredStopAtLevel=1/2/3/4. Please keep it live, bot. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17750#issuecomment-2058688115 From luhenry at openjdk.org Tue Apr 16 09:52:59 2024 From: luhenry at openjdk.org (Ludovic Henry) Date: Tue, 16 Apr 2024 09:52:59 GMT Subject: RFR: 8330266: RISC-V: Restore frm to RoundingMode::rne after JNI [v2] In-Reply-To: References: <6e_QPv6LVVN19HkQrQ2DyB_sXxqGqgwnclI2StdkeaY=.0537b78c-c2c1-4341-a914-f40f7117d72e@github.com> <0Y-y04yv-vcYPW5lYbwQbCNEmuwva8vCqLLBEZw9bs8=.141dedbc-b2c7-40bf-ad08-744720dca8ec@github.com> <9LU3bHt5W2Fdr2dfnf8xJpPlgVN0yDTgI7Um3g8ymF4=.1363434f-190e-4615-ac59-bf2e7349f831@github.com> Message-ID: <_MmpvQxke00mSc2ekq84E10CiwoLY6j37YJIFWP4PeI=.8d938434-4cd7-4a17-9862-02fc10810ecc@github.com> On Tue, 16 Apr 2024 07:19:36 GMT, Vladimir Kempik wrote: >> It's done by following reasons: >> 1. by the optimization guide, https://riscv-optimization-guide.riseproject.dev/#_controlling_rounding_behavior_scalar. >> 2. aarch64 apply the similar optimization. >> >> Please also check discussion at: https://github.com/openjdk/jdk/pull/18758 > > Thanks for the links, however there are no performance claims in that discussion, only few "maybes" for "some hardware", can we check on existing h/w ( c910, u74) ? with jmh test doing dummy jni calls ? @VladimirKempik https://riscv-optimization-guide.riseproject.dev/#_controlling_rounding_behavior_scalar is written by multiple industry player who know how their hardware is going to behave, and it is confirmed that it will be more performant to have this `frrm`+`beq` than always doing `fsrm`. If you disagree with the wording in that guide, you're welcome to open a discussion on https://gitlab.com/riseproject/riscv-optimization-guide. 
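To make the "`frrm`+`beq` versus always `fsrm`" trade-off concrete, here is a portable C++ analogy using the standard `<cfenv>` functions. It is only an analogy and not the RISC-V code in the PR: `restore_round_to_nearest` is a hypothetical helper, `std::fegetround` stands in for reading `frm`, and `std::fesetround` stands in for writing it.

```cpp
#include <cfenv>

// Hypothetical helper: restore round-to-nearest after a call that may have
// changed the rounding mode. Returns true when a write was actually needed.
inline bool restore_round_to_nearest() {
  // Read-and-compare first (the analogue of frrm + beq) ...
  if (std::fegetround() == FE_TONEAREST) {
    return false;  // common case: nothing to restore, no write performed
  }
  // ... and only write the mode when it differs (the analogue of fsrm).
  std::fesetround(FE_TONEAREST);
  return true;
}
```

Whether skipping the write pays off depends on how expensive a rounding-mode write is on a given core, which is exactly the point under discussion in this thread.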
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18785#discussion_r1567073210 From jwaters at openjdk.org Tue Apr 16 09:55:02 2024 From: jwaters at openjdk.org (Julian Waters) Date: Tue, 16 Apr 2024 09:55:02 GMT Subject: RFR: 8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: <18WjPZeDIWkxGIB0BJgyDg5VipCtY4EOlWmIGPWZGCw=.b50cf4a9-61a4-421e-97eb-3dbac94c14f9@github.com> References: <-XeYeJ0OEmauTYsEoSXxzRmQXSKMOLw87GSpqDnEmug=.5cb7e71f-fea6-4a84-8260-5f515d3d3810@github.com> <18WjPZeDIWkxGIB0BJgyDg5VipCtY4EOlWmIGPWZGCw=.b50cf4a9-61a4-421e-97eb-3dbac94c14f9@github.com> Message-ID: On Tue, 16 Apr 2024 09:15:19 GMT, Magnus Ihse Bursie wrote: >> That was kind of where the discussion started, and which Kim did not like. If I read him correctly, his suggestion was instead to place: >> >> #if defined(_AIX) >> #include <alloca.h> >> #endif >> >> in the files where `alloca` is needed on AIX. > > (If some of these files happen to be files which are not compiled on Windows, I assume it will not hurt to drop the ifdef guard, but then again, it can certainly be kept as well for consistency.) Windows does use this file, in the unofficial Windows/gcc Port. That said, I am fairly sure the Windows distribution of gcc does recognise alloca.h.
It would be a little strange to unconditionally include this if only AIX needs it, though ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1567076923 From vkempik at openjdk.org Tue Apr 16 10:14:48 2024 From: vkempik at openjdk.org (Vladimir Kempik) Date: Tue, 16 Apr 2024 10:14:48 GMT Subject: RFR: 8330266: RISC-V: Restore frm to RoundingMode::rne after JNI [v2] In-Reply-To: <_MmpvQxke00mSc2ekq84E10CiwoLY6j37YJIFWP4PeI=.8d938434-4cd7-4a17-9862-02fc10810ecc@github.com> References: <6e_QPv6LVVN19HkQrQ2DyB_sXxqGqgwnclI2StdkeaY=.0537b78c-c2c1-4341-a914-f40f7117d72e@github.com> <0Y-y04yv-vcYPW5lYbwQbCNEmuwva8vCqLLBEZw9bs8=.141dedbc-b2c7-40bf-ad08-744720dca8ec@github.com> <9LU3bHt5W2Fdr2dfnf8xJpPlgVN0yDTgI7Um3g8ymF4=.1363434f-190e-4615-ac59-bf2e7349f831@github.com> <_MmpvQxke00mSc2ekq84E10CiwoLY6j37YJIFWP4PeI=.8d938434-4cd7-4a17-9862-02fc10810ecc@github.com> Message-ID: On Tue, 16 Apr 2024 09:48:48 GMT, Ludovic Henry wrote: >> Thanks for the links, however there are no performance claims in that discussion, only few "maybes" for "some hardware", can we check on existing h/w ( c910, u74) ? with jmh test doing dummy jni calls ? > > @VladimirKempik https://riscv-optimization-guide.riseproject.dev/#_controlling_rounding_behavior_scalar is written by multiple industry player who know how their hardware is going to behave, and it is confirmed that it will be more performant to have this `frrm`+`beq` than always doing `fsrm`. If you disagree with the wording in that guide, you're welcome to open a discussion on https://gitlab.com/riseproject/riscv-optimization-guide. @luhenry I've already saw a big criticism of that document. So it makes me take it with a grain of salt. It was written by a limited group of vendors ( not all of them participated) and basically shows u-arch peculiarities of limited hw group. 
I'm gonna check few version of this patch in jni_blank jmh microtest ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18785#discussion_r1567100036 From luhenry at openjdk.org Tue Apr 16 10:14:49 2024 From: luhenry at openjdk.org (Ludovic Henry) Date: Tue, 16 Apr 2024 10:14:49 GMT Subject: RFR: 8330266: RISC-V: Restore frm to RoundingMode::rne after JNI [v2] In-Reply-To: References: <6e_QPv6LVVN19HkQrQ2DyB_sXxqGqgwnclI2StdkeaY=.0537b78c-c2c1-4341-a914-f40f7117d72e@github.com> <0Y-y04yv-vcYPW5lYbwQbCNEmuwva8vCqLLBEZw9bs8=.141dedbc-b2c7-40bf-ad08-744720dca8ec@github.com> <9LU3bHt5W2Fdr2dfnf8xJpPlgVN0yDTgI7Um3g8ymF4=.1363434f-190e-4615-ac59-bf2e7349f831@github.com> <_MmpvQxke00mSc2ekq84E10CiwoLY6j37YJIFWP4PeI=.8d938434-4cd7-4a17-9862-02fc10810ecc@github.com> Message-ID: On Tue, 16 Apr 2024 10:09:27 GMT, Vladimir Kempik wrote: >> @VladimirKempik https://riscv-optimization-guide.riseproject.dev/#_controlling_rounding_behavior_scalar is written by multiple industry player who know how their hardware is going to behave, and it is confirmed that it will be more performant to have this `frrm`+`beq` than always doing `fsrm`. If you disagree with the wording in that guide, you're welcome to open a discussion on https://gitlab.com/riseproject/riscv-optimization-guide. > > @luhenry I've already saw a big criticism of that document. So it makes me take it with a grain of salt. It was written by a limited group of vendors ( not all of them participated) and basically shows u-arch peculiarities of limited hw group. > > I'm gonna check few version of this patch in jni_blank jmh microtest @VladimirKempik again, happy to get input from more people on that document! 
:) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18785#discussion_r1567103936 From duke at openjdk.org Tue Apr 16 10:35:46 2024 From: duke at openjdk.org (Mikhail Ablakatov) Date: Tue, 16 Apr 2024 10:35:46 GMT Subject: RFR: 8322770: Implement C2 VectorizedHashCode on AArch64 In-Reply-To: References: <2VKOC-rT0vOyMcXUX2gs3sOrbZ5H79KBIo50sOOVmyI=.1936f78e-794c-4f54-af3c-b1b97e5fafa8@github.com> Message-ID: On Tue, 16 Apr 2024 09:22:49 GMT, Andrew Haley wrote: > Why are you adding across lanes every time around the loop? You could maintain all of the lanes and then merge the lanes in the tail. @theRealAph , thank you for a suggestion. That's because current result (hash sum) has to be multiplied by 31^4 between iterations, where 4 is the number of elements handled per iteration. It's possible to multiply all lanes of `vmultiplication` register by 31^4 with `MUL (vector)` or `MUL (by element)` on each loop iteration and merge them just once in the end as you suggested though. I tried this approach before and it displays worse performance results on the benchmarks compared to the following sequence used in this PR:
```c++
addv(vmultiplication, Assembler::T4S, vmultiplication);
umov(addend, vmultiplication, Assembler::S, 0); // Sign-extension isn't necessary
maddw(result, result, pow4, addend);
```
I can re-check and post the performance numbers here per a request.
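The two strategies discussed in this exchange can be compared in a scalar model. The sketch below is not the Neon code from the PR: the function names are hypothetical, four lanes are modelled with a plain array, and `uint32_t` wraparound stands in for 32-bit register arithmetic. Both variants are meant to produce the same value as the straightforward `h = 31 * h + a[i]` loop.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr uint32_t kPow1 = 31u;
constexpr uint32_t kPow2 = kPow1 * 31u;
constexpr uint32_t kPow3 = kPow2 * 31u;
constexpr uint32_t kPow4 = kPow3 * 31u;  // 31^4, the per-iteration multiplier

// Reference: the plain polynomial hash, h = 31 * h + a[i].
inline uint32_t scalar_hash(uint32_t init, const std::vector<int32_t>& a) {
  uint32_t h = init;
  for (int32_t x : a) h = 31u * h + static_cast<uint32_t>(x);
  return h;
}

// Strategy used in the PR (modelled): reduce across lanes on every iteration,
// then fold with a single multiply-add, result = result * 31^4 + addend.
inline uint32_t hash_reduce_each_iteration(uint32_t init, const std::vector<int32_t>& a) {
  uint32_t h = init;
  std::size_t i = 0;
  for (; i + 4 <= a.size(); i += 4) {
    uint32_t addend = static_cast<uint32_t>(a[i]) * kPow3 +
                      static_cast<uint32_t>(a[i + 1]) * kPow2 +
                      static_cast<uint32_t>(a[i + 2]) * kPow1 +
                      static_cast<uint32_t>(a[i + 3]);
    h = h * kPow4 + addend;
  }
  for (; i < a.size(); ++i) h = 31u * h + static_cast<uint32_t>(a[i]);  // scalar tail
  return h;
}

// Suggested alternative (modelled): keep four independent lane accumulators,
// multiply each by 31^4 per iteration, and merge the lanes once at the end.
inline uint32_t hash_merge_lanes_at_end(uint32_t init, const std::vector<int32_t>& a) {
  uint32_t lane[4] = {0, 0, 0, init};  // seeding the last lane folds in the initial value
  std::size_t i = 0;
  for (; i + 4 <= a.size(); i += 4) {
    for (int j = 0; j < 4; ++j) {
      lane[j] = lane[j] * kPow4 + static_cast<uint32_t>(a[i + j]);
    }
  }
  uint32_t h = lane[0] * kPow3 + lane[1] * kPow2 + lane[2] * kPow1 + lane[3];
  for (; i < a.size(); ++i) h = 31u * h + static_cast<uint32_t>(a[i]);  // scalar tail
  return h;
}
```

The model only shows that the two formulations are arithmetically equivalent; which one is faster on real hardware is an empirical question, as the benchmark remarks in this thread indicate.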
------------- PR Comment: https://git.openjdk.org/jdk/pull/18487#issuecomment-2058767126 From vkempik at openjdk.org Tue Apr 16 10:51:59 2024 From: vkempik at openjdk.org (Vladimir Kempik) Date: Tue, 16 Apr 2024 10:51:59 GMT Subject: RFR: 8330266: RISC-V: Restore frm to RoundingMode::rne after JNI [v2] In-Reply-To: References: <6e_QPv6LVVN19HkQrQ2DyB_sXxqGqgwnclI2StdkeaY=.0537b78c-c2c1-4341-a914-f40f7117d72e@github.com> <0Y-y04yv-vcYPW5lYbwQbCNEmuwva8vCqLLBEZw9bs8=.141dedbc-b2c7-40bf-ad08-744720dca8ec@github.com> <9LU3bHt5W2Fdr2dfnf8xJpPlgVN0yDTgI7Um3g8ymF4=.1363434f-190e-4615-ac59-bf2e7349f831@github.com> <_MmpvQxke00mSc2ekq84E10CiwoLY6j37YJIFWP4PeI=.8d938434-4cd7-4a17-9862-02fc10810ecc@github.com> Message-ID: On Tue, 16 Apr 2024 10:12:36 GMT, Ludovic Henry wrote: >> @luhenry I've already saw a big criticism of that document. So it makes me take it with a grain of salt. It was written by a limited group of vendors ( not all of them participated) and basically shows u-arch peculiarities of limited hw group. >> >> I'm gonna check few version of this patch in jni_blank jmh microtest > > @VladimirKempik again, happy to get input from more people on that document! :) Here are results from c910 ( licheePi4a). using:
a) jni_blank as is
b) modified jni_blank where native func() does this:
int x = 2;
asm ("csrw fcsr,%0\n\t" : : "r" (x) );
branchful - exactly this PR
branchless - this PR without csrr&beq
results:
no fcsr change in native code
branchless
Benchmark                       Mode  Cnt    Score   Error  Units
CallOverheadConstant.jni_blank  avgt   30  133.586 ± 1.431  ns/op
CallOverheadVirtual.jni_blank   avgt   30  131.715 ± 0.570  ns/op
branchful
Benchmark                       Mode  Cnt    Score   Error  Units
CallOverheadConstant.jni_blank  avgt   30  133.376 ± 1.491  ns/op
CallOverheadVirtual.jni_blank   avgt   30  133.560 ± 1.782  ns/op
fcsr changed to rdn in native code
branchless
Benchmark                       Mode  Cnt    Score   Error  Units
CallOverheadConstant.jni_blank  avgt   30  153.708 ± 1.191  ns/op
CallOverheadVirtual.jni_blank   avgt   30  150.653 ± 1.617  ns/op
branchful
Benchmark                       Mode  Cnt    Score   Error  Units
CallOverheadConstant.jni_blank  avgt   30  153.595 ± 0.759  ns/op
CallOverheadVirtual.jni_blank   avgt   30  149.992 ± 1.605  ns/op
Basically there are not difference here ( thanks to BranchPredictor), so why would you make code more complex (and require additional registers) ? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18785#discussion_r1567141957 From luhenry at openjdk.org Tue Apr 16 10:56:00 2024 From: luhenry at openjdk.org (Ludovic Henry) Date: Tue, 16 Apr 2024 10:56:00 GMT Subject: RFR: 8330266: RISC-V: Restore frm to RoundingMode::rne after JNI [v2] In-Reply-To: References: <6e_QPv6LVVN19HkQrQ2DyB_sXxqGqgwnclI2StdkeaY=.0537b78c-c2c1-4341-a914-f40f7117d72e@github.com> <0Y-y04yv-vcYPW5lYbwQbCNEmuwva8vCqLLBEZw9bs8=.141dedbc-b2c7-40bf-ad08-744720dca8ec@github.com> <9LU3bHt5W2Fdr2dfnf8xJpPlgVN0yDTgI7Um3g8ymF4=.1363434f-190e-4615-ac59-bf2e7349f831@github.com> <_MmpvQxke00mSc2ekq84E10CiwoLY6j37YJIFWP4PeI=.8d938434-4cd7-4a17-9862-02fc10810ecc@github.com> Message-ID: <_c-b9PZXvD0GM3Oamxgc7BBTarOS19Dubq_y9GYJb9M=.90899a76-dafc-44e7-a0ed-7ad7368e9a77@github.com> On Tue, 16 Apr 2024 10:45:47 GMT, Vladimir Kempik wrote: >> @VladimirKempik again, happy to get input from more people on that document! :) > > Here are results from c910 ( licheePi4a). using:
> a) jni_blank as is
> b) modified jni_blank where native func() does this:
>
>
> int x = 2;
> asm
> ("csrw fcsr,%0\n\t"
> :
> : "r" (x)
> );
>
> branchful - exactly this PR
> branchless - this PR without csrr&beq
>
> results:
>
> no fcsr change in native code
>
>
> branchless
> Benchmark                       Mode  Cnt    Score   Error  Units
> CallOverheadConstant.jni_blank  avgt   30  133.586 ± 1.431  ns/op
> CallOverheadVirtual.jni_blank   avgt   30  131.715 ± 0.570  ns/op
>
>
> branchful
> Benchmark                       Mode  Cnt    Score   Error  Units
> CallOverheadConstant.jni_blank  avgt   30  133.376 ± 1.491  ns/op
> CallOverheadVirtual.jni_blank avgt 30 133.560 ±
1.782 ns/op > > > fcsr changed to rdn in native code > > > > branchless > Benchmark Mode Cnt Score Error Units > CallOverheadConstant.jni_blank avgt 30 153.708 ± 1.191 ns/op > CallOverheadVirtual.jni_blank avgt 30 150.653 ± 1.617 ns/op > > branchful > Benchmark Mode Cnt Score Error Units > CallOverheadConstant.jni_blank avgt 30 153.595 ± 0.759 ns/op > CallOverheadVirtual.jni_blank avgt 30 149.992 ± 1.605 ns/op > > > Basically there is no difference here ( thanks to BranchPredictor), so why would you make code more complex (and require additional registers) ? Great to see there is no performance degradation on the current generation of hardware. That's a great motivator to get this change in with the branch as we already know that the branch is going to bring a performance improvement on next/other generations of hardware. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18785#discussion_r1567150600 From aph at openjdk.org Tue Apr 16 10:58:04 2024 From: aph at openjdk.org (Andrew Haley) Date: Tue, 16 Apr 2024 10:58:04 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v16] In-Reply-To: <2ReTrE0inVkfcPNrq6JVrGRkoFuOZsLK6Ir0ZAnd_Kk=.13903f65-b747-4c38-9572-91e132ebd424@github.com> References: <2ReTrE0inVkfcPNrq6JVrGRkoFuOZsLK6Ir0ZAnd_Kk=.13903f65-b747-4c38-9572-91e132ebd424@github.com> Message-ID: On Mon, 15 Apr 2024 20:58:44 GMT, Vladimir Ivanov wrote: > Performance testing results look fine. I wonder, could you do me a little favour? Please run the performance tests with `-XX:-UseSecondarySuperCache`. Thanks.
------------- PR Comment: https://git.openjdk.org/jdk/pull/18309#issuecomment-2058808288 From vkempik at openjdk.org Tue Apr 16 11:21:59 2024 From: vkempik at openjdk.org (Vladimir Kempik) Date: Tue, 16 Apr 2024 11:21:59 GMT Subject: RFR: 8330266: RISC-V: Restore frm to RoundingMode::rne after JNI [v2] In-Reply-To: <_c-b9PZXvD0GM3Oamxgc7BBTarOS19Dubq_y9GYJb9M=.90899a76-dafc-44e7-a0ed-7ad7368e9a77@github.com> References: <6e_QPv6LVVN19HkQrQ2DyB_sXxqGqgwnclI2StdkeaY=.0537b78c-c2c1-4341-a914-f40f7117d72e@github.com> <0Y-y04yv-vcYPW5lYbwQbCNEmuwva8vCqLLBEZw9bs8=.141dedbc-b2c7-40bf-ad08-744720dca8ec@github.com> <9LU3bHt5W2Fdr2dfnf8xJpPlgVN0yDTgI7Um3g8ymF4=.1363434f-190e-4615-ac59-bf2e7349f831@github.com> <_MmpvQxke00mSc2ekq84E10CiwoLY6j37YJIFWP4PeI=.8d938434-4cd7-4a17-9862-02fc10810ecc@github.com> <_c-b9PZXvD0GM3Oamxgc7BBTarOS19Dubq_y9GYJb9M=.90899a76-dafc-44e7-a0ed-7ad7368e9a77@github.com> Message-ID: <_7lj0atNT-yDCUZE15g6TqGzQK8WrF9e2Vsn9AX-c_E=.041d78ba-74d3-4b52-8176-c4ad5eaaeed5@github.com> On Tue, 16 Apr 2024 10:52:58 GMT, Ludovic Henry wrote: >> Here are results from c910 ( licheePi4a). using: >> a) jni_blank as is >> b) modified jni_blank where native func() does this: >> >> >> int x = 2; >> asm >> ("csrw fcsr,%0\n\t" >> : >> : "r" (x) >> ); >> >> branchful - exactly this PR >> branchless - this PR without csrr&beq >> >> results: >> >> no fcsr change in native code >> >> >> branchless >> Benchmark Mode Cnt Score Error Units >> CallOverheadConstant.jni_blank avgt 30 133.586 ± 1.431 ns/op >> CallOverheadVirtual.jni_blank avgt 30 131.715 ± 0.570 ns/op >> >> >> branchful >> Benchmark Mode Cnt Score Error Units >> CallOverheadConstant.jni_blank avgt 30 133.376 ± 1.491 ns/op >> CallOverheadVirtual.jni_blank avgt 30 133.560 ± 1.782 ns/op >> >> >> fcsr changed to rdn in native code >> >> >> >> branchless >> Benchmark Mode Cnt Score Error Units >> CallOverheadConstant.jni_blank avgt 30 153.708 ± 1.191 ns/op >> CallOverheadVirtual.jni_blank avgt 30 150.653 ±
1.617 ns/op >> >> branchful >> Benchmark Mode Cnt Score Error Units >> CallOverheadConstant.jni_blank avgt 30 153.595 ± 0.759 ns/op >> CallOverheadVirtual.jni_blank avgt 30 149.992 ± 1.605 ns/op >> >> >> Basically there is no difference here ( thanks to BranchPredictor), so why would you make code more complex (and require additional registers) ? > > Great to see there is no performance degradation on the current generation of hardware. That's a great motivator to get this change in with the branch as we already know that the branch is going to bring a performance improvement on next/other generations of hardware. Even when showing middle finger to branch predictor: unsigned char prbarray[512]; static int setupme() { srand(time(NULL)); for(int i = 0; i < 512; i++) { int r = rand()%15; prbarray[i]= (r > 10) ? (r-10) : 0; } return 1; } EXPORT void func() { static int setup = 0; static unsigned long counter = 0; if(!setup) setup = setupme(); int x = prbarray[counter++%512]; asm ("csrw fcsr,%0\n\t" : : "r" (x) ); } results are still equal between branchless and branchful versions I'm ok with this change, but let's wait for yet another opinion: do we accept such "future proof" changes or not ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18785#discussion_r1567180887 From azafari at openjdk.org Tue Apr 16 11:32:01 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Tue, 16 Apr 2024 11:32:01 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v7] In-Reply-To: <7TW9a7Vmnz0nIKq83rYx_VN13PXM9_9nD5iSMzGDfNw=.127fd0ff-ee60-40cf-9994-9a1e81bb5b27@github.com> References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> <7TW9a7Vmnz0nIKq83rYx_VN13PXM9_9nD5iSMzGDfNw=.127fd0ff-ee60-40cf-9994-9a1e81bb5b27@github.com> Message-ID: On Mon, 15 Apr 2024 16:11:13 GMT, Afshin Zafari wrote: >> `MEMFLAGS flag` is used to hold/show the type of the
memory regions in NMT. Each call of NMT API requires a search through the list of memory regions. >> The Hotspot code reserves/commits/uncommits memory regions and later calls explicitly NMT API with a specific memory type (e.g., `mtGC`, `mtJavaHeap`) for that region. Therefore, there are two search in the list of regions per reserve/commit/uncommit operations, one for the operation and another for setting the type of the region. >> When the memory type is passed in during reserve/commit/uncommit operations, NMT can use it and avoid the extra search for setting the memory type. >> >> Tests: tiers1-5 passed on linux-x64, macosx-aarch64 and windows-x64 for debug and non-debug builds. > > Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: > > alignment in coding style changed. Tested `runtime/NMT/VirtualAlloc*.java` tests for master and PR branches. All `mmap` values are the same. Ready for next round of review. Ping @tstuefe, @stefank and @jdksjolen. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18745#issuecomment-2058865629 From aph at openjdk.org Tue Apr 16 12:30:42 2024 From: aph at openjdk.org (Andrew Haley) Date: Tue, 16 Apr 2024 12:30:42 GMT Subject: RFR: 8322770: Implement C2 VectorizedHashCode on AArch64 In-Reply-To: References: <2VKOC-rT0vOyMcXUX2gs3sOrbZ5H79KBIo50sOOVmyI=.1936f78e-794c-4f54-af3c-b1b97e5fafa8@github.com> Message-ID: On Tue, 16 Apr 2024 10:32:35 GMT, Mikhail Ablakatov wrote: > I can re-check and post the performance numbers here per a request. Please do. Please also post the code. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18487#issuecomment-2058972370 From yzheng at openjdk.org Tue Apr 16 12:40:22 2024 From: yzheng at openjdk.org (Yudi Zheng) Date: Tue, 16 Apr 2024 12:40:22 GMT Subject: RFR: 8330280: SharedRuntime::get_resolved_entry should not return c2i entry if the callee is special native intrinsic Message-ID: In https://github.com/openjdk/jdk/pull/18741 we return c2i entry for threads with interp_only_mode. This can be problematic for method handle intrinsics and continuation intrinsics, which cannot be interpreted. Consequently, we will cascade the c2i entry with an i2c entry and fail the runtime. The solution is to not return c2i entry under such circumstance. ------------- Commit messages: - SharedRuntime::get_resolved_entry should not return c2i entry if the callee is special native intrinsic. Changes: https://git.openjdk.org/jdk/pull/18799/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18799&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8330280 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18799.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18799/head:pull/18799 PR: https://git.openjdk.org/jdk/pull/18799 From duke at openjdk.org Tue Apr 16 12:57:00 2024 From: duke at openjdk.org (kuaiwei) Date: Tue, 16 Apr 2024 12:57:00 GMT Subject: RFR: 8325821: [REDO] use "dmb.ishst+dmb.ishld" for release barrier [v7] In-Reply-To: References: Message-ID: On Tue, 16 Apr 2024 09:10:38 GMT, Andrew Haley wrote: > > > Hi, I guess this isn't quite ready for review let. I'll have another look whan it is. > > > > > > Is there any other gap I'm not aware? > > Well, you're asking me to speculate on what you're aware of, but the very first thing I see when I run "java -version" with this patch is this, so I assume you're not finished. 
> > ``` > 0x0000ffffe8ad2750: str w11, [x0, #0x14] ;*invokespecial {reexecute=0 rethrow=0 return_oop=0} > ; - java.lang.StringLatin1::replace at 123 (line 427) > ;; membar_release > 0x0000ffffe8ad2754: dmb ishld > 0x0000ffffe8ad2758: dmb ishst ;*new {reexecute=0 rethrow=0 return_oop=0} > ; - java.lang.StringLatin1::replace at 116 (line 427) > ;; membar_release > 0x0000ffffe8ad275c: dmb ishld > 0x0000ffffe8ad2760: dmb ishst ;*synchronization entry > ; - java.lang.StringLatin1::replace at -1 (line 408) > ``` I verified and not reproduced the error. Last week I rebase the patch to resolve a conflict with master branch. I think the error may be caused by apply the patch to old base. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18467#issuecomment-2059024841 From jsjolen at openjdk.org Tue Apr 16 14:04:20 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Tue, 16 Apr 2024 14:04:20 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v33] In-Reply-To: References: Message-ID: <5zyWfjezRXl0ArC3ftNyPJ071gpl_8t_7Tva-HHqCpo=.b7e67b98-7923-4d22-94eb-3ca40a2a4e5f@github.com> > Hi, > > This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix. 
> > ## `MemoryFileTracker` > > The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: > > ```c++ > static MemoryFile* make_device(const char* descriptive_name); > static void free_device(MemoryFile* device); > > static void allocate_memory(MemoryFile* device, size_t offset, size_t size, > MEMFLAGS flag, const NativeCallStack& stack); > static void free_memory(MemoryFile* device, size_t offset, size_t size); > ``` > > It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: > > ```c++ > void ZNMT::reserve(zaddress_unsafe start, size_t size) { > MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); > } > void ZNMT::commit(zoffset offset, size_t size) { > MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC); > } > void ZNMT::uncommit(zoffset offset, size_t size) { > MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); > } > > void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { > // NMT doesn't track mappings at the moment. > } > void ZNMT::unmap(zaddress_unsafe addr, size_t size) { > // NMT doesn't track mappings at the moment. > } > ``` > > As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. > > This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: > > 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it.
Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this... Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: Faulty assignment. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18289/files - new: https://git.openjdk.org/jdk/pull/18289/files/5a3e6dd6..3a7c4ac1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=32 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=31-32 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18289.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289 PR: https://git.openjdk.org/jdk/pull/18289 From aph at openjdk.org Tue Apr 16 14:08:42 2024 From: aph at openjdk.org (Andrew Haley) Date: Tue, 16 Apr 2024 14:08:42 GMT Subject: RFR: 8325821: [REDO] use "dmb.ishst+dmb.ishld" for release barrier [v7] In-Reply-To: References: Message-ID: On Fri, 12 Apr 2024 08:00:20 GMT, kuaiwei wrote: >> The origin patch for https://bugs.openjdk.org/browse/JDK-8324186 has 2 issues: >> 1 It show regression in some platform, like Apple silicon in mac os >> 2 Can not handle instruction sequence like "dmb.ishld; dmb.ishst; dmb.ishld; dmb.ishld" >> >> It can be fixed by: >> 1 Enable AlwaysMergeDMB by default, only disable it in architecture we can see performance improvement (N1 or N2) >> 2 Check the special pattern and merge the subsequent dmb. >> >> It also fix a bug when code buffer is expanding, st/ld/dmb can not be merged. I added unit tests for these. >> >> This patch still has a unhandled case. Insts like "dmb.ishld; dmb.ishst; dmb.ish", it will merge the last 2 instructions and can not merge all three. Because when emitting dmb.ish, if merge all previous dmbs, the code buffer will shrink the size. I think it may break some resumption and think it's not a common pattern.
>> >> - Update: >> After discussion, I made a new implementation based on finite state machine for merging instruction. The mergeable instruction will be pending in fsm until next unmergeable instruction. > > kuaiwei has updated the pull request incrementally with one additional commit since the last revision: > > Fix arm build error Argh, I found it. It happens because C2 calls `masm->offset()` from `PhaseOutput::fill_buffer()` after every node is emitted. So that trick isn't going to work. It was worth a try, but given that C2 expects offset() to be correct after every node, I think we're stuck. Maybe the last idea you had is the best possible without C2 tinkering. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18467#issuecomment-2059182775 From pchilanomate at openjdk.org Tue Apr 16 14:10:04 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 16 Apr 2024 14:10:04 GMT Subject: RFR: 8329665: fatal error: memory leak: allocating without ResourceMark [v2] In-Reply-To: References: Message-ID: On Mon, 15 Apr 2024 14:53:12 GMT, Patricio Chilano Mateo wrote: >> There are two places in Loom code that call f.oops_interpreted_do() to process oops in the stackChunk. Although not obvious this method seem to require to have a ResourceMark on scope and there are several contexts where these two are call where we don't have one. The reason why a ResourceMark is needed is because OopMapCache::compute_one_oop_map() might allocate from the resource area if _mask_size is > 4 * BitsPerWord, which depends on the amount of locals + expression stack of the corresponding method. But ~InterpreterOopMap already checks if the _bit_mask was allocated in the resource area and in that case it will free it. So the ResourceMark is not strictly needed except that in debug mode we will actually hit the assert if there is not one in scope when trying to allocate the _bit_mask. 
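To make the allocation pattern behind the `_bit_mask` discussion concrete, here is a minimal sketch, assuming a mask that lives inline in the object while it is small and is allocated separately once it exceeds 4 words. `SmallBitMask` and its members are hypothetical names, not HotSpot's `InterpreterOopMap`, and the C++ heap stands in for the resource area.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical stand-in for a bit mask with small-size inline storage.
// This is NOT HotSpot code; it only illustrates the size threshold.
class SmallBitMask {
 public:
  static constexpr std::size_t kBitsPerWord = 64;
  static constexpr std::size_t kInlineWords = 4;
  static constexpr std::size_t kInlineBits = kInlineWords * kBitsPerWord;

  explicit SmallBitMask(std::size_t bits) : _bits(bits) {
    if (bits > kInlineBits) {
      // Large masks need separately allocated storage. HotSpot would take
      // this from the resource area; the sketch uses the C++ heap instead.
      std::size_t words = (bits + kBitsPerWord - 1) / kBitsPerWord;
      _heap = std::make_unique<std::uint64_t[]>(words);  // zero-initialized
    }
  }

  // True when no separate allocation was needed.
  bool uses_inline_storage() const { return _heap == nullptr; }

  void set(std::size_t i) {
    assert(i < _bits);
    data()[i / kBitsPerWord] |= std::uint64_t{1} << (i % kBitsPerWord);
  }

  bool test(std::size_t i) const {
    assert(i < _bits);
    return ((data()[i / kBitsPerWord] >> (i % kBitsPerWord)) & 1u) != 0;
  }

 private:
  std::uint64_t* data() { return _heap ? _heap.get() : _inline; }
  const std::uint64_t* data() const { return _heap ? _heap.get() : _inline; }

  std::size_t _bits;
  std::uint64_t _inline[kInlineWords] = {};
  std::unique_ptr<std::uint64_t[]> _heap;  // released by the destructor
};
```

In this sketch `SmallBitMask(256)` stays inline while `SmallBitMask(257)` allocates. That mirrors why the allocation, and with it the debug-mode assert, only shows up for methods with many locals and expression stack slots.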
>> >> Thanks, >> Patricio > > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > take ResourceMark out of debug only Alright, thanks all for the reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/18632#issuecomment-2059185955 From pchilanomate at openjdk.org Tue Apr 16 14:10:05 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 16 Apr 2024 14:10:05 GMT Subject: RFR: 8329665: fatal error: memory leak: allocating without ResourceMark [v2] In-Reply-To: References: Message-ID: <3jIJIhmkBI-S-LVazKYmJY02nlHSISOMGSaL2A5Fv78=.09c89b7a-1a08-4ffc-8162-a28187711b77@github.com> On Mon, 15 Apr 2024 18:43:35 GMT, Aleksey Shipilev wrote: >> In theory yes, although I doubt it is an actual issue because allocating in the resource area for this _bit_mask field is a rare case, and the memory allocated will be a few bytes. But I guess we can keep the eager cleanup just in case since it doesn't hurt. > > All right, your call. FWIW, this opportunistic cleanup is ugly, and I am happy to see it go. Agree. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18632#discussion_r1567429383 From pchilanomate at openjdk.org Tue Apr 16 14:13:06 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 16 Apr 2024 14:13:06 GMT Subject: Integrated: 8329665: fatal error: memory leak: allocating without ResourceMark In-Reply-To: References: Message-ID: On Thu, 4 Apr 2024 16:23:50 GMT, Patricio Chilano Mateo wrote: > There are two places in Loom code that call f.oops_interpreted_do() to process oops in the stackChunk. Although not obvious this method seem to require to have a ResourceMark on scope and there are several contexts where these two are call where we don't have one. 
The reason why a ResourceMark is needed is because OopMapCache::compute_one_oop_map() might allocate from the resource area if _mask_size is > 4 * BitsPerWord, which depends on the amount of locals + expression stack of the corresponding method. But ~InterpreterOopMap already checks if the _bit_mask was allocated in the resource area and in that case it will free it. So the ResourceMark is not strictly needed except that in debug mode we will actually hit the assert if there is not one in scope when trying to allocate the _bit_mask. > > Thanks, > Patricio This pull request has now been integrated. Changeset: e073d5b3 Author: Patricio Chilano Mateo URL: https://git.openjdk.org/jdk/commit/e073d5b37422c2adad18db520c5f4fcf120c147b Stats: 14 lines in 3 files changed: 1 ins; 13 del; 0 mod 8329665: fatal error: memory leak: allocating without ResourceMark Reviewed-by: dholmes, shade, coleenp ------------- PR: https://git.openjdk.org/jdk/pull/18632 From jsjolen at openjdk.org Tue Apr 16 14:19:25 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Tue, 16 Apr 2024 14:19:25 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v34] In-Reply-To: References: Message-ID: > Hi, > > This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix. 
> > ## `MemoryFileTracker` > > The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: > > ```c++ > static MemoryFile* make_device(const char* descriptive_name); > static void free_device(MemoryFile* device); > > static void allocate_memory(MemoryFile* device, size_t offset, size_t size, > MEMFLAGS flag, const NativeCallStack& stack); > static void free_memory(MemoryFile* device, size_t offset, size_t size); > ``` > > It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: > > ```c++ > void ZNMT::reserve(zaddress_unsafe start, size_t size) { > MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); > } > void ZNMT::commit(zoffset offset, size_t size) { > MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC); > } > void ZNMT::uncommit(zoffset offset, size_t size) { > MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); > } > > void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { > // NMT doesn't track mappings at the moment. > } > void ZNMT::unmap(zaddress_unsafe addr, size_t size) { > // NMT doesn't track mappings at the moment. > } > ``` > > As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. > > This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: > > 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it.
Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this... Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: Style, copyright ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18289/files - new: https://git.openjdk.org/jdk/pull/18289/files/3a7c4ac1..2707ee86 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=33 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=32-33 Stats: 116 lines in 6 files changed: 83 ins; 32 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18289.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289 PR: https://git.openjdk.org/jdk/pull/18289 From aph at openjdk.org Tue Apr 16 14:25:12 2024 From: aph at openjdk.org (Andrew Haley) Date: Tue, 16 Apr 2024 14:25:12 GMT Subject: Integrated: 8180450: secondary_super_cache does not scale well In-Reply-To: References: Message-ID: <5_LAyQ4rLr4yoxUgMTnmTThf_smIuslTW6YfzNq1zis=.ae8472be-6e7d-4dc5-a84f-9adb72d595d1@github.com> On Thu, 14 Mar 2024 18:24:11 GMT, Andrew Haley wrote: > This PR is a redesign of subtype checking. > > The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. > > So what's changed, so that the old design should be replaced? > > Firstly, the computers of today aren't the computers of twenty years ago. It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. > > The most severe reported problem is to do with the "secondary supers cache".
This is a 1-element per-class cache for interfaces (and arrays of interfaces). Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. > > Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch. > > However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions. > > The solution > ------------ > > We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. > > We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bitmap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. If the bit is set, we have to do a little arithmetic and then consult the hash table. > > It works like this: > > > mov sub_klass, [& sub_klass-... This pull request has now been integrated.
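The compressed hash table plus occupancy bitmap described above can be sketched in portable C++. This is a minimal illustration under stated assumptions, not the HotSpot implementation: `HashedSupers`, `home_slot`, `build` and `contains` are hypothetical names, types are represented by plain integer ids, and the hash function is an arbitrary placeholder. The "little arithmetic" is assumed here to be a population count of the bitmap bits below the probed slot, which gives the entry's index in the packed array.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr unsigned kSlots = 64;  // 64-way table, one bitmap bit per slot

// Placeholder hash (an assumption): maps an id to a home slot in [0, 64).
inline unsigned home_slot(std::uint32_t id) {
  return static_cast<unsigned>((id * 2654435761u) >> 26);  // top 6 bits
}

// Portable population count, to avoid compiler-specific builtins.
inline unsigned popcount64(std::uint64_t x) {
  unsigned n = 0;
  while (x != 0) { x &= x - 1; ++n; }
  return n;
}

struct HashedSupers {
  std::uint64_t bitmap = 0;            // bit s set => slot s is occupied
  std::vector<std::uint32_t> packed;   // occupied slots only, in slot order
};

// Builds the table with linear probing, then drops the empty slots.
// The sketch assumes at most 64 distinct ids.
HashedSupers build(const std::vector<std::uint32_t>& ids) {
  assert(ids.size() <= kSlots);
  std::array<std::uint32_t, kSlots> expanded{};
  HashedSupers h;
  for (std::uint32_t id : ids) {
    unsigned s = home_slot(id);
    while (((h.bitmap >> s) & 1u) != 0) s = (s + 1) % kSlots;  // probe
    expanded[s] = id;
    h.bitmap |= std::uint64_t{1} << s;
  }
  for (unsigned s = 0; s < kSlots; ++s) {
    if (((h.bitmap >> s) & 1u) != 0) h.packed.push_back(expanded[s]);
  }
  return h;
}

bool contains(const HashedSupers& h, std::uint32_t id) {
  unsigned s = home_slot(id);
  // Filter step: a clear bit at the home slot proves absence.
  if (((h.bitmap >> s) & 1u) == 0) return false;
  // Packed index of slot s = number of occupied slots below s.
  std::size_t i = popcount64(h.bitmap & ((std::uint64_t{1} << s) - 1));
  for (unsigned probes = 0; probes < kSlots; ++probes) {
    if (h.packed[i] == id) return true;
    s = (s + 1) % kSlots;
    if (((h.bitmap >> s) & 1u) == 0) return false;  // end of the probe run
    i = (i + 1) % h.packed.size();  // next occupied slot is next packed entry
  }
  return false;  // table is full and id is not in it
}
```

Because `packed` holds exactly the same elements as an unsorted list would, code that scans it linearly keeps working, which is the compatibility property the description calls out.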
Changeset: f11a496d Author: Andrew Haley URL: https://git.openjdk.org/jdk/commit/f11a496de61d800a680517457eb43b078a633953 Stats: 2227 lines in 40 files changed: 2179 ins; 18 del; 30 mod 8180450: secondary_super_cache does not scale well Co-authored-by: Vladimir Ivanov Reviewed-by: kvn, vlivanov, dlong ------------- PR: https://git.openjdk.org/jdk/pull/18309 From pchilanomate at openjdk.org Tue Apr 16 14:25:41 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 16 Apr 2024 14:25:41 GMT Subject: RFR: 8330280: SharedRuntime::get_resolved_entry should not return c2i entry if the callee is special native intrinsic In-Reply-To: References: Message-ID: On Tue, 16 Apr 2024 12:35:42 GMT, Yudi Zheng wrote: > In https://github.com/openjdk/jdk/pull/18741 we return c2i entry for threads with interp_only_mode. This can be problematic for method handle intrinsics and continuation intrinsics, which cannot be interpreted. Consequently, we will cascade the c2i entry with an i2c entry and fail the runtime. The solution is to not return c2i entry under such circumstance. Looks good to me, thanks Yudi. ------------- Marked as reviewed by pchilanomate (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18799#pullrequestreview-2003807531 From coleenp at openjdk.org Tue Apr 16 14:30:47 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 16 Apr 2024 14:30:47 GMT Subject: RFR: 8329665: fatal error: memory leak: allocating without ResourceMark [v2] In-Reply-To: <3jIJIhmkBI-S-LVazKYmJY02nlHSISOMGSaL2A5Fv78=.09c89b7a-1a08-4ffc-8162-a28187711b77@github.com> References: <3jIJIhmkBI-S-LVazKYmJY02nlHSISOMGSaL2A5Fv78=.09c89b7a-1a08-4ffc-8162-a28187711b77@github.com> Message-ID: On Tue, 16 Apr 2024 14:06:37 GMT, Patricio Chilano Mateo wrote: >> All right, your call. FWIW, this opportunistic cleanup is ugly, and I am happy to see it go. > > Agree. me too. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18632#discussion_r1567464875 From jkratochvil at openjdk.org Tue Apr 16 14:44:07 2024 From: jkratochvil at openjdk.org (Jan Kratochvil) Date: Tue, 16 Apr 2024 14:44:07 GMT Subject: RFR: 8261242: [Linux] OSContainer::is_containerized() returns true when run outside a container [v2] In-Reply-To: References: Message-ID: On Thu, 11 Apr 2024 12:08:02 GMT, Severin Gehwolf wrote: >> Please review this enhancement to the container detection code which allows it to figure out whether the JVM is actually running inside a container (`podman`, `docker`, `crio`), or with some other means that enforces memory/cpu limits by means of the cgroup filesystem. If neither of those conditions hold, the JVM runs in not containerized mode, addressing the issue described in the JBS tracker. For example, on my Linux system `is_containerized() == false` is being indicated with the following trace log line: >> >> >> [0.001s][debug][os,container] OSContainer::init: is_containerized() = false because no cpu or memory limit is present >> >> >> This state is being exposed by the Java `Metrics` API class using the new (still JDK internal) `isContainerized()` method. Example: >> >> >> java -XshowSettings:system --version >> Operating System Metrics: >> Provider: cgroupv1 >> System not containerized. >> openjdk 23-internal 2024-09-17 >> OpenJDK Runtime Environment (fastdebug build 23-internal-adhoc.sgehwolf.jdk-jdk) >> OpenJDK 64-Bit Server VM (fastdebug build 23-internal-adhoc.sgehwolf.jdk-jdk, mixed mode, sharing) >> >> >> The basic property this is being built on is the observation that the cgroup controllers typically get mounted read only into containers. Note that the current container tests assert that `OSContainer::is_containerized() == true` in various tests. Therefore, using the heuristic of "is any memory or cpu limit present" isn't sufficient. I had considered that in an earlier iteration, but many container tests failed.
>> >> Overall, I think, with this patch we improve the current situation of claiming a containerized system being present when it's actually just a regular Linux system. >> >> Testing: >> >> - [x] GHA (risc-v failure seems infra related) >> - [x] Container tests on Linux x86_64 of cgroups v1 and cgroups v2 (including gtests) >> - [x] Some manual testing using cri-o >> >> Thoughts? > > Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains ten additional commits since the last revision: > > - Merge branch 'master' into jdk-8261242-is-containerized-fix > - jcheck fixes > - Fix tests > - Implement Metrics.isContainerized() > - Some clean-up > - Drop cgroups testing on plain Linux > - Implement fall-back logic for non-ro controller mounts > - Make find_ro static and local to compilation unit > - 8261242: [Linux] OSContainer::is_containerized() returns true IMHO `is_containerized()` is OK to return `false` even when running in a container but with no limitations set. Container detection is IIUC/AFAIK being used to maximize resource usage by OpenJDK. But if OpenJDK runs in a container with the same limits as the hardware box OpenJDK should still use reduced resources as it is sharing them with other processes on the hardware box. 
[is-containerized.patch.txt](https://github.com/openjdk/jdk/files/14998503/is-containerized.patch.txt) ------------- PR Comment: https://git.openjdk.org/jdk/pull/18201#issuecomment-2059261807 From pchilanomate at openjdk.org Tue Apr 16 14:59:16 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 16 Apr 2024 14:59:16 GMT Subject: RFR: 8325469: Freeze/Thaw code can crash in the presence of OSR frames [v4] In-Reply-To: References: Message-ID: <6BON-wIAQT8w_gf2PcmQQ9QPjjskmczdK-1JNRXnGHA=.1b4c9c98-9298-4fd1-acec-c10a5294fc54@github.com> > Freeze/thaw code assumes that a compiled frame for a method where num_stack_arg_slots() > 0 will always have the arguments setup above the metadata at the bottom of the frame. But when converting an interpreter frame to a compiled frame during OSR we don't explicitly leave room for the stack arguments after popping the interpreter frame. All parameters needed will be read from the "buf" array and stored inside the frame before calling OSR_migration_end(). > > This mismatch in how the stack looks and what we assume can lead to different crashes. In particular the issue happens when the OSR conversion happens for the bottom-most frame in the stack. If the OSR frame has a caller in the stack then there is no issue on freezing/thawing. I added more details about this in the bug comments. > > When the OSR conversion happens for the bottom-most frame then a future freeze/thaw can lead to crashes for all cases: freeze_fast/thaw_fast, freeze_fast/thaw_slow, freeze_slow/thaw_slow. When freezing fast, either thawing fast or slow can lead to trying to read past the bottom of the stackChunk or writing below the allocated space in the stack. The freeze slow case is almost okay, except that it uncovered an invalid assert that is triggered if the size of the OSR frame plus all the other frames we freeze takes less space than the size of locals minus parameters of the interpreter frame that was OSR.
I also added more details about these in the bug comments. > > I tested different fixes, but I think the most straightforward one is to add _num_stack_arg_slots in the nmethod class and initialize it accordingly depending on whether the nmethod is an OSR one or not. > > The patch includes a new test that exercises all these possible combinations of OSR frame at bottom of stack or not, and then freezing fast/slow and thawing fast/slow. The bottom case where we freeze fast and thaw slow reproduces the originally reported crash. There are actually two different failure modes depending of whether this is a thaw top or return barrier case. The other bottom cases lead to the other crashes described in the bug comments. > The new test uncover another bug besides the OSR issues, but since it's a different one I filed a separate JBS issue (JDK-8329665) and I made this a dependent PR. > > I tested the current patch with the new test and also run it through mach5 tiers1-6. > > Thanks, > Patricio Patricio Chilano Mateo has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains two additional commits since the last revision: - Merge branch 'JDK-8329665' into JDK-8325469 - take ResourceMark out of debug only ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18637/files - new: https://git.openjdk.org/jdk/pull/18637/files/ab275358..dd2a1da8 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18637&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18637&range=02-03 Stats: 15 lines in 3 files changed: 1 ins; 14 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18637.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18637/head:pull/18637 PR: https://git.openjdk.org/jdk/pull/18637 From sgehwolf at openjdk.org Tue Apr 16 15:19:41 2024 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Tue, 16 Apr 2024 15:19:41 GMT Subject: RFR: 8261242: [Linux] OSContainer::is_containerized() returns true when run outside a container [v2] In-Reply-To: References: Message-ID: On Tue, 16 Apr 2024 14:40:46 GMT, Jan Kratochvil wrote: > IMHO `is_containerized()` is OK to return `false` even when running in a container but with no limitations set. The idea here is to use this property to tune OpenJDK for in-container, specifically k8s, use. In such a setup it's custom to run a single process within set resource constraints. In order to do this, we need a reliable way to distinguish that vs. non-containerized setup. If somebody really wants to run OpenJDK in a container expecting it to run like a physical OpenJDK deployment, that's when `-XX:-UseContainerSupport` should be used. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18201#issuecomment-2059344194 From kvn at openjdk.org Tue Apr 16 15:57:03 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 16 Apr 2024 15:57:03 GMT Subject: RFR: 8329433: Reduce nmethod header size [v3] In-Reply-To: <9wT-mL_BWh583PJEdw5DjgkbvqZB5abgPYsAUJMzTHA=.f62b51c8-b8c2-47b8-bcb5-57265523c75f@github.com> References: <64kGNHR9SmKW6rkPphO1my45Rte6w07v9V7Nf04GNN4=.0ac11f40-5e92-4367-82be-95410dca6ee5@github.com> <9wT-mL_BWh583PJEdw5DjgkbvqZB5abgPYsAUJMzTHA=.f62b51c8-b8c2-47b8-bcb5-57265523c75f@github.com> Message-ID: On Tue, 16 Apr 2024 06:48:05 GMT, Dean Long wrote: >> I thought about that but in both places where these accessors are called (`frame::get_native_monitor()` and `frame::get_native_receiver()`) there are such asserts already: >> https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/frame.cpp#L1085 > > OK, but I'd rather see it in the accessors too. Some users are checking for method()->is_native() and others are checking for is_osr_method(), so we need to make sure those are always mutually exclusive: method()->is_native() != is_osr_method(). We have separate `nmethod()` constructor for native method wrappers and it sets `_entry_bci = InvocationEntryBci;`. So it is impossible to have OSR native method wrapper. But I agree with adding assert into accessors to catch accidental usage for not native methods. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1567603094 From dlong at openjdk.org Tue Apr 16 16:23:59 2024 From: dlong at openjdk.org (Dean Long) Date: Tue, 16 Apr 2024 16:23:59 GMT Subject: RFR: 8330280: SharedRuntime::get_resolved_entry should not return c2i entry if the callee is special native intrinsic In-Reply-To: References: Message-ID: On Tue, 16 Apr 2024 12:35:42 GMT, Yudi Zheng wrote: > In https://github.com/openjdk/jdk/pull/18741 we return c2i entry for threads with interp_only_mode. 
This can be problematic for method handle intrinsics and continuation intrinsics, which cannot be interpreted. Consequently, we will cascade the c2i entry with an i2c entry and fail the runtime. The solution is to not return c2i entry under such circumstance. Marked as reviewed by dlong (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18799#pullrequestreview-2004110343 From kvn at openjdk.org Tue Apr 16 16:27:43 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 16 Apr 2024 16:27:43 GMT Subject: RFR: 8329433: Reduce nmethod header size [v4] In-Reply-To: References: Message-ID: On Tue, 16 Apr 2024 03:31:25 GMT, Vladimir Kozlov wrote: >> This is part of changes which try to reduce size of `nmethod` and `codeblob` data vs code in CodeCache. >> These changes reduced size of `nmethod` header from 288 to 232 bytes. From 304 to 248 in optimized VM: >> >> Statistics for 1282 bytecoded nmethods for C2: >> total in heap = 5560352 (100%) >> header = 389728 (7.009053%) >> >> vs >> >> Statistics for 1322 bytecoded nmethods for C2: >> total in heap = 8307120 (100%) >> header = 327856 (3.946687%) >> >> >> Several unneeded fields in `nmethod` and `CodeBlob` were removed. Some fields were changed from `int` to `int16_t` with added corresponding asserts to make sure their values are fit into 16 bits. >> >> I did additional cleanup after recent `CompiledMethod` removal. >> >> Tested tier1-7,stress,xcomp and performance testing. > > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Use 16-bits types for header_size and frame_complete_offset arguments @dean-long, thank you for review. I applied all your suggestions and push it after testing. 
------------- PR Review: https://git.openjdk.org/jdk/pull/18768#pullrequestreview-2004070926 From kvn at openjdk.org Tue Apr 16 16:27:47 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 16 Apr 2024 16:27:47 GMT Subject: RFR: 8329433: Reduce nmethod header size [v3] In-Reply-To: References: Message-ID: On Tue, 16 Apr 2024 06:13:59 GMT, Dean Long wrote: >> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Union fields which usages do not overlap > > src/hotspot/share/code/nmethod.cpp line 1235: > >> 1233: int skipped_insts_size = code_buffer->total_skipped_instructions_size(); >> 1234: #ifdef ASSERT >> 1235: assert(((skipped_insts_size >> 16) == 0), "size is bigger than 64Kb: %d", skipped_insts_size); > > Suggestion: > > > I think it's simpler just to use checked_cast below. Agree > src/hotspot/share/code/nmethod.cpp line 1240: > >> 1238: int consts_offset = code_buffer->total_offset_of(code_buffer->consts()); >> 1239: assert(consts_offset == 0, "const_offset: %d", consts_offset); >> 1240: #endif > > Suggestion: I assume you are suggesting to remove `#ifdef ASSERT`. Done. > src/hotspot/share/code/nmethod.cpp line 1241: > >> 1239: assert(consts_offset == 0, "const_offset: %d", consts_offset); >> 1240: #endif >> 1241: _skipped_instructions_size = (uint16_t)skipped_insts_size; > > Suggestion: > > _skipped_instructions_size = checked_cast(code_buffer->total_skipped_instructions_size()); Done. > src/hotspot/share/code/nmethod.cpp line 1441: > >> 1439: int deps_size = align_up((int)dependencies->size_in_bytes(), oopSize); >> 1440: int sum_size = oops_size + metadata_size + deps_size; >> 1441: assert((sum_size >> 16) == 0, "data size is bigger than 64Kb: %d", sum_size); > > I suggest using checked_cast for the assignment below, rather than special-purpose checks here. Okay. But I will put above code under `#ifdef ASSERT` then. 
> src/hotspot/share/code/nmethod.cpp line 1445: > >> 1443: _metadata_offset = (uint16_t)oops_size; >> 1444: _dependencies_offset = _metadata_offset + (uint16_t)metadata_size; >> 1445: _scopes_pcs_offset = _dependencies_offset + (uint16_t)deps_size; > > Use checked_cast instead of raw casts. okay > src/hotspot/share/code/nmethod.cpp line 1459: > >> 1457: assert((data_offset() + data_end_offset) <= nmethod_size, "wrong nmethod's size: %d < %d", nmethod_size, (data_offset() + data_end_offset)); >> 1458: >> 1459: _entry_offset = (uint16_t)offsets->value(CodeOffsets::Entry); > > Use checked_cast. done > src/hotspot/share/memory/heap.hpp line 58: > >> 56: void set_length(size_t length) { >> 57: LP64_ONLY( assert(((length >> 32) == 0), "sanity"); ) >> 58: _header._length = (uint32_t)length; > > Suggestion: > > _header._length = checked_castlength; Done. > src/hotspot/share/memory/heap.hpp line 63: > >> 61: // Accessors >> 62: void* allocated_space() const { return (void*)(this + 1); } >> 63: size_t length() const { return (size_t)_header._length; } > > This cast looks unnecessary. Agree. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1567619512 PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1567620520 PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1567620735 PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1567627565 PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1567636013 PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1567638682 PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1567644204 PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1567645140 From coleenp at openjdk.org Tue Apr 16 16:35:45 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 16 Apr 2024 16:35:45 GMT Subject: RFR: 8329433: Reduce nmethod header size [v4] In-Reply-To: References: Message-ID: On Tue, 16 Apr 2024 03:31:25 GMT, Vladimir Kozlov wrote: >> This is part of changes which try to reduce size of `nmethod` and `codeblob` data vs code in CodeCache. >> These changes reduced size of `nmethod` header from 288 to 232 bytes. From 304 to 248 in optimized VM: >> >> Statistics for 1282 bytecoded nmethods for C2: >> total in heap = 5560352 (100%) >> header = 389728 (7.009053%) >> >> vs >> >> Statistics for 1322 bytecoded nmethods for C2: >> total in heap = 8307120 (100%) >> header = 327856 (3.946687%) >> >> >> Several unneeded fields in `nmethod` and `CodeBlob` were removed. Some fields were changed from `int` to `int16_t` with added corresponding asserts to make sure their values are fit into 16 bits. >> >> I did additional cleanup after recent `CompiledMethod` removal. >> >> Tested tier1-7,stress,xcomp and performance testing. 
> > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Use 16-bits types for header_size and frame_complete_offset arguments src/hotspot/share/code/codeBlob.cpp line 106: > 104: > 105: // Simple CodeBlob used for simple BufferBlob. > 106: CodeBlob::CodeBlob(const char* name, CodeBlobKind kind, int size, uint16_t header_size) : Just a drive-by comment. You might be able to use delegating constructors for CodeBlob so you don't have to have the field initializations twice. Maybe the same for nmethod ? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1567658758 From sgibbons at openjdk.org Tue Apr 16 17:50:45 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Tue, 16 Apr 2024 17:50:45 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v21] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: On Tue, 16 Apr 2024 00:04:15 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. >> >> Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). >> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Add enter() and leave(); remove Windows-specific register stuff I ran the benchmark again using a branch from openjdk today against this modification. Performance increased an average of 4.98x (max - 7.52x (unsafe 64-byte aligned), min - 3.02x (panama 255-byte aligned), stddev - 0.94x). 
I'll integrate after a second review has been done. Thanks! ------------- PR Comment: https://git.openjdk.org/jdk/pull/18555#issuecomment-2059637042 From stuefe at openjdk.org Tue Apr 16 18:29:04 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 16 Apr 2024 18:29:04 GMT Subject: RFR: 8261242: [Linux] OSContainer::is_containerized() returns true when run outside a container [v2] In-Reply-To: References: Message-ID: <8MpoLKDw6usz92EBH9R1XWfnX0E7NU5fd2dv8tob2ho=.455c310f-cadb-484d-a40f-6fd7e2c0811c@github.com> On Thu, 11 Apr 2024 12:08:02 GMT, Severin Gehwolf wrote: >> Please review this enhancement to the container detection code which allows it to figure out whether the JVM is actually running inside a container (`podman`, `docker`, `crio`), or with some other means that enforces memory/cpu limits by means of the cgroup filesystem. If neither of those conditions hold, the JVM runs in not containerized mode, addressing the issue described in the JBS tracker. For example, on my Linux system `is_containerized() == false" is being indicated with the following trace log line: >> >> >> [0.001s][debug][os,container] OSContainer::init: is_containerized() = false because no cpu or memory limit is present >> >> >> This state is being exposed by the Java `Metrics` API class using the new (still JDK internal) `isContainerized()` method. Example: >> >> >> java -XshowSettings:system --version >> Operating System Metrics: >> Provider: cgroupv1 >> System not containerized. >> openjdk 23-internal 2024-09-17 >> OpenJDK Runtime Environment (fastdebug build 23-internal-adhoc.sgehwolf.jdk-jdk) >> OpenJDK 64-Bit Server VM (fastdebug build 23-internal-adhoc.sgehwolf.jdk-jdk, mixed mode, sharing) >> >> >> The basic property this is being built on is the observation that the cgroup controllers typically get mounted read only into containers. Note that the current container tests assert that `OSContainer::is_containerized() == true` in various tests. 
Therefore, using the heuristic of "is any memory or cpu limit present" isn't sufficient. I had considered that in an earlier iteration, but many container tests failed. >> >> Overall, I think, with this patch we improve the current situation of claiming a containerized system being present when it's actually just a regular Linux system. >> >> Testing: >> >> - [x] GHA (risc-v failure seems infra related) >> - [x] Container tests on Linux x86_64 of cgroups v1 and cgroups v2 (including gtests) >> - [x] Some manual testing using cri-o >> >> Thoughts? > > Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains ten additional commits since the last revision: > > - Merge branch 'master' into jdk-8261242-is-containerized-fix > - jcheck fixes > - Fix tests > - Implement Metrics.isContainerized() > - Some clean-up > - Drop cgroups testing on plain Linux > - Implement fall-back logic for non-ro controller mounts > - Make find_ro static and local to compilation unit > - 8261242: [Linux] OSContainer::is_containerized() returns true I am not enough of a container expert to judge if the basic approach is right - I trust you on this. This is just a technical code review. src/hotspot/os/linux/cgroupSubsystem_linux.cpp line 170: > 168: } > 169: } > 170: return false; An alternative, simpler, no need for modifying source string: static bool find_ro_opt(const char* o) { return strcmp(o, "ro") || strstr(o, ",ro") || strstr(o, "ro,"); } src/hotspot/os/linux/cgroupSubsystem_linux.cpp line 351: > 349: // > 350: // We collect the read only mount option in the cgroup infos so as to have that > 351: // info ready when determining is_containerized(). Here, and in other places: a comment indicating the line format we scan would be appreciated, possibly with argument numbers. Saves the casual code reader from looking into proc man page. 
Even just pasting the example line for proc manpage would be fine (https://man7.org/linux/man-pages/man5/proc.5.html) (but with order adapted to your scanf call, they count major:minor as one) src/hotspot/os/linux/osContainer_linux.cpp line 78: > 76: const char *reason; > 77: bool any_mem_cpu_limit_present = false; > 78: bool ctrl_ro = cgroup_subsystem->is_containerized(); nit: naming? what does ctrl mean in this case? Maybe use "cgroup_is_containerized"? src/java.base/share/classes/sun/launcher/LauncherHelper.java line 375: > 373: if (!c.isContainerized()) { > 374: ostream.println(INDENT + "System not containerized."); > 375: return; Why return here? Would this not cut the output short in the non-containerized case? And if this not intended, the not-containerized-`-XshowSettings:system` test below should test and catch this (e.g. scan for CPU set) ------------- PR Review: https://git.openjdk.org/jdk/pull/18201#pullrequestreview-1999328503 PR Review Comment: https://git.openjdk.org/jdk/pull/18201#discussion_r1564182879 PR Review Comment: https://git.openjdk.org/jdk/pull/18201#discussion_r1567756663 PR Review Comment: https://git.openjdk.org/jdk/pull/18201#discussion_r1567774124 PR Review Comment: https://git.openjdk.org/jdk/pull/18201#discussion_r1567779248 From stuefe at openjdk.org Tue Apr 16 18:29:05 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 16 Apr 2024 18:29:05 GMT Subject: RFR: 8261242: [Linux] OSContainer::is_containerized() returns true when run outside a container [v2] In-Reply-To: <8MpoLKDw6usz92EBH9R1XWfnX0E7NU5fd2dv8tob2ho=.455c310f-cadb-484d-a40f-6fd7e2c0811c@github.com> References: <8MpoLKDw6usz92EBH9R1XWfnX0E7NU5fd2dv8tob2ho=.455c310f-cadb-484d-a40f-6fd7e2c0811c@github.com> Message-ID: On Sat, 13 Apr 2024 18:29:59 GMT, Thomas Stuefe wrote: >> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains ten additional commits since the last revision: >> >> - Merge branch 'master' into jdk-8261242-is-containerized-fix >> - jcheck fixes >> - Fix tests >> - Implement Metrics.isContainerized() >> - Some clean-up >> - Drop cgroups testing on plain Linux >> - Implement fall-back logic for non-ro controller mounts >> - Make find_ro static and local to compilation unit >> - 8261242: [Linux] OSContainer::is_containerized() returns true > > src/hotspot/os/linux/cgroupSubsystem_linux.cpp line 170: > >> 168: } >> 169: } >> 170: return false; > > An alternative, simpler, no need for modifying source string: > > static bool find_ro_opt(const char* o) { > return strcmp(o, "ro") || strstr(o, ",ro") || strstr(o, "ro,"); > } Please disregard my comment. Albeit longer, your version is clearer to read and more fault tolerant. > src/hotspot/os/linux/cgroupSubsystem_linux.cpp line 351: > >> 349: // >> 350: // We collect the read only mount option in the cgroup infos so as to have that >> 351: // info ready when determining is_containerized(). > > Here, and in other places: a comment indicating the line format we scan would be appreciated, possibly with argument numbers. Saves the casual code reader from looking into proc man page. Even just pasting the example line for proc manpage would be fine (https://man7.org/linux/man-pages/man5/proc.5.html) (but with order adapted to your scanf call, they count major:minor as one) Trying to parse the `%s%*[^-]-` So, %s parses the mount options, until we encounter whitespace. Then %*[^-]- parses everything that is not a dash, until we encounter the dash? Then we eat the dash? This is to skip the optionals? 
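The reading of `%s%*[^-]-` given above can be checked with a small standalone sketch. It is not the code from the PR: the struct, the function names and the buffer sizes are made up for illustration, and the format string is only assumed to resemble the one under review. It scans one `/proc/self/mountinfo`-style line (the sample line from the proc(5) man page works), skips the optional fields up to the `-` separator, and then looks for `ro` as a whole comma-separated token without modifying the option string. A token-wise check is used because a plain substring test such as `strstr(o, "ro,")` would also match inside a longer option like `errors=remount-ro,`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical container for the fields we keep from one mountinfo line.
struct MountInfo {
  char root[256];
  char mount_point[256];
  char mount_opts[256];
  char fs_type[64];
  char super_opts[256];
};

// Line layout (see proc(5)):
//   mount-id parent-id major:minor root mount-point mount-opts [optional...] - fs-type source super-opts
// "%*d %*d %*d:%*d" skips the ids, the three "%255s" read root, mount point and
// mount options, "%*[^-]" skips everything that is not a dash (the optional
// fields), "-" consumes the separator, then fs type, skipped source, super options.
bool parse_mountinfo_line(const char* line, MountInfo* out) {
  assert(line != nullptr && out != nullptr);
  int matched = std::sscanf(line,
                            "%*d %*d %*d:%*d %255s %255s %255s%*[^-]- %63s %*s %255s",
                            out->root, out->mount_point, out->mount_opts,
                            out->fs_type, out->super_opts);
  return matched == 5;  // suppressed (%*) conversions are not counted
}

// True if "ro" appears as a complete comma-separated option.
// The input string is only read, never modified.
bool has_ro_option(const char* opts) {
  const char* p = opts;
  while (*p != '\0') {
    const char* end = std::strchr(p, ',');
    std::size_t tok_len = (end != nullptr) ? static_cast<std::size_t>(end - p)
                                           : std::strlen(p);
    if (tok_len == 2 && std::strncmp(p, "ro", 2) == 0) {
      return true;
    }
    if (end == nullptr) {
      break;
    }
    p = end + 1;
  }
  return false;
}
```

Calling `parse_mountinfo_line()` on a cgroup controller line and then `has_ro_option(mi.mount_opts)` would give the per-controller read-only flag that the review discusses; how the PR actually stores that flag is not shown here.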
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18201#discussion_r1567754861 PR Review Comment: https://git.openjdk.org/jdk/pull/18201#discussion_r1567767209 From kvn at openjdk.org Tue Apr 16 18:36:11 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 16 Apr 2024 18:36:11 GMT Subject: RFR: 8329433: Reduce nmethod header size [v5] In-Reply-To: References: Message-ID: > This is part of changes which try to reduce size of `nmethod` and `codeblob` data vs code in CodeCache. > These changes reduced size of `nmethod` header from 288 to 232 bytes. From 304 to 248 in optimized VM: > > Statistics for 1282 bytecoded nmethods for C2: > total in heap = 5560352 (100%) > header = 389728 (7.009053%) > > vs > > Statistics for 1322 bytecoded nmethods for C2: > total in heap = 8307120 (100%) > header = 327856 (3.946687%) > > > Several unneeded fields in `nmethod` and `CodeBlob` were removed. Some fields were changed from `int` to `int16_t` with added corresponding asserts to make sure their values are fit into 16 bits. > > I did additional cleanup after recent `CompiledMethod` removal. > > Tested tier1-7,stress,xcomp and performance testing. Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: Address comments. Used checked_cast. 
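For readers following the `checked_cast` discussion, here is a minimal standalone sketch of the idea: narrow a value and assert that nothing was lost, instead of a raw `(uint16_t)` cast plus a hand-written shift check. HotSpot has its own `checked_cast` utility; the helpers below (`fits_in`, `checked_narrow`, `HeaderOffsets`, `compute_offsets`) are hypothetical names written for this illustration and are not the HotSpot implementation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if 'value' survives a round trip through To.
// Note: a round-trip test does not catch sign changes between types of the
// same width; it is meant for narrowing, as in int -> uint16_t.
template <typename To, typename From>
constexpr bool fits_in(From value) {
  return static_cast<From>(static_cast<To>(value)) == value;
}

// Hypothetical stand-in for a checked narrowing cast: asserts in debug
// builds, behaves like static_cast when assertions are compiled out.
template <typename To, typename From>
inline To checked_narrow(From value) {
  assert(fits_in<To>(value) && "value does not fit in destination type");
  return static_cast<To>(value);
}

// Illustrative 16-bit offsets, loosely modeled on the fields quoted in the review.
struct HeaderOffsets {
  uint16_t metadata_offset;
  uint16_t dependencies_offset;
  uint16_t scopes_pcs_offset;
};

inline HeaderOffsets compute_offsets(int oops_size, int metadata_size, int deps_size) {
  HeaderOffsets h;
  // Each sum is computed in int and checked once when it is narrowed.
  h.metadata_offset     = checked_narrow<uint16_t>(oops_size);
  h.dependencies_offset = checked_narrow<uint16_t>(oops_size + metadata_size);
  h.scopes_pcs_offset   = checked_narrow<uint16_t>(oops_size + metadata_size + deps_size);
  return h;
}
```

Checking at the point of assignment keeps the range check next to the field it protects, which is the argument made in the review for preferring it over separate `(x >> 16) == 0` asserts.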
------------- Changes: - all: https://git.openjdk.org/jdk/pull/18768/files - new: https://git.openjdk.org/jdk/pull/18768/files/6cb22e81..8405a23d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18768&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18768&range=03-04 Stats: 32 lines in 3 files changed: 4 ins; 11 del; 17 mod Patch: https://git.openjdk.org/jdk/pull/18768.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18768/head:pull/18768 PR: https://git.openjdk.org/jdk/pull/18768 From pchilanomate at openjdk.org Tue Apr 16 18:41:11 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 16 Apr 2024 18:41:11 GMT Subject: RFR: 8325469: Freeze/Thaw code can crash in the presence of OSR frames [v5] In-Reply-To: References: Message-ID: > Freeze/thaw code assumes that a compiled frame for a method where num_stack_arg_slots() > 0 will always have the arguments setup above the metadata at the bottom of the frame. But when converting an interpreter frame to a compiled frame during OSR we don't explicitly leave room for the stack arguments after popping the interpreter frame. All parameters needed will be read from the "buf" array and stored inside the frame before calling OSR_migration_end(). > > This mismatch in how the stack looks and what we assume can lead to different crashes. In particular the issue happens when the OSR conversion happens for the bottom-most frame in the stack. If the OSR frame has a caller in the stack then there is no issue on freezing/thawing. I added more details about this in the bug comments. > > When the OSR conversion happens for the bottom-most frame then a future freeze/thaw can lead to crashes for all cases: freeze_fast/thaw_fast, freeze_fast/thaw_slow, freeze_slow/thaw_slow. When freezing fast, either thawing fast or slow can lead to trying to read past the bottom of the stackChunk or writing below the allocated space in the stack. 
The freeze slow case is almost okay, except that it uncovered an invalid assert that is triggered if the size of the OSR frame plus all the other frames we freeze takes less space than the size of locals minus parameters of the interpreter frame that was OSR. I also added more details about these in the bug comments. > > I tested different fixes, but I think the most straightforward one is to add _num_stack_arg_slots in the nmethod class and initialize it accordingly depending on whether the nmethod is an OSR one or not. > > The patch includes a new test that exercises all these possible combinations of OSR frame at bottom of stack or not, and then freezing fast/slow and thawing fast/slow. The bottom case where we freeze fast and thaw slow reproduces the originally reported crash. There are actually two different failure modes depending of whether this is a thaw top or return barrier case. The other bottom cases lead to the other crashes described in the bug comments. > The new test uncover another bug besides the OSR issues, but since it's a different one I filed a separate JBS issue (JDK-8329665) and I made this a dependent PR. > > I tested the current patch with the new test and also run it through mach5 tiers1-6. > > Thanks, > Patricio Patricio Chilano Mateo has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains seven commits: - Merge branch 'master' into JDK-8325469 - Merge branch 'JDK-8329665' into JDK-8325469 - take ResourceMark out of debug only - use WhiteBox to verify OSR compilation - fix comment - v1 - v1 ------------- Changes: https://git.openjdk.org/jdk/pull/18637/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18637&range=04 Stats: 270 lines in 16 files changed: 246 ins; 5 del; 19 mod Patch: https://git.openjdk.org/jdk/pull/18637.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18637/head:pull/18637 PR: https://git.openjdk.org/jdk/pull/18637 From kvn at openjdk.org Tue Apr 16 18:58:01 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 16 Apr 2024 18:58:01 GMT Subject: RFR: 8329433: Reduce nmethod header size [v4] In-Reply-To: References: Message-ID: On Tue, 16 Apr 2024 16:33:18 GMT, Coleen Phillimore wrote: >> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Use 16-bits types for header_size and frame_complete_offset arguments > > src/hotspot/share/code/codeBlob.cpp line 106: > >> 104: >> 105: // Simple CodeBlob used for simple BufferBlob. >> 106: CodeBlob::CodeBlob(const char* name, CodeBlobKind kind, int size, uint16_t header_size) : > > Just a drive-by comment. You might be able to use delegating constructors for CodeBlob so you don't have to have the field initializations twice. Maybe the same for nmethod ? Thank you, @coleenp, foe looking on these changes. Which fields are initialized twice? Only `_oop_maps` is set to `nullptr` before we proper build oop maps in first constructor. The only saving could be lines of code but then I would have to check that `cb != nullptr` and do other additional checks which I don't think will save much lines. Separation of `nmethod` constructor for native wrappers is helping clear see the difference and I would like to keep them separate. 
We have `init_defaults()` method for similar code and I can move more code into it from both constructors. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1567814189 From coleenp at openjdk.org Tue Apr 16 19:06:05 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 16 Apr 2024 19:06:05 GMT Subject: RFR: 8329433: Reduce nmethod header size [v4] In-Reply-To: References: Message-ID: <73tOotwsnkVfT0_0moY8U333EHAjV2MJ3amY6mMF58Y=.a5003aa6-4aea-4e20-82cc-0466403a9503@github.com> On Tue, 16 Apr 2024 18:54:40 GMT, Vladimir Kozlov wrote: >> src/hotspot/share/code/codeBlob.cpp line 106: >> >>> 104: >>> 105: // Simple CodeBlob used for simple BufferBlob. >>> 106: CodeBlob::CodeBlob(const char* name, CodeBlobKind kind, int size, uint16_t header_size) : >> >> Just a drive-by comment. You might be able to use delegating constructors for CodeBlob so you don't have to have the field initializations twice. Maybe the same for nmethod ? > > Thank you, @coleenp, foe looking on these changes. > > Which fields are initialized twice? Only `_oop_maps` is set to `nullptr` before we proper build oop maps in first constructor. > > The only saving could be lines of code but then I would have to check that `cb != nullptr` and do other additional checks which I don't think will save much lines. > > Separation of `nmethod` constructor for native wrappers is helping clear see the difference and I would like to keep them separate. We have `init_defaults()` method for similar code and I can move more code into it from both constructors. Delegating constructors are the answer to having some common 'init' functions. It would simply save lines of code that look the same in both constructor initializer lists. But it's a drive-by comment. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1567823393 From vlivanov at openjdk.org Tue Apr 16 19:06:10 2024 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Tue, 16 Apr 2024 19:06:10 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v16] In-Reply-To: References: <2ReTrE0inVkfcPNrq6JVrGRkoFuOZsLK6Ir0ZAnd_Kk=.13903f65-b747-4c38-9572-91e132ebd424@github.com> Message-ID: On Tue, 16 Apr 2024 10:54:52 GMT, Andrew Haley wrote: > I wonder, could you do me a little favour? Please run the performance tests with -XX:-UseSecondarySuperCache. Thanks. Sure, I'll let you know once the testing is over. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18309#issuecomment-2059749332 From kvn at openjdk.org Tue Apr 16 19:24:03 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 16 Apr 2024 19:24:03 GMT Subject: RFR: 8329433: Reduce nmethod header size [v4] In-Reply-To: <73tOotwsnkVfT0_0moY8U333EHAjV2MJ3amY6mMF58Y=.a5003aa6-4aea-4e20-82cc-0466403a9503@github.com> References: <73tOotwsnkVfT0_0moY8U333EHAjV2MJ3amY6mMF58Y=.a5003aa6-4aea-4e20-82cc-0466403a9503@github.com> Message-ID: On Tue, 16 Apr 2024 19:03:01 GMT, Coleen Phillimore wrote: >> Thank you, @coleenp, foe looking on these changes. >> >> Which fields are initialized twice? Only `_oop_maps` is set to `nullptr` before we proper build oop maps in first constructor. >> >> The only saving could be lines of code but then I would have to check that `cb != nullptr` and do other additional checks which I don't think will save much lines. >> >> Separation of `nmethod` constructor for native wrappers is helping clear see the difference and I would like to keep them separate. We have `init_defaults()` method for similar code and I can move more code into it from both constructors. > > Delegating constructors are the answer to having some common 'init' functions. It would simply save lines of code that look the same in both constructor initializer lists. 
But it's a drive-by comment. Okay. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1567841490 From kvn at openjdk.org Tue Apr 16 19:50:01 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 16 Apr 2024 19:50:01 GMT Subject: RFR: 8329433: Reduce nmethod header size [v4] In-Reply-To: References: <73tOotwsnkVfT0_0moY8U333EHAjV2MJ3amY6mMF58Y=.a5003aa6-4aea-4e20-82cc-0466403a9503@github.com> Message-ID: On Tue, 16 Apr 2024 19:20:37 GMT, Vladimir Kozlov wrote: >> Delegating constructors are the answer to having some common 'init' functions. It would simply save lines of code that look the same in both constructor initializer lists. But it's a drive-by comment. > > Okay. It is tempting to do for `nmethod` to replace `init_defaults()`. I will look what can be done. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1567866651 From kvn at openjdk.org Tue Apr 16 22:15:02 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 16 Apr 2024 22:15:02 GMT Subject: RFR: 8329433: Reduce nmethod header size [v4] In-Reply-To: References: <73tOotwsnkVfT0_0moY8U333EHAjV2MJ3amY6mMF58Y=.a5003aa6-4aea-4e20-82cc-0466403a9503@github.com> Message-ID: <6SY8mNq-z-vgrB2djgCc3mDTZpwniPi0lVy_sVULu84=.3882b3c9-80b8-4171-ae3d-a1f9d8261c6a@github.com> On Tue, 16 Apr 2024 19:47:06 GMT, Vladimir Kozlov wrote: >> Okay. > > It is tempting to do for `nmethod` to replace `init_defaults()`. I will look what can be done. It does not work. It does not allow fields initialization after delegation: src/hotspot/share/code/nmethod.cpp: In constructor 'nmethod::nmethod(Method*, CompilerType, int, int, CodeOffsets*, CodeBuffer*, int, ByteSize, ByteSize, OopMapSet*)': src/hotspot/share/code/nmethod.cpp:1232:56: error: mem-initializer for 'nmethod::::::_native_receiver_sp_offset' follows constructor delegation Yes, I can move some fields initialization into body - but then it will be duplicated initialization. 
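The compiler error quoted above reflects a C++ language rule: when a constructor delegates to another constructor, the delegation has to be the only entry in its mem-initializer list. The following is a minimal sketch of that rule using a made-up `Blob` class; it is not the `CodeBlob` or `nmethod` code, and the field names are illustrative only.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical class, used only to illustrate constructor delegation.
class Blob {
 public:
  // Target constructor: takes the full argument list and initializes
  // every field exactly once.
  Blob(const char* name, int size, uint16_t header_size, int frame_size)
    : _name(name),
      _size(size),
      _header_size(header_size),
      _frame_size(frame_size) {
    assert(size >= 0);
  }

  // Delegating constructor: supplies a default for the missing argument.
  // The delegation must stand alone in the initializer list.
  Blob(const char* name, int size, uint16_t header_size)
    : Blob(name, size, header_size, 0) {}

  // Not allowed (this is the shape that triggers the quoted error):
  //   Blob(const char* name, int size, uint16_t header_size)
  //     : Blob(name, size, header_size, 0), _frame_size(0) {}
  // Any extra per-constructor setup would have to go into the body,
  // where it becomes an assignment after the field was already initialized.

  const char* name() const { return _name; }
  int size() const { return _size; }
  uint16_t header_size() const { return _header_size; }
  int frame_size() const { return _frame_size; }

 private:
  const char* _name;
  int _size;
  uint16_t _header_size;
  int _frame_size;
};
```

This works well when one constructor is a strict superset of the others. When two constructors set largely different fields, as described for the `nmethod` case, a shared init helper can end up shorter than a single wide target constructor.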
From reading the net, the delegated-to constructor should take the largest number of arguments, with default values supplied by the delegating constructors. That is not easily applicable to `nmethod` constructors: it would add more lines of code than it removes. The `nmethod` constructors we have are too big and different enough that delegation will not improve the code.

I will keep `init_defaults()` and extend it to reduce common code in constructors.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1567992881

From mikael.vidstedt at oracle.com Tue Apr 16 22:51:32 2024
From: mikael.vidstedt at oracle.com (Mikael Vidstedt)
Date: Tue, 16 Apr 2024 22:51:32 +0000
Subject: CFV: New HotSpot Group Member: Andrew Dinn
In-Reply-To:
References:
Message-ID: <62F0D4D3-30B9-4C9E-8A52-DB526316AB3A@oracle.com>

Vote: yes

Cheers,
Mikael

On Apr 11, 2024, at 6:24 AM, Thomas Stuefe wrote:

Hi,

I hereby nominate Andrew Dinn (adinn) to Membership in the HotSpot Group.

Andrew is a well-known and respected member of the OpenJDK community. He has been a contributor since the early days of OpenJDK. The history of his contributions has been mangled by various SCM moves and repo consolidations over the years [1], but he was one of the original authors of the arm64 port ([2] shows 359 changes in the mercurial hotspot sub repository alone), contributed JEP 352 (support for NVM devices under byte buffers), and more recently has been active in the Graal and the Leyden projects.

Votes are due by April 25, 2024.

Only current Members of the HotSpot Group [3] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list.

For Lazy Consensus voting instructions, see [4].
Cheers, Thomas [1] https://github.com/openjdk/jdk/commits/master/?author=adinn [2] https://hg.openjdk.org/aarch64-port/jdk7u/hotspot [3] https://openjdk.org/census#members [4] https://openjdk.org/groups/#member-vote -------------- next part -------------- An HTML attachment was scrubbed... URL: From mikael.vidstedt at oracle.com Tue Apr 16 22:51:58 2024 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Tue, 16 Apr 2024 22:51:58 +0000 Subject: CFV: New HotSpot Group Member: Fredrik Bredberg In-Reply-To: <0291F74B-D724-4B97-B9D0-5FC57FA0F302@oracle.com> References: <0291F74B-D724-4B97-B9D0-5FC57FA0F302@oracle.com> Message-ID: <8DC66E31-0D6B-4C55-AF55-FE7C8194BC70@oracle.com> Vote: yes Cheers, Mikael > On Apr 10, 2024, at 5:24?AM, Jesper Wilhelmsson wrote: > > I hereby nominate Fredrik Bredberg (fbredberg) to Membership in the HotSpot Group. > > Fredrik is a Committer in the JDK project, and a member of the Oracle JVM Runtime team. Fredrik has mainly focused his efforts in the Loom area and is frequently helping out with platform specific (including assembler) code for other areas as well. > > Votes are due by April 24, 2024. > > Only current Members of the HotSpot Group [1] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [2]. > > Thanks, > /Jesper > > [1] https://openjdk.org/census > [2] https://openjdk.org/groups/#member-vote From mikael.vidstedt at oracle.com Tue Apr 16 22:52:22 2024 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Tue, 16 Apr 2024 22:52:22 +0000 Subject: CFV: New HotSpot Group Member: Afshin Zafari In-Reply-To: <5088FFE6-F5E5-4B57-8FB9-B5F6672C7D7F@oracle.com> References: <5088FFE6-F5E5-4B57-8FB9-B5F6672C7D7F@oracle.com> Message-ID: Vote: yes Cheers, Mikael > On Apr 10, 2024, at 5:24?AM, Jesper Wilhelmsson wrote: > > I hereby nominate Afshin Zafari (azafari) to Membership in the HotSpot Group. 
> > Afshin is a Committer in the JDK project, and a member of the Oracle JVM Runtime team. He has fixed 42 issues including several significant changes in various parts of the JVM runtime and has lately focused on NMT improvements. > > Votes are due by April 24, 2024. > > Only current Members of the HotSpot Group [1] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [2]. > > Thanks, > /Jesper > > [1] https://openjdk.org/census > [2] https://openjdk.org/groups/#member-vote From never at openjdk.org Wed Apr 17 00:20:45 2024 From: never at openjdk.org (Tom Rodriguez) Date: Wed, 17 Apr 2024 00:20:45 GMT Subject: RFR: 8317368: [JVMCI] SIGSEGV in JVMCIEnv::initialize_installed_code on libgraal [v5] In-Reply-To: References: Message-ID: On Mon, 15 Apr 2024 19:30:26 GMT, Tom Rodriguez wrote: >> This fixes some lurking issues with JVMCI and nmethod related both BarrierSetNMethod and the garbage collection of nmethods. In particular the stack walking in c2v_iterateFrames visits many frames and needs a KeepStackGCProcessedMark for safety. Additionally, JVMCI interacts with nmethods in complex ways and needs some sort of strong root during these interactions. A new JavaThread field has been added that mirrors the way JVMTI keeps nmethods alive. > > Tom Rodriguez has updated the pull request incrementally with one additional commit since the last revision: > > Update some comments Thanks for the reviews. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17714#issuecomment-2060104716 From never at openjdk.org Wed Apr 17 00:20:46 2024 From: never at openjdk.org (Tom Rodriguez) Date: Wed, 17 Apr 2024 00:20:46 GMT Subject: Integrated: 8317368: [JVMCI] SIGSEGV in JVMCIEnv::initialize_installed_code on libgraal In-Reply-To: References: Message-ID: <91pDTjfcsu7QnWLLIo5FhSKwVXq8SmJrUgPVcS67EPc=.a0a3c58d-69dd-418f-8b1f-83fc512162d5@github.com> On Mon, 5 Feb 2024 20:24:32 GMT, Tom Rodriguez wrote: > This fixes some lurking issues with JVMCI and nmethod related both BarrierSetNMethod and the garbage collection of nmethods. In particular the stack walking in c2v_iterateFrames visits many frames and needs a KeepStackGCProcessedMark for safety. Additionally, JVMCI interacts with nmethods in complex ways and needs some sort of strong root during these interactions. A new JavaThread field has been added that mirrors the way JVMTI keeps nmethods alive. This pull request has now been integrated. Changeset: f6f038a6 Author: Tom Rodriguez URL: https://git.openjdk.org/jdk/commit/f6f038a678c450e1157247344fb0984c7bcaa11d Stats: 96 lines in 7 files changed: 74 ins; 8 del; 14 mod 8317368: [JVMCI] SIGSEGV in JVMCIEnv::initialize_installed_code on libgraal Reviewed-by: dnsimon, kvn, eosterlund ------------- PR: https://git.openjdk.org/jdk/pull/17714 From sspitsyn at openjdk.org Wed Apr 17 00:35:10 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 17 Apr 2024 00:35:10 GMT Subject: RFR: 8330303: Crash: assert(_target_jt == nullptr || _target_jt->vthread() == target_h()) failed Message-ID: This is a simple fix of three similar asserts. The `_target_jt->jvmti_vthread()` has to be used instead of `_target_jt->vthread()`. The `_target_jt->vthread()` can be outdated in some specific contexts as shown in the `hs_err` stack trace. I've seen similar issue and already fixed it in this fragment of code: class GetCurrentLocationClosure : public JvmtiUnitedHandshakeClosure { . . . 
void do_vthread(Handle target_h) { assert(_target_jt == nullptr || !_target_jt->is_exiting(), "sanity check"); // use jvmti_vthread() as vthread() can be outdated assert(_target_jt == nullptr || _target_jt->jvmti_vthread() == target_h(), "sanity check"); . . . The issue above was fixed by replacing `_target_jt->vthread()` with `_target_jt->jvmti_vthread()`. There are three places which need to be fixed the same way: - `GetSingleStackTraceClosure::do_vthread(Handle target_h)` - `SetForceEarlyReturn::do_vthread(Handle target_h)` - `UpdateForPopTopFrameClosure::do_vthread(Handle target_h)` Testing: - Run mach5 tiers 1-6 ------------- Commit messages: - add comments explaining that the vthread() can return outdated oop - 8330303: Crash: assert(_target_jt == nullptr || _target_jt->vthread() == target_h()) failed Changes: https://git.openjdk.org/jdk/pull/18806/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18806&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8330303 Stats: 6 lines in 2 files changed: 3 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/18806.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18806/head:pull/18806 PR: https://git.openjdk.org/jdk/pull/18806 From pchilanomate at openjdk.org Wed Apr 17 00:49:07 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 17 Apr 2024 00:49:07 GMT Subject: RFR: 8329088: Stack chunk thawing races with concurrent GC stack iteration [v3] In-Reply-To: References: Message-ID: On Tue, 16 Apr 2024 09:36:13 GMT, Erik ?sterlund wrote: >> When we thaw the last frame from a stack chunk, we non-atomically set the stack pointer (sp), and set its argsize to 0. Unfortunately, GC threads may iterate over the frames of the stack chunk concurrently. When initializing their stack frame iterator, they read the sp and argsize racingly. 
Since there is no synchronization between the threads, we may observe inconsistent pairs of sp and argsize, for example the updated sp with a stale argsize, or the updated argsize with a stale sp.
>>
>> At the core of the problem, the stack chunks define sp and argsize. The argsize is used to calculate where the bottom of the stack chunk is, which is required to determine if it is empty or not. This patch proposes to switch things around and store the bottom directly in the chunk, instead of argsize. Instead, argsize is calculated from the bottom. By changing the relationship of which property is stored and which property is calculated, we can simplify this code quite a bit.
>>
>> In the new model, is_empty() is true iff sp and bottom are exactly the same. Bottom is only set during freezing, never during thawing. The bottom is initialized whenever the bottom frame is frozen, and left untouched during thawing. Unlike thawing, the freeze operation does not race with the GC by design. Hence we have moved one of the racy mutations to the operation that doesn't race with the GC. The GC is now only exposed to changing sp(). It doesn't matter if it observes the old or new sp(), now that we have removed the only source of inconsistency describing said frame (racing argsize).
>>
>> Testing: tier1-5, manual testing of test/jdk/jdk/internal/vm/Continuation
>
> Erik Österlund has updated the pull request incrementally with one additional commit since the last revision:
>
>   Partricio fixes

Thanks Erik, looks good to me. Some extra comments.

src/hotspot/share/runtime/continuationFreezeThaw.cpp line 624:

> 622:     freeze_fast_copy(chunk, chunk_start_sp CONT_JFR_ONLY(COMMA false));
> 623:   } else { // the chunk is empty
> 624:     const int chunk_start_sp = chunk->stack_size() - frame::metadata_words_at_top;

Do we need the minus frame::metadata_words_at_top?
Since the chunk is empty, I would expect the chunk_start_sp to be the same as for the new chunk case, which is just chunk->stack_size().

-------------

Marked as reviewed by pchilanomate (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/18643#pullrequestreview-2004791719
PR Review Comment: https://git.openjdk.org/jdk/pull/18643#discussion_r1568074330

From pchilanomate at openjdk.org Wed Apr 17 00:49:08 2024
From: pchilanomate at openjdk.org (Patricio Chilano Mateo)
Date: Wed, 17 Apr 2024 00:49:08 GMT
Subject: RFR: 8329088: Stack chunk thawing races with concurrent GC stack iteration [v2]
In-Reply-To:
References:
Message-ID:

On Tue, 16 Apr 2024 09:19:08 GMT, Erik Österlund wrote:

>> src/hotspot/share/runtime/continuationFreezeThaw.cpp line 567:
>>
>>> 565:     // Consider leaving the chunk's argsize set when emptying it and removing the following branch,
>>> 566:     // although that would require changing stackChunkOopDesc::is_empty
>>> 567:     if (!chunk->is_empty()) {
>>
>> Seems you have implemented the suggestion in the comment so we can remove this branch and unconditionally decrement total_size_needed.
>
> I currently have an assert that checks that you shouldn't be asking for the argsize() if the chunk is empty, because it is so error prone. I think I'd like to keep the assert though - it was quite useful.

We should be okay since _cont.argsize gets it from the ContinuationEntry. I tested it a bit, and we would also need to update _fast_freeze_size to be cont_size() in the chunk-empty case before calling freeze_fast_copy(), otherwise we hit an assert there. But I can do this in another RFR if you want.
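
The sp/bottom model described in the PR summary quoted earlier in this thread can be sketched in a few lines. This is an illustration only: `Chunk`, its field layout, the value of `metadata_words_at_top`, and the exact formula in `argsize()` are assumptions made for the sketch, not the definitions in `stackChunkOopDesc`. It is single-threaded and does not model the concurrent GC reader.

```cpp
#include <cassert>

// Hypothetical, simplified chunk. Names follow the discussion; the layout,
// the constant and the argsize formula are assumptions for illustration.
struct Chunk {
  static constexpr int metadata_words_at_top = 0;  // assumed platform constant

  int stack_size;
  int sp;
  int bottom;  // written only when the bottom frame is frozen

  // In the new model the chunk is empty iff sp and bottom coincide.
  bool is_empty() const { return sp == bottom; }

  // argsize is derived from bottom instead of being stored.
  int argsize() const {
    assert(!is_empty());  // mirrors the assert mentioned in the review
    return stack_size - bottom - metadata_words_at_top;
  }
};

// Thawing the last frame becomes a single store to sp; bottom is untouched,
// so a reader sees either the old or the new sp, each paired with a valid
// bottom.
inline void thaw_last_frame(Chunk& c) { c.sp = c.bottom; }
```

Because only `sp` changes during thawing, there is no second field that a concurrent reader could observe in a stale state.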
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18643#discussion_r1568071510 From kvn at openjdk.org Wed Apr 17 00:56:33 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 17 Apr 2024 00:56:33 GMT Subject: RFR: 8329433: Reduce nmethod header size [v6] In-Reply-To: References: Message-ID: > This is part of changes which try to reduce size of `nmethod` and `codeblob` data vs code in CodeCache. > These changes reduced size of `nmethod` header from 288 to 232 bytes. From 304 to 248 in optimized VM: > > Statistics for 1282 bytecoded nmethods for C2: > total in heap = 5560352 (100%) > header = 389728 (7.009053%) > > vs > > Statistics for 1322 bytecoded nmethods for C2: > total in heap = 8307120 (100%) > header = 327856 (3.946687%) > > > Several unneeded fields in `nmethod` and `CodeBlob` were removed. Some fields were changed from `int` to `int16_t` with added corresponding asserts to make sure their values are fit into 16 bits. > > I did additional cleanup after recent `CompiledMethod` removal. > > Tested tier1-7,stress,xcomp and performance testing. 
Vladimir Kozlov has updated the pull request incrementally with two additional commits since the last revision: - remove trailing space - Shuffle fields initialization ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18768/files - new: https://git.openjdk.org/jdk/pull/18768/files/8405a23d..6a164b11 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18768&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18768&range=04-05 Stats: 158 lines in 3 files changed: 73 ins; 75 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/18768.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18768/head:pull/18768 PR: https://git.openjdk.org/jdk/pull/18768 From jkratochvil at openjdk.org Wed Apr 17 01:09:46 2024 From: jkratochvil at openjdk.org (Jan Kratochvil) Date: Wed, 17 Apr 2024 01:09:46 GMT Subject: RFR: 8261242: [Linux] OSContainer::is_containerized() returns true when run outside a container [v2] In-Reply-To: References: Message-ID: On Tue, 16 Apr 2024 15:17:33 GMT, Severin Gehwolf wrote: > The idea here is to use this property to tune OpenJDK for in-container, specifically k8s, use. In such a setup it's custom to run a single process within set resource constraints. The in-container tuning means to use all the available resources. Containers in the real world have some memory limits set which is where my modified patch still correctly identifies it as a container to use all the available resources of the node which is the whole goal of the container detection code. > In order to do this, we need a reliable way to distinguish that vs. non-containerized setup. I expect it should have been written "We need a reliable way to distinguish real world in-container vs. non-containerized setup. We do not mind behavior for artificial containers on OpenJDK development machines.". Which is what my patch does in an easier and less error-prone way. 
> If somebody really wants to run OpenJDK in a container expecting it to run like a physical OpenJDK deployment, that's when `-XX:-UseContainerSupport` should be used. That behaves still the same with my patch. Could you give a countercase where my patch behaves wrongly? ------------- PR Comment: https://git.openjdk.org/jdk/pull/18201#issuecomment-2060158409 From duke at openjdk.org Wed Apr 17 02:12:01 2024 From: duke at openjdk.org (kuaiwei) Date: Wed, 17 Apr 2024 02:12:01 GMT Subject: RFR: 8325821: [REDO] use "dmb.ishst+dmb.ishld" for release barrier [v7] In-Reply-To: References: Message-ID: On Tue, 16 Apr 2024 14:06:14 GMT, Andrew Haley wrote: > Argh, I found it. It happens because C2 calls `masm->offset()` from `PhaseOutput::fill_buffer()` after every node is emitted. So that trick isn't going to work. > > It was worth a try, but given that C2 expects offset() to be correct after every node, I think we're stuck. Maybe the last idea you had is the best possible without C2 tinkering. got it. I will check if we can make offset() work with fill_buffer. Or I will rollback the change of offset(). ------------- PR Comment: https://git.openjdk.org/jdk/pull/18467#issuecomment-2060210278 From syan at openjdk.org Wed Apr 17 02:13:02 2024 From: syan at openjdk.org (SendaoYan) Date: Wed, 17 Apr 2024 02:13:02 GMT Subject: RFR: 8327946: containers/docker/TestJFREvents.java fails when host kernel config vm.swappiness=0 after JDK-8325139 [v5] In-Reply-To: References: <5Q0X-rxAg9WKCnK-Qluu5hvyffsGwVgGJGRoA8XlBGs=.923c1bf8-e008-4af9-9929-6e5c1f2d5271@github.com> <5CfWlPKTQYn_C-qfExFXR94T9JT3jO78qvXZZm3vFYk=.c9f2152a-a45d-4b20-bb2c-eef314ed53eb@github.com> Message-ID: On Fri, 12 Apr 2024 16:28:32 GMT, Severin Gehwolf wrote: > Looks OK to me. Thanks for the review. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18225#issuecomment-2060211460 From jpai at openjdk.org Wed Apr 17 02:17:01 2024 From: jpai at openjdk.org (Jaikiran Pai) Date: Wed, 17 Apr 2024 02:17:01 GMT Subject: RFR: Merge 33d7127 Message-ID: This brings in the CPU24_04 changes. ------------- Commit messages: - 8322122: Enhance generation of addresses - 8319851: Improve exception logging - 8318340: Improve RSA key implementations - 8315708: Enhance HTTP/2 client usage The merge commit only contains trivial merges, so no merge-specific webrevs have been generated. Changes: https://git.openjdk.org/jdk/pull/18807/files Stats: 182 lines in 19 files changed: 58 ins; 51 del; 73 mod Patch: https://git.openjdk.org/jdk/pull/18807.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18807/head:pull/18807 PR: https://git.openjdk.org/jdk/pull/18807 From amitkumar at openjdk.org Wed Apr 17 03:27:59 2024 From: amitkumar at openjdk.org (Amit Kumar) Date: Wed, 17 Apr 2024 03:27:59 GMT Subject: RFR: 8330008: [s390x] Test bit "in-memory" in case of DiagnoseSyncOnValueBasedClasses In-Reply-To: References: Message-ID: On Wed, 10 Apr 2024 09:58:55 GMT, Amit Kumar wrote: > It's trivial update to use `testbit` method to test the bit "in-memory" @RealLucy would be best if you could take a look at this one as well :-) ------------- PR Comment: https://git.openjdk.org/jdk/pull/18709#issuecomment-2060275888 From iklam at openjdk.org Wed Apr 17 03:50:28 2024 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 17 Apr 2024 03:50:28 GMT Subject: RFR: 8323900: Avoid calling os::init_random() in CDS static dump [v2] In-Reply-To: References: Message-ID: > The purpose of the PR is to avoid modifying the global JVM state while dumping the CDS archive. > > When updating the identity hashcode for archived Symbols, call `ArchiveBuilder::current()->entropy()` instead of `os::random()`. As a result, CDS no longer needs to call `os::init_random()` with a deterministic seed. 
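
The idea of a dump-local entropy source can be illustrated with a small sketch. `LocalEntropy` is a hypothetical class, and the Park-Miller "minimal standard" step is chosen here only as an example generator; the PR itself only states that `ArchiveBuilder::current()->entropy()` replaces `os::random()`. The point is that each instance owns its seed, so its sequence does not depend on, or disturb, any global generator state.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical builder-local entropy source. The generator step
// (seed * 16807 mod 2^31-1) is an assumption chosen for illustration.
class LocalEntropy {
 public:
  // The seed must be nonzero, otherwise the sequence stays at zero.
  explicit LocalEntropy(uint32_t seed) : _seed(seed) {}

  uint32_t next() {
    const uint64_t product = static_cast<uint64_t>(_seed) * 16807u;
    _seed = static_cast<uint32_t>(product % 2147483647u);
    return _seed;
  }

 private:
  uint32_t _seed;  // private to this instance; no global state is touched
};
```

Two instances created with the same seed yield the same sequence regardless of how often any other instance is advanced, which is what makes the archived values reproducible.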
Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: - Merge branch 'master' into 8323900-avoid-os-init-random-in-static-cds-dump - 8323900: Avoid calling os::init_random() in CDS static dump ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18728/files - new: https://git.openjdk.org/jdk/pull/18728/files/23883dcb..7e252cea Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18728&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18728&range=00-01 Stats: 13732 lines in 395 files changed: 8260 ins; 2724 del; 2748 mod Patch: https://git.openjdk.org/jdk/pull/18728.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18728/head:pull/18728 PR: https://git.openjdk.org/jdk/pull/18728 From fyang at openjdk.org Wed Apr 17 03:51:59 2024 From: fyang at openjdk.org (Fei Yang) Date: Wed, 17 Apr 2024 03:51:59 GMT Subject: RFR: 8330266: RISC-V: Restore frm to RoundingMode::rne after JNI [v2] In-Reply-To: <_7lj0atNT-yDCUZE15g6TqGzQK8WrF9e2Vsn9AX-c_E=.041d78ba-74d3-4b52-8176-c4ad5eaaeed5@github.com> References: <6e_QPv6LVVN19HkQrQ2DyB_sXxqGqgwnclI2StdkeaY=.0537b78c-c2c1-4341-a914-f40f7117d72e@github.com> <0Y-y04yv-vcYPW5lYbwQbCNEmuwva8vCqLLBEZw9bs8=.141dedbc-b2c7-40bf-ad08-744720dca8ec@github.com> <9LU3bHt5W2Fdr2dfnf8xJpPlgVN0yDTgI7Um3g8ymF4=.1363434f-190e-4615-ac59-bf2e7349f831@github.com> <_MmpvQxke00mSc2ekq84E10CiwoLY6j37YJIFWP4PeI=.8d938434-4cd7-4a17-9862-02fc10810ecc@github.com> <_c-b9PZXvD0GM3Oamxgc7BBTarOS19Dubq_y9GYJb9M=.90899a76-dafc-44e7-a0ed-7ad7368e9a77@github.com> <_7lj0atNT-yDCUZE15g6TqGzQK8WrF9e2Vs n9AX-c_E=.041d78ba-74d3-4b52-8176-c4ad5eaaeed5@github.com> Message-ID: On Tue, 16 Apr 2024 11:18:58 GMT, Vladimir Kempik wrote: >> Great to see there is no performance degradation on the current generation of hardware. 
That's a great motivator to get this change in with the branch as we already know that the branch is going to bring a performance improvement on next/other generations of hardware.
>
> Even when showing the middle finger to the branch predictor:
>
> unsigned char prbarray[512];
> static int setupme()
> {
>     srand(time(NULL));
>     for(int i = 0; i < 512; i++)
>     {
>         int r = rand()%15;
>         prbarray[i]= (r > 10) ? (r-10) : 0;
>     }
>
>     return 1;
> }
>
> EXPORT void func() {
>     static int setup = 0;
>     static unsigned long counter = 0;
>     if(!setup) setup = setupme();
>     int x = prbarray[counter++%512];
>     asm
>     ("csrw fcsr,%0\n\t"
>      :
>      : "r" (x)
>     );
> }
>
>
> results are still equal between branchless and branchful versions
>
> I'm ok with this change, but let's wait for yet another opinion: do we accept such "future proof" changes or not?

I don't have a strong opinion on this. This code will only be enabled by the user on the command line in case we are calling some buggy external libraries which may corrupt the FP control register. It won't make a difference for most normal cases. My local tests show that `frrm` is 3x faster than `fsrmi` on sifive/u74. So it does make sense to have the branch check on platforms like this. It should not be a big issue for other, more advanced platforms with a good branch predictor.

Suggestion: `beqz(tmp, skip_fsrmi); // Only reset FRM if it's wrong`

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18785#discussion_r1568178068

From fyang at openjdk.org Wed Apr 17 03:58:41 2024
From: fyang at openjdk.org (Fei Yang)
Date: Wed, 17 Apr 2024 03:58:41 GMT
Subject: RFR: 8330094: RISC-V: Save and restore FCSR in the call stub [v3]
In-Reply-To:
References:
Message-ID:

On Mon, 15 Apr 2024 16:30:14 GMT, Hamlin Li wrote:

>> Hi,
>> Can you help to review this patch?
>> As discussed at https://github.com/openjdk/jdk/pull/17745#discussion_r1558783467, we should do a similar thing to [JDK-8319973](https://bugs.openjdk.org/browse/JDK-8319973) on aarch64.
>> Thanks! >> >> Tests running ... > > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > refine code src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 304: > 302: __ frrm(t0); > 303: __ sd(t0, frm_save); > 304: // Set fcsr to the state we need. We do want Round to Nearest. We I think it will be more accurate to mention `frm` instead of `fcsr` both in code comment and the JBS title? src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 308: > 306: Label skip_fsrmi; > 307: guarantee(__ RoundingMode::rne == 0, "must be"); > 308: __ beq(t0, zr, skip_fsrmi); Suggestion: `__ beqz(t0, skip_fsrmi);` ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18758#discussion_r1568181960 PR Review Comment: https://git.openjdk.org/jdk/pull/18758#discussion_r1568180315 From iklam at openjdk.org Wed Apr 17 05:34:02 2024 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 17 Apr 2024 05:34:02 GMT Subject: RFR: 8323900: Avoid calling os::init_random() in CDS static dump In-Reply-To: References: Message-ID: On Fri, 12 Apr 2024 05:53:21 GMT, Thomas Stuefe wrote: >>> Thinking about this, since global entropy (archived object ihashes) sneak into archives whether we use local seeds or not, maybe we should not bother with such a patch. >>> >>> In other words, if global state affects the archive anyway, we may just as well roll with it. >>> >>> See #18735 >> >> In CDS, we intend to be as much independent of the global JVM state as possible. For example, since [JDK-8296344](https://bugs.openjdk.org/browse/JDK-8296344), we no longer make a copy of the the archived heap objects in the actual Java heap. >> >> The intention of this PR is the same -- the contents of archived Symbols should not depend on the value of the os::random() seed. >> >> In #18735 you found that some other contents of the CDS archive depend on the JVM's os::random() seed. That may be something we want to fix separately. 
In any case, that's not a reason to not proceed with this PR. > >> > Thinking about this, since global entropy (archived object ihashes) sneak into archives whether we use local seeds or not, maybe we should not bother with such a patch. >> > In other words, if global state affects the archive anyway, we may just as well roll with it. >> > See #18735 >> >> In CDS, we intend to be as much independent of the global JVM state as possible. For example, since [JDK-8296344](https://bugs.openjdk.org/browse/JDK-8296344), we no longer make a copy of the the archived heap objects in the actual Java heap. >> >> The intention of this PR is the same -- the contents of archived Symbols should not depend on the value of the os::random() seed. >> >> In #18735 you found that some other contents of the CDS archive depend on the JVM's os::random() seed. That may be something we want to fix separately. In any case, that's not a reason to not proceed with this PR. > > Okay. Thinking about this, an isolated seed is always better, since it provides safety against concurrent uses of os::random (which can happen even at initialization time). > > So my approval stands. Thanks @tstuefe and @calvinccheung for the review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18728#issuecomment-2060384869 From iklam at openjdk.org Wed Apr 17 05:34:02 2024 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 17 Apr 2024 05:34:02 GMT Subject: Integrated: 8323900: Avoid calling os::init_random() in CDS static dump In-Reply-To: References: Message-ID: On Wed, 10 Apr 2024 16:31:08 GMT, Ioi Lam wrote: > The purpose of the PR is to avoid modifying the global JVM state while dumping the CDS archive. > > When updating the identity hashcode for archived Symbols, call `ArchiveBuilder::current()->entropy()` instead of `os::random()`. As a result, CDS no longer needs to call `os::init_random()` with a deterministic seed. This pull request has now been integrated. 
Changeset: 2fe2f3af Author: Ioi Lam URL: https://git.openjdk.org/jdk/commit/2fe2f3aff82f41a3b7942861e29ccbd3bcc68661 Stats: 23 lines in 4 files changed: 14 ins; 6 del; 3 mod 8323900: Avoid calling os::init_random() in CDS static dump Reviewed-by: stuefe, ccheung ------------- PR: https://git.openjdk.org/jdk/pull/18728 From aboldtch at openjdk.org Wed Apr 17 05:41:26 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Wed, 17 Apr 2024 05:41:26 GMT Subject: RFR: 8330253: Skip verify_consistent_lock_order when deoptimizing from monitorenter bytecode. [v2] In-Reply-To: References: Message-ID: > The verification added in [JDK-8329757](https://bugs.openjdk.org/browse/JDK-8329757) will not work then deoptimization occurs on a monitorenter bytecode. The locking may be in a transitional state. This patch will skip the verification when this occurs. > > Currently have only seen this reproduce with JVMTI when deoptimization occurs while a java thread is waiting on a contended monitor. However this could potentially be triggered from a VM entry slow path, so simply checking `current_pending_monitor` could be flaky as well. So instead simply avoid verification. > > Running JVMTI reproducer. Starting full testing soon. 
Axel Boldt-Christmas has updated the pull request incrementally with two additional commits since the last revision:

 - Handle previous bc being monitorenter
 - Remove implicit conditions

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/18782/files
  - new: https://git.openjdk.org/jdk/pull/18782/files/4e00138a..03a4e045

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=18782&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18782&range=00-01

  Stats: 6 lines in 1 file changed: 4 ins; 0 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/18782.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18782/head:pull/18782

PR: https://git.openjdk.org/jdk/pull/18782

From lucy at openjdk.org Wed Apr 17 06:30:01 2024
From: lucy at openjdk.org (Lutz Schmidt)
Date: Wed, 17 Apr 2024 06:30:01 GMT
Subject: RFR: 8330008: [s390x] Test bit "in-memory" in case of DiagnoseSyncOnValueBasedClasses
In-Reply-To:
References:
Message-ID:

On Wed, 10 Apr 2024 09:58:55 GMT, Amit Kumar wrote:

> It's a trivial update to use the `testbit` method to test the bit "in-memory"

LGTM.

-------------

Marked as reviewed by lucy (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/18709#pullrequestreview-2005147495

From amitkumar at openjdk.org Wed Apr 17 06:39:01 2024
From: amitkumar at openjdk.org (Amit Kumar)
Date: Wed, 17 Apr 2024 06:39:01 GMT
Subject: RFR: 8330008: [s390x] Test bit "in-memory" in case of DiagnoseSyncOnValueBasedClasses
In-Reply-To:
References:
Message-ID:

On Wed, 10 Apr 2024 09:58:55 GMT, Amit Kumar wrote:

> It's a trivial update to use the `testbit` method to test the bit "in-memory"

This one also seems to be a trivial change; should I wait for another review, or integrate?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18709#issuecomment-2060482794

From stuefe at openjdk.org Wed Apr 17 06:50:03 2024
From: stuefe at openjdk.org (Thomas Stuefe)
Date: Wed, 17 Apr 2024 06:50:03 GMT
Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v34]
In-Reply-To:
References:
Message-ID: <4bHEEa5QHK6rN2pH6ftWYE---OlmvQNQ_FjJsfweYjI=.ad832040-2d3c-4847-ba1f-33b9d8cf8c9f@github.com>

On Tue, 16 Apr 2024 14:19:25 GMT, Johan Sjölen wrote:

>> Hi,
>>
>> This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix.
>>
>> ## `MemoryFileTracker`
>>
>> The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is:
>>
>> ```c++
>> static MemoryFile* make_device(const char* descriptive_name);
>> static void free_device(MemoryFile* device);
>>
>> static void allocate_memory(MemoryFile* device, size_t offset, size_t size,
>>                             MEMFLAGS flag, const NativeCallStack& stack);
>> static void free_memory(MemoryFile* device, size_t offset, size_t size);
>> ```
>>
>> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does:
>>
>> ```c++
>> void ZNMT::reserve(zaddress_unsafe start, size_t size) {
>>   MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap);
>> }
>> void ZNMT::commit(zoffset offset, size_t size) {
>>   MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC);
>> }
>> void ZNMT::uncommit(zoffset offset, size_t size) {
>>   MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size);
>> }
>>
>> void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) {
>>   // NMT doesn't track mappings at the moment.
>> }
>> void ZNMT::unmap(zaddress_unsafe addr, size_t size) {
>>   // NMT doesn't track mappings at the moment.
>> }
>> ```
>>
>> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory.
>>
>> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are:
>>
>> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. Our Treap-based approach in this patch gives a performance bo...
>
> Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision:
>
>   Style, copyright

Looked closer at the treap. I think we need really good gtests for this.

src/hotspot/share/nmt/nmtTreap.hpp line 53:

> 51:   const K _key;
> 52:   V _value;
> 53:   using Nd = TreapNode;

Can we call this "Node" please :), and nd_pair node_pair?

src/hotspot/share/nmt/nmtTreap.hpp line 66:

> 64:     LEQ // <=
> 65:   };
> 66:

Both split and merge use recursion, which worries me, since it is not limited in any way I can see. And I think we can rely on the compiler doing tail recursion optimization. Could a degenerated tree cause stack overflows? Can we do this without recursion?

src/hotspot/share/nmt/nmtTreap.hpp line 142:

> 140:     }
> 141:   }
> 142:

Allocating: Why do we need to pass in allocators and deallocators as arguments? The only reason I can see is to have per-instance allocators that use tree-local variables. But why does it need access to the tree instance it works for at all?
The only thing I see it accesses from the treap is the random seed. That should be using a global seed var, and os::next_random. A global seed var would be perfectly fine in this case, I think. The treap is only accessed under lock protection, right? Even if not, os::random() (the variant using a global seed with CAS) would be a valid alternative.

(Arguably, initializing the rng seed is also not the job of an allocator, but of the function calling the allocator.)

So, a simplification would be to have the allocator and deallocator as class template parameters for the treapnode or for the treap. That would also get rid of the many duplicate definitions of free functions that all call os::free, for instance.

A further simplification would be to have just one Allocator template parameter giving me an allocation and a deallocation function, since they come and work in pairs.

A further simplification would be to have these actually be part of the tree, not the tree node: I prefer the individual nodes in a tree or list or similar to be dumb data holders, and for the containing tree to hold the tree-related logic of merging, inserting etc. I think that is the more natural distribution of responsibilities. It also simplifies things that are done on a per-tree basis, like accounting. It avoids having to marshall a lot of calls (e.g. Tree->upsert just delegates to TreeNode->upsert).

src/hotspot/share/nmt/nmtTreap.hpp line 187:

> 185: free(head);
> 186: }
> 187: return nullptr;

? Why does this return anything?

src/hotspot/share/nmt/nmtTreap.hpp line 190:

> 188: }
> 189: };
> 190:

TreapCHeap: I dislike having all tree logic in a tree class that is already bound to a specific allocation pattern. I may want to reuse the tree with different allocators, e.g. to avoid malloc. Another reason to make the allocator a template parameter of the tree.
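To make the single-Allocator-parameter idea concrete, here is a minimal sketch. It is not code from the PR: `MallocAllocator`, `CountingAllocator` and `SimpleTree` are made-up names, plain `malloc`/`free` stand in for whatever the VM would use, and the container is an unbalanced BST rather than a treap, since only the place where the allocator plugs in matters here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical allocator policy: allocation and deallocation come as a pair.
struct MallocAllocator {
  static void* allocate(std::size_t size) { return std::malloc(size); }
  static void deallocate(void* p) { std::free(p); }
};

// Hypothetical second policy that also counts live blocks, e.g. for tests
// or per-tree accounting.
struct CountingAllocator {
  static int& live() { static int count = 0; return count; }
  static void* allocate(std::size_t size) { ++live(); return std::malloc(size); }
  static void deallocate(void* p) { --live(); std::free(p); }
};

// Nodes are dumb data holders; all logic, including allocation, lives in the tree.
template <typename K, typename V, typename Allocator = MallocAllocator>
class SimpleTree {
  struct Node {
    K key;
    V value;
    Node* left;
    Node* right;
  };

  Node* _root;
  std::size_t _size;

public:
  SimpleTree() : _root(nullptr), _size(0) {}
  SimpleTree(const SimpleTree&) = delete;
  SimpleTree& operator=(const SimpleTree&) = delete;

  ~SimpleTree() {
    // Iterative teardown: rotate left children up until the current node has
    // none, then free it and continue with its right child.
    Node* n = _root;
    while (n != nullptr) {
      if (n->left != nullptr) {
        Node* l = n->left;
        n->left = l->right;
        l->right = n;
        n = l;
      } else {
        Node* r = n->right;
        n->~Node();
        Allocator::deallocate(n);
        n = r;
      }
    }
  }

  // Insert the key, or overwrite the value if the key already exists.
  void upsert(const K& key, const V& value) {
    Node** link = &_root;
    while (*link != nullptr) {
      Node* n = *link;
      if (key < n->key) {
        link = &n->left;
      } else if (n->key < key) {
        link = &n->right;
      } else {
        n->value = value;
        return;
      }
    }
    void* mem = Allocator::allocate(sizeof(Node));
    assert(mem != nullptr && "sketch: no out-of-memory handling");
    *link = new (mem) Node{key, value, nullptr, nullptr};
    ++_size;
  }

  // Returns a pointer to the stored value, or nullptr if the key is absent.
  const V* find(const K& key) const {
    const Node* n = _root;
    while (n != nullptr) {
      if (key < n->key) {
        n = n->left;
      } else if (n->key < key) {
        n = n->right;
      } else {
        return &n->value;
      }
    }
    return nullptr;
  }

  std::size_t size() const { return _size; }
};
```

With this shape, `SimpleTree<int, int>` uses malloc, while `SimpleTree<int, int, CountingAllocator>` reuses the same tree logic with a different allocation scheme, and no per-call allocator or deallocator arguments are needed.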
src/hotspot/share/nmt/nmtTreap.hpp line 196: > 194: friend class VMATreeTest; > 195: using CTreap = TreapNode; > 196: CTreap* root; What is a CTreap? To my mind, the treap is the tree, and a node in that tree is a treapnode. Can we not just call CTreap a node? And maybe use the same name we used above in TreapNode? What meaning has the C? src/hotspot/share/nmt/nmtTreap.hpp line 197: > 195: using CTreap = TreapNode; > 196: CTreap* root; > 197: uint64_t prng_seed; Leading underscore for member vars? src/hotspot/share/nmt/nmtTreap.hpp line 213: > 211: static const uint64_t PrngModMask = (static_cast(1) << PrngModPower) - 1; > 212: prng_seed = (PrngMult * prng_seed + PrngAdd) & PrngModMask; > 213: return prng_seed; Do we need our own random generator here? Why not use os::next_random? ------------- PR Review: https://git.openjdk.org/jdk/pull/18289#pullrequestreview-2005033795 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1568283280 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1568301951 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1568268168 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1568304592 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1568295979 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1568230895 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1568228678 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1568229090 From stuefe at openjdk.org Wed Apr 17 06:50:03 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 17 Apr 2024 06:50:03 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v5] In-Reply-To: References: <-XAziSwGMo20pUAnbdRW1JUk_0ZB-80RVfAHr0iuewE=.bff8f2f7-01e2-46eb-bd4b-1b16fccc6aa1@github.com> <3al4DjsRcIX_qJZNbTGqBDIAOj4bU5l8xpYPHQE8cNM=.7cc0bdfe-c9c8-46ce-ad42-397c61b5a603@github.com> 
<0G_oRg-MB6aRKXpHJ4ca8lIQ72ZhsA2WBujtJ8BQaD0=.bbbc53c2-cb49-4051-998e-e9e48e4ea516@github.com>
Message-ID:

On Mon, 25 Mar 2024 10:10:34 GMT, Johan Sjölen wrote:

>> I tend to not use operator-overloading so I didn't think of that as a possibility.
>
> One annoying part about such a design is that you can't use pointers to values as keys, they must now be wrapped within their own type (like `StackIndex` does). Yes, there's a bit of a smell of YAGNI here, but at least `std::set` agrees with me on `Compare` being a template argument.

Could the comparator be a functor then? Something with a static compare function?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1568227233

From stuefe at openjdk.org Wed Apr 17 06:50:03 2024
From: stuefe at openjdk.org (Thomas Stuefe)
Date: Wed, 17 Apr 2024 06:50:03 GMT
Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v34]
In-Reply-To: <4bHEEa5QHK6rN2pH6ftWYE---OlmvQNQ_FjJsfweYjI=.ad832040-2d3c-4847-ba1f-33b9d8cf8c9f@github.com>
References: <4bHEEa5QHK6rN2pH6ftWYE---OlmvQNQ_FjJsfweYjI=.ad832040-2d3c-4847-ba1f-33b9d8cf8c9f@github.com>
Message-ID:

On Wed, 17 Apr 2024 06:39:20 GMT, Thomas Stuefe wrote:

>> Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Style, copyright
>
> src/hotspot/share/nmt/nmtTreap.hpp line 66:
>
>> 64: LEQ // <=
>> 65: };
>> 66:
>
> Both split and merge use recursion, which worries me, since it is not limited in any way I can see. And I think we can rely on the compiler doing tail recursion optimization. Could a degenerated tree cause stack overflows? Can we do this without recursion?

Hint: test with a rng that returns +1 of last return. Let's see how the tree copes with that.
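A rough sketch of what such a test could look like, assuming a textbook max-heap treap with rotation-based insert rather than the PR's split/merge code. `Node`, `CounterRng`, `insert`, `height` and `height_after_sorted_inserts` are names invented for this illustration. With sorted keys and a "random" source that just counts up, every new node becomes the root and the old tree hangs off its left child, so the height equals the number of nodes.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <utility>
#include <vector>

struct Node {
  int key;
  std::uint64_t prio;
  Node* left;
  Node* right;
};

// Deliberately bad "RNG": every call returns the previous value plus one.
struct CounterRng {
  std::uint64_t last = 0;
  std::uint64_t operator()() { return ++last; }
};

Node* rotate_right(Node* n) {
  Node* l = n->left;
  n->left = l->right;
  l->right = n;
  return l;
}

Node* rotate_left(Node* n) {
  Node* r = n->right;
  n->right = r->left;
  r->left = n;
  return r;
}

// Textbook treap insert: BST insert, then rotate the new node up while its
// priority is larger than its parent's. Duplicate keys are ignored.
template <typename Rng>
Node* insert(Node* root, int key, Rng& rng) {
  if (root == nullptr) {
    return new Node{key, static_cast<std::uint64_t>(rng()), nullptr, nullptr};
  }
  if (key < root->key) {
    root->left = insert(root->left, key, rng);
    if (root->left->prio > root->prio) root = rotate_right(root);
  } else if (key > root->key) {
    root->right = insert(root->right, key, rng);
    if (root->right->prio > root->prio) root = rotate_left(root);
  }
  return root;
}

// Height measured with an explicit stack, so that a degenerate tree cannot
// overflow the call stack of the measurement itself.
int height(Node* root) {
  int max_depth = 0;
  std::vector<std::pair<Node*, int>> todo;
  if (root != nullptr) todo.push_back({root, 1});
  while (!todo.empty()) {
    std::pair<Node*, int> cur = todo.back();
    todo.pop_back();
    if (cur.second > max_depth) max_depth = cur.second;
    if (cur.first->left != nullptr) todo.push_back({cur.first->left, cur.second + 1});
    if (cur.first->right != nullptr) todo.push_back({cur.first->right, cur.second + 1});
  }
  return max_depth;
}

// Frees all nodes, again without recursion.
void destroy(Node* root) {
  std::vector<Node*> todo;
  if (root != nullptr) todo.push_back(root);
  while (!todo.empty()) {
    Node* n = todo.back();
    todo.pop_back();
    if (n->left != nullptr) todo.push_back(n->left);
    if (n->right != nullptr) todo.push_back(n->right);
    delete n;
  }
}

// Inserts keys 0..n-1 in ascending order and reports the resulting height.
template <typename Rng>
int height_after_sorted_inserts(int n, Rng& rng) {
  assert(n >= 0);
  Node* root = nullptr;
  for (int key = 0; key < n; key++) {
    root = insert(root, key, rng);
  }
  int h = height(root);
  destroy(root);
  return h;
}
```

Calling `height_after_sorted_inserts(1000, rng)` with a `CounterRng` yields a chain, while a `std::mt19937_64` should give a height that is a small multiple of log2(n). A gtest could compare the two, or assert an upper bound on the height for the real priority source.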
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1568302894 From ayang at openjdk.org Wed Apr 17 06:58:08 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 17 Apr 2024 06:58:08 GMT Subject: RFR: 8330463: Rename invalidate to write_region in ModRefBarrierSet Message-ID: Simple renaming of a barrier-set API. ------------- Commit messages: - write-region-api Changes: https://git.openjdk.org/jdk/pull/18808/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18808&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8330463 Stats: 18 lines in 7 files changed: 0 ins; 6 del; 12 mod Patch: https://git.openjdk.org/jdk/pull/18808.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18808/head:pull/18808 PR: https://git.openjdk.org/jdk/pull/18808 From mli at openjdk.org Wed Apr 17 07:08:27 2024 From: mli at openjdk.org (Hamlin Li) Date: Wed, 17 Apr 2024 07:08:27 GMT Subject: RFR: 8330094: RISC-V: Save and restore FRM in the call stub [v4] In-Reply-To: References: Message-ID: > Hi, > Can you help to review this patch? > As discussed at https://github.com/openjdk/jdk/pull/17745#discussion_r1558783467, we should do the similar thing as [JDK-8319973](https://bugs.openjdk.org/browse/JDK-8319973) on aarch64. > Thanks! > > Tests running ... 
Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: fix comment; minor refinement ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18758/files - new: https://git.openjdk.org/jdk/pull/18758/files/7da4b991..11522488 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18758&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18758&range=02-03 Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/18758.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18758/head:pull/18758 PR: https://git.openjdk.org/jdk/pull/18758 From mli at openjdk.org Wed Apr 17 07:13:44 2024 From: mli at openjdk.org (Hamlin Li) Date: Wed, 17 Apr 2024 07:13:44 GMT Subject: RFR: 8330094: RISC-V: Save and restore FRM in the call stub [v3] In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 03:55:34 GMT, Fei Yang wrote: >> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: >> >> refine code > > src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 304: > >> 302: __ frrm(t0); >> 303: __ sd(t0, frm_save); >> 304: // Set fcsr to the state we need. We do want Round to Nearest. We > > I think it will be more accurate to mention `frm` instead of `fcsr` in both code comment and the JBS title? 
agree, fixed > src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 308: > >> 306: Label skip_fsrmi; >> 307: guarantee(__ RoundingMode::rne == 0, "must be"); >> 308: __ beq(t0, zr, skip_fsrmi); > > Suggestion: `__ beqz(t0, skip_fsrmi);` fixed, thanks ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18758#discussion_r1568338049 PR Review Comment: https://git.openjdk.org/jdk/pull/18758#discussion_r1568338079 From mli at openjdk.org Wed Apr 17 07:14:10 2024 From: mli at openjdk.org (Hamlin Li) Date: Wed, 17 Apr 2024 07:14:10 GMT Subject: RFR: 8330266: RISC-V: Restore frm to RoundingMode::rne after JNI [v3] In-Reply-To: <6e_QPv6LVVN19HkQrQ2DyB_sXxqGqgwnclI2StdkeaY=.0537b78c-c2c1-4341-a914-f40f7117d72e@github.com> References: <6e_QPv6LVVN19HkQrQ2DyB_sXxqGqgwnclI2StdkeaY=.0537b78c-c2c1-4341-a914-f40f7117d72e@github.com> Message-ID: > Hi, > Can you help to review this patch? > As discussed at: https://github.com/openjdk/jdk/pull/18758#pullrequestreview-1999982333, we'd better to do it. 
> Similar thing is done on aarch64, https://bugs.openjdk.org/browse/JDK-8320892 > > Thanks Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: minor refinement ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18785/files - new: https://git.openjdk.org/jdk/pull/18785/files/59a488d7..68964755 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18785&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18785&range=01-02 Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/18785.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18785/head:pull/18785 PR: https://git.openjdk.org/jdk/pull/18785 From mli at openjdk.org Wed Apr 17 07:14:10 2024 From: mli at openjdk.org (Hamlin Li) Date: Wed, 17 Apr 2024 07:14:10 GMT Subject: RFR: 8330266: RISC-V: Restore frm to RoundingMode::rne after JNI [v2] In-Reply-To: References: <6e_QPv6LVVN19HkQrQ2DyB_sXxqGqgwnclI2StdkeaY=.0537b78c-c2c1-4341-a914-f40f7117d72e@github.com> <0Y-y04yv-vcYPW5lYbwQbCNEmuwva8vCqLLBEZw9bs8=.141dedbc-b2c7-40bf-ad08-744720dca8ec@github.com> <9LU3bHt5W2Fdr2dfnf8xJpPlgVN0yDTgI7Um3g8ymF4=.1363434f-190e-4615-ac59-bf2e7349f831@github.com> <_MmpvQxke00mSc2ekq84E10CiwoLY6j37YJIFWP4PeI=.8d938434-4cd7-4a17-9862-02fc10810ecc@github.com> <_c-b9PZXvD0GM3Oamxgc7BBTarOS19Dubq_y9GYJb9M=.90899a76-dafc-44e7-a0ed-7ad7368e9a77@github.com> <_7lj0atNT-yDCUZE15g6TqGzQK8WrF9e2Vs n9AX-c_E=.041d78ba-74d3-4b52-8176-c4ad5eaaeed5@github.com> Message-ID: On Wed, 17 Apr 2024 03:47:31 GMT, Fei Yang wrote: > Suggesion: beqz(tmp, skip_fsrmi); // Only reset FRM if it's wrong fixed, thanks! 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18785#discussion_r1568337193 From mli at openjdk.org Wed Apr 17 07:14:10 2024 From: mli at openjdk.org (Hamlin Li) Date: Wed, 17 Apr 2024 07:14:10 GMT Subject: RFR: 8330266: RISC-V: Restore frm to RoundingMode::rne after JNI [v2] In-Reply-To: References: <6e_QPv6LVVN19HkQrQ2DyB_sXxqGqgwnclI2StdkeaY=.0537b78c-c2c1-4341-a914-f40f7117d72e@github.com> <0Y-y04yv-vcYPW5lYbwQbCNEmuwva8vCqLLBEZw9bs8=.141dedbc-b2c7-40bf-ad08-744720dca8ec@github.com> <9LU3bHt5W2Fdr2dfnf8xJpPlgVN0yDTgI7Um3g8ymF4=.1363434f-190e-4615-ac59-bf2e7349f831@github.com> <_MmpvQxke00mSc2ekq84E10CiwoLY6j37YJIFWP4PeI=.8d938434-4cd7-4a17-9862-02fc10810ecc@github.com> <_c-b9PZXvD0GM3Oamxgc7BBTarOS19Dubq_y9GYJb9M=.90899a76-dafc-44e7-a0ed-7ad7368e9a77@github.com> <_7lj0atNT-yDCUZE15g6TqGzQK8WrF9e2Vs n9AX-c_E=.041d78ba-74d3-4b52-8176-c4ad5eaaeed5@github.com> Message-ID: On Wed, 17 Apr 2024 07:10:47 GMT, Hamlin Li wrote: >> I don't have a strong opinion on this. This code will only be enabled by the user on command line in case we are calling some buggy external libraries which may corrupt the FP control register. It won't make a difference for the most normal cases. My local tests show that `frrm` is 3x faster than `fsrmi` on sifive/u74. So it does make sense to have the branch check on platforms like this. It should not be a big issue to other more advanced platforms with a good BranchPredictor. >> >> Suggesion: `beqz(tmp, skip_fsrmi); // Only reset FRM if it's wrong` > >> Suggesion: beqz(tmp, skip_fsrmi); // Only reset FRM if it's wrong > > fixed, thanks! Thanks everyone for testing and discussion! 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18785#discussion_r1568337333 From jsjolen at openjdk.org Wed Apr 17 08:07:02 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Wed, 17 Apr 2024 08:07:02 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v34] In-Reply-To: <4bHEEa5QHK6rN2pH6ftWYE---OlmvQNQ_FjJsfweYjI=.ad832040-2d3c-4847-ba1f-33b9d8cf8c9f@github.com> References: <4bHEEa5QHK6rN2pH6ftWYE---OlmvQNQ_FjJsfweYjI=.ad832040-2d3c-4847-ba1f-33b9d8cf8c9f@github.com> Message-ID: On Wed, 17 Apr 2024 06:41:55 GMT, Thomas Stuefe wrote: >> Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: >> >> Style, copyright > > src/hotspot/share/nmt/nmtTreap.hpp line 187: > >> 185: free(head); >> 186: } >> 187: return nullptr; > > ? Why does this return anything It returns the resulting `TreapNode*`. A `TreapNode*` which is `nullptr` is an empty treap. > src/hotspot/share/nmt/nmtTreap.hpp line 196: > >> 194: friend class VMATreeTest; >> 195: using CTreap = TreapNode; >> 196: CTreap* root; > > What is a CTreap? To my mind, the treap is the tree, and a node in that tree is a treapnode. Can we not just call CTreap a node? And maybe use the same name we used above in TreapNode? > > What meaning has the C? It's C-Heap allocated :-). Doesn't matter, just wanted a name that connects to the class name and the C sticks out. Changed to Node. > src/hotspot/share/nmt/nmtTreap.hpp line 197: > >> 195: using CTreap = TreapNode; >> 196: CTreap* root; >> 197: uint64_t prng_seed; > > Leading underscore for member vars? Yes! Thanks. > src/hotspot/share/nmt/nmtTreap.hpp line 213: > >> 211: static const uint64_t PrngModMask = (static_cast(1) << PrngModPower) - 1; >> 212: prng_seed = (PrngMult * prng_seed + PrngAdd) & PrngModMask; >> 213: return prng_seed; > > Do we need our own random generator here? Why not use os::next_random? TIL, I wasn't aware of that function. 
I specifically wanted something where each `Treap` stores its own seed/RNG state.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1568401643
PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1568405212
PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1568402347
PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1568403841

From jsjolen at openjdk.org Wed Apr 17 08:18:03 2024
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Wed, 17 Apr 2024 08:18:03 GMT
Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v34]
In-Reply-To:
References: <4bHEEa5QHK6rN2pH6ftWYE---OlmvQNQ_FjJsfweYjI=.ad832040-2d3c-4847-ba1f-33b9d8cf8c9f@github.com>
Message-ID:

On Wed, 17 Apr 2024 06:40:19 GMT, Thomas Stuefe wrote:

>> src/hotspot/share/nmt/nmtTreap.hpp line 66:
>>
>>> 64: LEQ // <=
>>> 65: };
>>> 66:
>>
>> Both split and merge use recursion, which worries me, since it is not limited in any way I can see. And I think we can rely on the compiler doing tail recursion optimization. Could a degenerated tree cause stack overflows? Can we do this without recursion?
>
> Hint: test with a rng that returns +1 of last return. Let's see how the tree copes with that.

The acceptability of recursion fully depends on the tree being balanced as this leads to a call stack depth of O(log n). split and merge have recursive self-calls in non-tail positions, so I doubt that this will be optimized out.

A treap with a good RNG cannot be worse than something like `4*log_2(n)`.

I see two ways forward:

1. There are iterative ways of creating a Treap, we could use those. It's a bit more work. I'll have to do research.

2. Reify the callstack as a heap-allocated linked list of activation frames and run the code on that. This modifies the linear code to pushing activation frames onto a stack and a bit.

I'll get back here with an example of what that looks like.
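As an illustration of the first option only, and independent of the PR's actual code, split and merge can be written as loops that keep a pointer to the "hole" where the next node has to be attached. This is a sketch under assumptions: `Node`, `split`, `merge` and `insert` are names chosen here, keys are plain `int`s, priorities form a max-heap, and all keys in `merge`'s first argument are smaller than those in its second.

```cpp
#include <cassert>
#include <cstdint>

struct Node {
  int key;
  std::uint64_t prio;
  Node* left;
  Node* right;
};

// Splits `root` into `left` (keys < key) and `right` (keys >= key).
// Walks one root-to-leaf path; no recursion, constant extra space.
void split(Node* root, int key, Node*& left, Node*& right) {
  Node** left_hole = &left;    // where the next "< key" node gets attached
  Node** right_hole = &right;  // where the next ">= key" node gets attached
  Node* cur = root;
  while (cur != nullptr) {
    if (cur->key < key) {
      *left_hole = cur;
      left_hole = &cur->right;
      cur = cur->right;
    } else {
      *right_hole = cur;
      right_hole = &cur->left;
      cur = cur->left;
    }
  }
  *left_hole = nullptr;
  *right_hole = nullptr;
}

// Merges two treaps where every key in `a` is smaller than every key in `b`.
// The node with the larger priority is attached next, preserving heap order.
Node* merge(Node* a, Node* b) {
  Node* result = nullptr;
  Node** hole = &result;
  while (a != nullptr && b != nullptr) {
    if (a->prio > b->prio) {
      *hole = a;
      hole = &a->right;
      a = a->right;
    } else {
      *hole = b;
      hole = &b->left;
      b = b->left;
    }
  }
  *hole = (a != nullptr) ? a : b;
  return result;
}

// Inserts a detached node (left == right == nullptr) whose key is not yet
// present in the treap.
Node* insert(Node* root, Node* n) {
  Node* l = nullptr;
  Node* r = nullptr;
  split(root, n->key, l, r);
  return merge(merge(l, n), r);
}
```

The loops still take time proportional to the depth of the tree, so a degenerate tree stays slow, but the call stack no longer grows with that depth.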
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1568419448

From stuefe at openjdk.org Wed Apr 17 08:25:00 2024
From: stuefe at openjdk.org (Thomas Stuefe)
Date: Wed, 17 Apr 2024 08:25:00 GMT
Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v34]
In-Reply-To:
References: <4bHEEa5QHK6rN2pH6ftWYE---OlmvQNQ_FjJsfweYjI=.ad832040-2d3c-4847-ba1f-33b9d8cf8c9f@github.com>
Message-ID:

On Wed, 17 Apr 2024 08:14:56 GMT, Johan Sjölen wrote:

> The acceptability of recursion fully depends on the tree being balanced as this leads to a call stack depth of O(log n). split and merge have recursive self-calls in non-tail positions, so I doubt that this will be optimized out.

Yeah that's what I meant. Meant to say "cannot".

>
> A treap with a good RNG cannot be worse than something like `4*log_2(n)`.
>
> I see two ways forward:
>
> 1. There are iterative ways of creating a Treap, we could use those. It's a bit more work. I'll have to do research.
>
> 2. Reify the callstack as a heap-allocated linked list of activation frames and run the code on that. This modifies the linear code to pushing activation frames onto a stack and a bit.
>
>
> I'll get back here with an example of what that looks like.

There is an argument to be made for simplicity too. If you think degeneration is super improbable, it may be okay. But we should probably assert at some depth rather than rely on the potentially missing stack guard.
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1568427413

From jsjolen at openjdk.org Wed Apr 17 08:25:01 2024
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Wed, 17 Apr 2024 08:25:01 GMT
Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v34]
In-Reply-To: <4bHEEa5QHK6rN2pH6ftWYE---OlmvQNQ_FjJsfweYjI=.ad832040-2d3c-4847-ba1f-33b9d8cf8c9f@github.com>
References: <4bHEEa5QHK6rN2pH6ftWYE---OlmvQNQ_FjJsfweYjI=.ad832040-2d3c-4847-ba1f-33b9d8cf8c9f@github.com>
Message-ID:

On Wed, 17 Apr 2024 06:01:07 GMT, Thomas Stuefe wrote:

>> Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Style, copyright
>
> src/hotspot/share/nmt/nmtTreap.hpp line 142:
>
>> 140: }
>> 141: }
>> 142:
>
> Allocating:
>
> Why do we need to pass in allocators and deallocators as arguments? The only reason I can see is to have per-instance allocators that use tree-local variables. But why does it need access to the tree instance it works for at all? The only thing I see it accesses from the treap is the random seed. That should be using a global seed var, and os::next_random. A global seed var would be perfectly fine in this case, I think. The treap is only accessed under lock protection, right? Even if not, os::random() (the variant using a global seed with CAS) would be a valid alternative.
>
> (Arguably, initializing the rng seed is also not the job of an allocator, but of the function calling the allocator)
>
> So, a simplification would be to have the allocator and deallocator as class template parameters for the treapnode or for the treap. That would also get rid of the many duplicate definitions of free functions that all call os::free, for instance.
>
> A further simplification would be to have just one Allocator template parameter giving me an allocation and a deallocation function, since they come and work in pairs.
> > A further simplification would be to have these actually be part of the tree, not the tree node: I prefer the individual nodes in a tree or list or similar to be dumb data holders, and for the containing tree to hold the tree related logic of merging, inserting etc. I think that is the more natural distribution of responsibilities. It also simplifies things that are done on a per-tree basis, like accounting. It avoids having to marshall a lot of calls (e.g. Tree->upsert just delegates to TreeNode->upsert) Sure, the allocator isn't responsible for seeding the nodes. It was simply convenient that the class which does allocation also happens to hold the seed, so I could have them do both together. I came to the opposite conclusion that you did: I didn't want to have a global shared state seed as that makes me have to think about the behavior of the RNG in the presence of multiple threads and instances of the treap. >A further simplification would be to have these actually be part of the tree, not the tree node: I prefer the individual nodes in a tree or list or similar to be dumb data holders, and for the containing tree to hold the tree related logic of merging, inserting etc. I think that is the more natural distribution of responsibilities. It also simplifies things that are done on a per-tree basis, like accounting. It avoids having to marshall a lot of calls (e.g. Tree->upsert just delegates to TreeNode->upsert) Right, I could do that. I'll look into this. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1568429806 From jsjolen at openjdk.org Wed Apr 17 08:29:06 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Wed, 17 Apr 2024 08:29:06 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v29] In-Reply-To: References: Message-ID: <6sJbmLaVQHTxZGczGIRAy-F2R5sj8tt9AsxHosXbrf8=.aa299ef7-a496-4691-b898-4813f6ffa3ce@github.com> On Tue, 9 Apr 2024 13:41:35 GMT, Thomas Stuefe wrote: >> Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: >> >> Style and copyright fix > >> Right, the refactoring to remove the `friend` declaration has completely fumbled the code. I'll probably force a revert on this to the state before that or do a git bisect to find the bugs. Right now the code is basically borked. >> >> Last good hash: [7445999](https://github.com/openjdk/jdk/commit/7445999ee296872320f91146e1004026ba1133c7) > > God, sorry. Do as you think is best. > > I plan to look at this PR, but probably it will not be this week. > > Love your commit messages btw. @tstuefe , I'll ping you (like I just did) in the comments when I've got a new version for the Treap stuff. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18289#issuecomment-2060692018 From stefank at openjdk.org Wed Apr 17 08:33:59 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 17 Apr 2024 08:33:59 GMT Subject: RFR: 8326957: Implementation of JEP 474: ZGC: Generational Mode by Default In-Reply-To: References: Message-ID: On Wed, 20 Mar 2024 09:24:51 GMT, Axel Boldt-Christmas wrote: > This is the implementation task for `JEP 474: ZGC: Generational Mode by Default`. See the JEP for details. [JDK-8326667](https://bugs.openjdk.org/browse/JDK-8326667) Looks good. ------------- Marked as reviewed by stefank (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18393#pullrequestreview-2005383586 From eosterlund at openjdk.org Wed Apr 17 08:40:00 2024 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Wed, 17 Apr 2024 08:40:00 GMT Subject: RFR: 8326957: Implementation of JEP 474: ZGC: Generational Mode by Default In-Reply-To: References: Message-ID: On Wed, 20 Mar 2024 09:24:51 GMT, Axel Boldt-Christmas wrote: > This is the implementation task for `JEP 474: ZGC: Generational Mode by Default`. See the JEP for details. [JDK-8326667](https://bugs.openjdk.org/browse/JDK-8326667) Looks good. One nit in terminology, don't need to see the new version. src/hotspot/share/gc/x/xInitialize.cpp line 44: > 42: VM_Version::vm_release(), > 43: VM_Version::jdk_debug_level()); > 44: log_info(gc, init)("Using deprecated single-generation mode"); Change to non-generational instead of single generation, for consistency. ------------- Marked as reviewed by eosterlund (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18393#pullrequestreview-2005396592 PR Review Comment: https://git.openjdk.org/jdk/pull/18393#discussion_r1568450065 From syan at openjdk.org Wed Apr 17 08:41:48 2024 From: syan at openjdk.org (SendaoYan) Date: Wed, 17 Apr 2024 08:41:48 GMT Subject: Integrated: 8327946: containers/docker/TestJFREvents.java fails when host kernel config vm.swappiness=0 after JDK-8325139 In-Reply-To: <5Q0X-rxAg9WKCnK-Qluu5hvyffsGwVgGJGRoA8XlBGs=.923c1bf8-e008-4af9-9929-6e5c1f2d5271@github.com> References: <5Q0X-rxAg9WKCnK-Qluu5hvyffsGwVgGJGRoA8XlBGs=.923c1bf8-e008-4af9-9929-6e5c1f2d5271@github.com> Message-ID: On Tue, 12 Mar 2024 09:06:45 GMT, SendaoYan wrote: > Hi, > > According to the [docker document](https://docs.docker.com/config/containers/resource_constraints/#--memory-swappiness-details), the default value of --memory-swappiness is inherited from the host machine. 
So, when the kernel config vm.swappiness=0 is set on the host machine, this test case will fail, because the docker container cannot use swap memory: the default value of --memory-swappiness is 0.
>
> When the host kernel config is "vm.swappiness = 0", there are three ways to make this test case pass:
>
> 1. Change `.shouldContain("totalSize = " + expectedTotalValue)` to `.shouldContain("totalSize = ")`, which ignores the `expectedTotalValue`, because `expectedTotalValue` could be 0 (swap memory is disabled when --memory-swappiness=0) or 104857600 (300MB-200MB=100MB), depending on the host machine config `vm.swappiness`.
> 2. Change the default `--memory-swappiness` from 0 to a non-zero value, such as 60.
> 3. Change the host kernel config `vm.swappiness=0` to `vm.swappiness=60`. I think it's not a good idea.
>
> The 2nd method seems more reasonable.
>
>
> Thanks,
> -sendao

This pull request has now been integrated.

Changeset: 7744b004
Author: SendaoYan
Committer: Severin Gehwolf
URL: https://git.openjdk.org/jdk/commit/7744b0046af4dbacb7068ae819d8a973cfbf8e40
Stats: 39 lines in 1 file changed: 15 ins; 0 del; 24 mod

8327946: containers/docker/TestJFREvents.java fails when host kernel config vm.swappiness=0 after JDK-8325139

Reviewed-by: sgehwolf

-------------

PR: https://git.openjdk.org/jdk/pull/18225

From aboldtch at openjdk.org Wed Apr 17 08:47:08 2024
From: aboldtch at openjdk.org (Axel Boldt-Christmas)
Date: Wed, 17 Apr 2024 08:47:08 GMT
Subject: RFR: 8326957: Implementation of JEP 474: ZGC: Generational Mode by Default [v2]
In-Reply-To:
References:
Message-ID:

> This is the implementation task for `JEP 474: ZGC: Generational Mode by Default`. See the JEP for details.
[JDK-8326667](https://bugs.openjdk.org/browse/JDK-8326667) Axel Boldt-Christmas has updated the pull request incrementally with two additional commits since the last revision: - Remove extra space - Use consistent terminology ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18393/files - new: https://git.openjdk.org/jdk/pull/18393/files/ca9492d9..e80424c6 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18393&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18393&range=00-01 Stats: 3 lines in 3 files changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/18393.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18393/head:pull/18393 PR: https://git.openjdk.org/jdk/pull/18393 From yzheng at openjdk.org Wed Apr 17 09:12:01 2024 From: yzheng at openjdk.org (Yudi Zheng) Date: Wed, 17 Apr 2024 09:12:01 GMT Subject: Integrated: 8330280: SharedRuntime::get_resolved_entry should not return c2i entry if the callee is special native intrinsic In-Reply-To: References: Message-ID: On Tue, 16 Apr 2024 12:35:42 GMT, Yudi Zheng wrote: > In https://github.com/openjdk/jdk/pull/18741 we return c2i entry for threads with interp_only_mode. This can be problematic for method handle intrinsics and continuation intrinsics, which cannot be interpreted. Consequently, we will cascade the c2i entry with an i2c entry and fail the runtime. The solution is to not return c2i entry under such circumstance. This pull request has now been integrated. 
Changeset: 3ccbc6d4 Author: Yudi Zheng Committer: Doug Simon URL: https://git.openjdk.org/jdk/commit/3ccbc6d4d014fb1ea92c47d270efd5f7ec05b0c3 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod 8330280: SharedRuntime::get_resolved_entry should not return c2i entry if the callee is special native intrinsic Reviewed-by: pchilanomate, dlong ------------- PR: https://git.openjdk.org/jdk/pull/18799 From jsjolen at openjdk.org Wed Apr 17 09:35:17 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Wed, 17 Apr 2024 09:35:17 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v35] In-Reply-To: References: Message-ID: > Hi, > > This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix. 
> > ## `MemoryFileTracker` > > The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: > > ```c++ > static MemoryFile* make_device(const char* descriptive_name); > static void free_device(MemoryFile* device); > > static void allocate_memory(MemoryFile* device, size_t offset, size_t size, > MEMFLAGS flag, const NativeCallStack& stack); > static void free_memory(MemoryFile* device, size_t offset, size_t size); > > > It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: > > ```c++ > void ZNMT::reserve(zaddress_unsafe start, size_t size) { > MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); > } > void ZNMT::commit(zoffset offset, size_t size) { > MemTracker::allocate_memory_in(ZNMT::_device, static_cast(offset), size, mtJavaHeap, CALLER_PC); > } > void ZNMT::uncommit(zoffset offset, size_t size) { > MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); > } > > void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { > // NMT doesn't track mappings at the moment. > } > void ZNMT::unmap(zaddress_unsafe addr, size_t size) { > // NMT doesn't track mappings at the moment. > } > > > As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. > > This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: > > 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. 
Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this...

Johan Sjölen has updated the pull request incrementally with five additional commits since the last revision:

 - Aaand this.
 - Also fix this part
 - Forgot a template parameter :-)
 - Refactor TreapNode/TreapCHeap
 - Style in Treap

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/18289/files
  - new: https://git.openjdk.org/jdk/pull/18289/files/2707ee86..141a38c3

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=34
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=33-34

  Stats: 177 lines in 4 files changed: 63 ins; 72 del; 42 mod
  Patch: https://git.openjdk.org/jdk/pull/18289.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289

PR: https://git.openjdk.org/jdk/pull/18289

From gli at openjdk.org Wed Apr 17 09:44:00 2024
From: gli at openjdk.org (Guoxiong Li)
Date: Wed, 17 Apr 2024 09:44:00 GMT
Subject: RFR: 8330463: Rename invalidate to write_region in ModRefBarrierSet
In-Reply-To:
References:
Message-ID: <6y1UGPKQLUiEZUk_KlcGu1tgkzqqifDespBOsn0dzgk=.6720979c-910c-4d28-8def-daa00ee77671@github.com>

On Wed, 17 Apr 2024 06:52:48 GMT, Albert Mingkun Yang wrote:

> Simple renaming of a barrier-set API.

Marked as reviewed by gli (Committer).

-------------

PR Review: https://git.openjdk.org/jdk/pull/18808#pullrequestreview-2005550071

From jsjolen at openjdk.org Wed Apr 17 09:45:21 2024
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Wed, 17 Apr 2024 09:45:21 GMT
Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v36]
In-Reply-To:
References:
Message-ID:

Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision:

  One more thing!

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/18289/files
  - new: https://git.openjdk.org/jdk/pull/18289/files/141a38c3..70ee8362

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=35
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=34-35

  Stats: 3 lines in 1 file changed: 1 ins; 0 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/18289.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289

PR: https://git.openjdk.org/jdk/pull/18289

From ayang at openjdk.org Wed Apr 17 09:53:09 2024
From: ayang at openjdk.org (Albert Mingkun Yang)
Date: Wed, 17 Apr 2024 09:53:09 GMT
Subject: RFR: 8330475: Remove unused default value for ModRefBarrierSet::write_ref_array_pre
Message-ID:

Trivial removing unnecessary code.

-------------

Commit messages:
 - trivial

Changes: https://git.openjdk.org/jdk/pull/18812/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18812&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8330475
  Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/18812.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18812/head:pull/18812

PR: https://git.openjdk.org/jdk/pull/18812

From gli at openjdk.org Wed Apr 17 10:17:43 2024
From: gli at openjdk.org (Guoxiong Li)
Date: Wed, 17 Apr 2024 10:17:43 GMT
Subject: RFR: 8330475: Remove unused default value for ModRefBarrierSet::write_ref_array_pre
In-Reply-To:
References:
Message-ID:

On Wed, 17 Apr 2024 09:46:54 GMT, Albert Mingkun Yang wrote:

> Trivial removing unnecessary code.

Marked as reviewed by gli (Committer).

-------------

PR Review: https://git.openjdk.org/jdk/pull/18812#pullrequestreview-2005652288

From jsjolen at openjdk.org Wed Apr 17 10:32:21 2024
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Wed, 17 Apr 2024 10:32:21 GMT
Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v37]
In-Reply-To:
References:
Message-ID: <8jsUvTeTy6BVYnePExh8STh2Su5bUNciu4pOiG4mrOk=.6ca30fee-e57a-4c09-b48b-a6bf1cafb94d@github.com>

Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision:

  Fixes

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/18289/files
  - new: https://git.openjdk.org/jdk/pull/18289/files/70ee8362..d29d4dc0

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=36
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=35-36

  Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod
  Patch: https://git.openjdk.org/jdk/pull/18289.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289

PR: https://git.openjdk.org/jdk/pull/18289

From dfuchs at openjdk.org Wed Apr 17 10:34:41 2024
From: dfuchs at openjdk.org (Daniel Fuchs)
Date: Wed, 17 Apr 2024 10:34:41 GMT
Subject: RFR: Merge 33d7127
In-Reply-To:
References:
Message-ID:

On Wed, 17 Apr 2024 01:09:30 GMT, Jaikiran Pai wrote:

> This brings in the CPU24_04 changes.

This looks reasonable. I haven't been involved in all the fixes here - but I haven't spotted anything obviously wrong. The changes to the ConnectionPool look right. If some of the fixes had conflicts that required fixing then asking confirmation from one person involved in their respective reviews would be good.

-------------

Marked as reviewed by dfuchs (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/18807#pullrequestreview-2005685119

From jpai at openjdk.org Wed Apr 17 10:40:41 2024
From: jpai at openjdk.org (Jaikiran Pai)
Date: Wed, 17 Apr 2024 10:40:41 GMT
Subject: RFR: Merge 33d7127
In-Reply-To:
References:
Message-ID: <-11ibMJGQY3-ik4a6PAmd2pFKovbIHCSKFcdZfFN034=.4d09b5c4-b95e-4bdb-8ad6-8400d6330988@github.com>

On Wed, 17 Apr 2024 01:09:30 GMT, Jaikiran Pai wrote:

> This brings in the CPU24_04 changes.

Thank you Daniel for the review.
> If some of the fixes had conflicts that required fixing then asking confirmation from one person involved in their respective reviews would be good. The specific commits in this merge were all clean and didn't require any conflict resolution. Internal CI testing of tier1, tier2 and tier3 too completed successfully without issues. I'll go ahead and integrate this shortly. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18807#issuecomment-2060955414 From jpai at openjdk.org Wed Apr 17 10:45:12 2024 From: jpai at openjdk.org (Jaikiran Pai) Date: Wed, 17 Apr 2024 10:45:12 GMT Subject: RFR: Merge 33d7127 [v2] In-Reply-To: References: Message-ID: > This brings in the CPU24_04 changes. Jaikiran Pai has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18807/files - new: https://git.openjdk.org/jdk/pull/18807/files/33d71275..33d71275 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18807&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18807&range=00-01 Stats: 0 lines in 0 files changed: 0 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18807.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18807/head:pull/18807 PR: https://git.openjdk.org/jdk/pull/18807 From jpai at openjdk.org Wed Apr 17 10:45:13 2024 From: jpai at openjdk.org (Jaikiran Pai) Date: Wed, 17 Apr 2024 10:45:13 GMT Subject: Integrated: Merge 33d7127 In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 01:09:30 GMT, Jaikiran Pai wrote: > This brings in the CPU24_04 changes. This pull request has now been integrated. 

Changeset: d2f9a1eb
Author: Jaikiran Pai
URL: https://git.openjdk.org/jdk/commit/d2f9a1eb9709dbd8b1e7b0d1c14b7876281d7f23
Stats: 182 lines in 19 files changed: 58 ins; 51 del; 73 mod

Merge

Reviewed-by: dfuchs

-------------

PR: https://git.openjdk.org/jdk/pull/18807

From jkern at openjdk.org Wed Apr 17 10:56:49 2024
From: jkern at openjdk.org (Joachim Kern)
Date: Wed, 17 Apr 2024 10:56:49 GMT
Subject: RFR: 8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3]
In-Reply-To: <18WjPZeDIWkxGIB0BJgyDg5VipCtY4EOlWmIGPWZGCw=.b50cf4a9-61a4-421e-97eb-3dbac94c14f9@github.com>
References: <-XeYeJ0OEmauTYsEoSXxzRmQXSKMOLw87GSpqDnEmug=.5cb7e71f-fea6-4a84-8260-5f515d3d3810@github.com> <18WjPZeDIWkxGIB0BJgyDg5VipCtY4EOlWmIGPWZGCw=.b50cf4a9-61a4-421e-97eb-3dbac94c14f9@github.com>
Message-ID: <_xcaF7UUDHA11loD89Dz871vAQgRqMzCdPkahFDfKv8=.a2c6dcbe-5942-4fb7-9d8b-4239ea048e56@github.com>

On Tue, 16 Apr 2024 09:15:19 GMT, Magnus Ihse Bursie wrote:

>> That was kind of where the discussion started, and which Kim did not like. If I read him correctly, his suggestion was instead to place:
>>
>> #if defined(_AIX)
>> #include <alloca.h>
>> #endif
>>
>> in the files where `alloca` is needed on AIX.
>
> (If some of these files happen to be files which are not compiled on Windows, I assume it will not hurt to drop the ifdef guard, but then again, it can certainly be kept as well for consistency.)

@magicus @TheShermanTanker @TheRealMDoerr @kimbarrett
Let me summarize the choices we have and ask for your vote.
Julian dislikes the `-Dalloca'(size)'=__builtin_alloca'(size)'` in `flags-cflags.m4` I introduced to get rid of

#if defined(_AIX)
#include <alloca.h>
#endif

in `globalDefinitions_gcc.hpp`.

We have three possible solutions

1. Reintroduce

#if defined(_AIX)
#include <alloca.h>
#endif

in `globalDefinitions_gcc.hpp`.

2. Unconditionally introduce only `#include <alloca.h>` in `globalDefinitions_gcc.hpp`. This should work for all platforms using this header including the unofficial Windows/gcc Port, although only AIX needs it.

3. Add

#if defined(_AIX)
#include <alloca.h>
#endif

to the sources using alloca(). These are
/hotspot/share/runtime/os.cpp
/hotspot/share/runtime/javaThread.cpp
/hotspot/share/utilities/vmError.cpp
Here we need the AIX condition, because otherwise the classic Windows Build (NTAMD64) fails.

I will implement the solution with the most likes and having no dislike.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1568650313

From lucy at openjdk.org Wed Apr 17 11:17:42 2024
From: lucy at openjdk.org (Lutz Schmidt)
Date: Wed, 17 Apr 2024 11:17:42 GMT
Subject: RFR: 8330008: [s390x] Test bit "in-memory" in case of DiagnoseSyncOnValueBasedClasses
In-Reply-To:
References:
Message-ID:

On Wed, 10 Apr 2024 09:58:55 GMT, Amit Kumar wrote:

> It's trivial update to use `testbit` method to test the bit "in-memory"

Program logic is changed. You should at least run tests with DiagnoseSyncOnValueBasedClasses != 0 to verify the code still does what it is supposed to do. GHAs don't help here because the change is s390-specific.

A second review would be good as well.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18709#issuecomment-2061019296

From stefank at openjdk.org Wed Apr 17 11:21:48 2024
From: stefank at openjdk.org (Stefan Karlsson)
Date: Wed, 17 Apr 2024 11:21:48 GMT
Subject: RFR: 8326957: Implementation of JEP 474: ZGC: Generational Mode by Default [v2]
In-Reply-To:
References:
Message-ID:

On Wed, 17 Apr 2024 08:47:08 GMT, Axel Boldt-Christmas wrote:

>> This is the implementation task for `JEP 474: ZGC: Generational Mode by Default`. See the JEP for details. [JDK-8326667](https://bugs.openjdk.org/browse/JDK-8326667)
>
> Axel Boldt-Christmas has updated the pull request incrementally with two additional commits since the last revision:
>
>  - Remove extra space
>  - Use consistent terminology

Marked as reviewed by stefank (Reviewer).

-------------

PR Review: https://git.openjdk.org/jdk/pull/18393#pullrequestreview-2005772025

From jwaters at openjdk.org Wed Apr 17 11:55:48 2024
From: jwaters at openjdk.org (Julian Waters)
Date: Wed, 17 Apr 2024 11:55:48 GMT
Subject: RFR: 8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3]
In-Reply-To: <_xcaF7UUDHA11loD89Dz871vAQgRqMzCdPkahFDfKv8=.a2c6dcbe-5942-4fb7-9d8b-4239ea048e56@github.com>
References: <-XeYeJ0OEmauTYsEoSXxzRmQXSKMOLw87GSpqDnEmug=.5cb7e71f-fea6-4a84-8260-5f515d3d3810@github.com> <18WjPZeDIWkxGIB0BJgyDg5VipCtY4EOlWmIGPWZGCw=.b50cf4a9-61a4-421e-97eb-3dbac94c14f9@github.com> <_xcaF7UUDHA11loD89Dz871vAQgRqMzCdPkahFDfKv8=.a2c6dcbe-5942-4fb7-9d8b-4239ea048e56@github.com>
Message-ID:

On Wed, 17 Apr 2024 10:54:25 GMT, Joachim Kern wrote:

>> (If some of these files happen to be files which are not compiled on Windows, I assume it will not hurt to drop the ifdef guard, but then again, it can certainly be kept as well for consistency.)
>
> @magicus @TheShermanTanker @TheRealMDoerr @kimbarrett
> Let me summarize the choices we have and ask for your vote.
> Julian dislikes the `-Dalloca'(size)'=__builtin_alloca'(size)'` in `flags-cflags.m4` I introduced to get rid of
>
> #if defined(_AIX)
> #include <alloca.h>
> #endif
>
> in `globalDefinitions_gcc.hpp`.
>
> We have three possible solutions
>
> 1. Reintroduce
>
> #if defined(_AIX)
> #include <alloca.h>
> #endif
>
> in `globalDefinitions_gcc.hpp`.
>
> 2. Unconditionally introduce only `#include <alloca.h>` in `globalDefinitions_gcc.hpp`. This should work for all platforms using this header including the unofficial Windows/gcc Port, although only AIX needs it.
>
> 3. Add
>
> #if defined(_AIX)
> #include <alloca.h>
> #endif
>
> to the sources using alloca(). These are
> /hotspot/share/runtime/os.cpp
> /hotspot/share/runtime/javaThread.cpp
> /hotspot/share/utilities/vmError.cpp
> Here we need the AIX condition, because otherwise the classic Windows Build (NTAMD64) fails.
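For illustration, option 3 could look roughly like this in a source file that calls `alloca()`. This is a sketch under assumptions, not the code from the PR: `sum_via_stack_copy` is a made-up function, and it assumes that on non-AIX gcc/clang toolchains `alloca()` is already declared through `<cstdlib>`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>   // assumption: declares alloca() on glibc-style toolchains
#include <cstring>

// Option 3: the guarded include lives next to the alloca() user, and only
// AIX pulls in the extra header.
#if defined(_AIX)
#include <alloca.h>
#endif

// Hypothetical alloca() user: copies `len` bytes into a stack buffer and
// returns the sum of the copied bytes.
unsigned sum_via_stack_copy(const unsigned char* data, size_t len) {
  // The buffer is released automatically when this function returns.
  unsigned char* scratch = static_cast<unsigned char*>(alloca(len));
  std::memcpy(scratch, data, len);
  unsigned sum = 0;
  for (size_t i = 0; i < len; i++) {
    sum += scratch[i];
  }
  return sum;
}
```

The guard documents that the header is an AIX requirement; option 2 would be the same file with the `#if`/`#endif` lines dropped from a shared header instead.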
>
> I will implement the solution with the most likes and having no dislike.

I don't mind all 3, though I certainly prefer 1 and 3 over 2 (The way I see it, the AIX macro check is more of a message to the programmer than it is important to the compiler, so I prefer the options that have it. However, I also don't mind if we were to go the way of option 2, this is more of a preference thing). The fact that only 3 files need it is also surprising to me, and makes option 3 seem like a good fit (Again, personal preference)

Magnus and Kim, what do you guys think?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1568718288

From ihse at openjdk.org Wed Apr 17 12:25:04 2024
From: ihse at openjdk.org (Magnus Ihse Bursie)
Date: Wed, 17 Apr 2024 12:25:04 GMT
Subject: RFR: 8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3]
In-Reply-To:
References: <-XeYeJ0OEmauTYsEoSXxzRmQXSKMOLw87GSpqDnEmug=.5cb7e71f-fea6-4a84-8260-5f515d3d3810@github.com> <18WjPZeDIWkxGIB0BJgyDg5VipCtY4EOlWmIGPWZGCw=.b50cf4a9-61a4-421e-97eb-3dbac94c14f9@github.com> <_xcaF7UUDHA11loD89Dz871vAQgRqMzCdPkahFDfKv8=.a2c6dcbe-5942-4fb7-9d8b-4239ea048e56@github.com>
Message-ID: <76P7uKTuqo7IKYr5yBP4Vx1SS0AcEXC_6vDAU6LfIzo=.d939556f-6fab-4009-820b-821376bfdb7c@github.com>

On Wed, 17 Apr 2024 11:52:42 GMT, Julian Waters wrote:

>> @magicus @TheShermanTanker @TheRealMDoerr @kimbarrett
>> Let me summarize the choices we have and ask for your vote.
>> Julian dislikes the `-Dalloca'(size)'=__builtin_alloca'(size)'` in `flags-cflags.m4` I introduced to get rid of
>>
>> #if defined(_AIX)
>> #include <alloca.h>
>> #endif
>>
>> in `globalDefinitions_gcc.hpp`.
>>
>> We have three possible solutions
>>
>> 1. Reintroduce
>>
>> #if defined(_AIX)
>> #include <alloca.h>
>> #endif
>>
>> in `globalDefinitions_gcc.hpp`.
>>
>> 2. Unconditionally introduce only `#include <alloca.h>` in `globalDefinitions_gcc.hpp`.
This should work for all platforms using this header including the unofficial Windows/gcc Port, although only AIX needs it.
>>
>> 3. Add
>>
>> #if defined(_AIX)
>> #include <alloca.h>
>> #endif
>>
>> to the sources using alloca(). These are
>> /hotspot/share/runtime/os.cpp
>> /hotspot/share/runtime/javaThread.cpp
>> /hotspot/share/utilities/vmError.cpp
>> Here we need the AIX condition, because otherwise the classic Windows Build (NTAMD64) fails.
>>
>> I will implement the solution with the most likes and having no dislike.
>
> I don't mind all 3, though I certainly prefer 1 and 3 over 2 (The way I see it, the AIX macro check is more of a message to the programmer than it is important to the compiler, so I prefer the options that have it. However, I also don't mind if we were to go the way of option 2, this is more of a preference thing). The fact that only 3 files need it is also surprising to me, and makes option 3 seem like a good fit (Again, personal preference)
>
> Magnus and Kim, what do you guys think?

If there are just 3 files using alloca, I strongly prefer solution 3. I think solution 1 has already been rejected by Kim.

(Also, for the record, it was me, not Julian, who expressed dislike about the `-Dalloca'(size)'=__builtin_alloca'(size)'` change)

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1568754458

From tstuefe at redhat.com Wed Apr 17 12:46:23 2024
From: tstuefe at redhat.com (Thomas Stuefe)
Date: Wed, 17 Apr 2024 14:46:23 +0200
Subject: CFV: New HotSpot Group Member: Andrew Dinn
In-Reply-To:
References:
Message-ID:

Vote: yes (obviously)

On Thu, Apr 11, 2024 at 3:24 PM Thomas Stuefe wrote:

> Hi,
>
> I hereby nominate Andrew Dinn (adinn) to Membership in the HotSpot Group.
>
> Andrew is a well-known and respected member of the OpenJDK community. He
> has been a contributor since the early days of OpenJDK.
>
> The history of his contributions has been mangled by various SCM moves and
> repo consolidations over the years [1], but he was one of the original
> authors of the arm64 port ([2] shows 359 changes in the mercurial hotspot
> sub repository alone), contributed JEP 352 (support for NVM devices under
> byte buffers), and more recently has been active in the Graal and the
> Leyden projects.
>
> Votes are due by April 25, 2024.
>
> Only current Members of the HotSpot Group [3] are eligible to vote on this
> nomination. Votes must be cast in the open by replying to this mailing
> list.
>
> For Lazy Consensus voting instructions, see [4].
>
> Cheers, Thomas
>
> [1] https://github.com/openjdk/jdk/commits/master/?author=adinn
> [2] https://hg.openjdk.org/aarch64-port/jdk7u/hotspot
> [3] https://openjdk.org/census#members
> [4] https://openjdk.org/groups/#member-vote
>

From mdoerr at openjdk.org Wed Apr 17 13:05:02 2024
From: mdoerr at openjdk.org (Martin Doerr)
Date: Wed, 17 Apr 2024 13:05:02 GMT
Subject: RFR: 8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3]
In-Reply-To: <76P7uKTuqo7IKYr5yBP4Vx1SS0AcEXC_6vDAU6LfIzo=.d939556f-6fab-4009-820b-821376bfdb7c@github.com>
References: <-XeYeJ0OEmauTYsEoSXxzRmQXSKMOLw87GSpqDnEmug=.5cb7e71f-fea6-4a84-8260-5f515d3d3810@github.com> <18WjPZeDIWkxGIB0BJgyDg5VipCtY4EOlWmIGPWZGCw=.b50cf4a9-61a4-421e-97eb-3dbac94c14f9@github.com> <_xcaF7UUDHA11loD89Dz871vAQgRqMzCdPkahFDfKv8=.a2c6dcbe-5942-4fb7-9d8b-4239ea048e56@github.com> <76P7uKTuqo7IKYr5yBP4Vx1SS0AcEXC_6vDAU6LfIzo=.d939556f-6fab-4009-820b-821376bfdb7c@github.com>
Message-ID:

On Wed, 17 Apr 2024 12:22:10 GMT, Magnus Ihse Bursie wrote:

>> I don't mind all 3, though I certainly prefer 1 and 3 over 2 (The way I see it, the AIX macro check is more of a message to the programmer than it is important to the compiler, so I prefer the options that have it.
However, I also don't mind if we were to go the way of option 2, this is more of a preference thing). The fact that only 3 files need it is also surprising to me, and makes option 3 seem like a good fit (Again, personal preference) >> >> Magnus and Kim, what do you guys think? > > If there are just 3 files using alloca, I strongly prefer solution 3. I think solution 1 has already been rejected by Kim. > > (Also, for the record, it was me, not Julian, who expressed dislike about the `-Dalloca'(size)'=__builtin_alloca'(size)'` change) https://man7.org/linux/man-pages/man3/alloca.3.html sounds like solution 2 is the cleanest one ("standards conformance"). It is also the version with minimal code and which will even work with future alloca usages :-) If solution 2 has any disadvantage, I'd prefer solution 3. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1568809674 From stefank at openjdk.org Wed Apr 17 13:08:06 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 17 Apr 2024 13:08:06 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v7] In-Reply-To: <7TW9a7Vmnz0nIKq83rYx_VN13PXM9_9nD5iSMzGDfNw=.127fd0ff-ee60-40cf-9994-9a1e81bb5b27@github.com> References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> <7TW9a7Vmnz0nIKq83rYx_VN13PXM9_9nD5iSMzGDfNw=.127fd0ff-ee60-40cf-9994-9a1e81bb5b27@github.com> Message-ID: <4p0uq_t37Fkj9fxqD1QC8TOkgAyyW1PVmTknURCquG4=.22b762b8-dea4-4fe3-a19f-d6a3f26c9f27@github.com> On Mon, 15 Apr 2024 16:11:13 GMT, Afshin Zafari wrote: >> `MEMFLAGS flag` is used to hold/show the type of the memory regions in NMT. Each call of NMT API requires a search through the list of memory regions. >> The Hotspot code reserves/commits/uncommits memory regions and later calls explicitly NMT API with a specific memory type (e.g., `mtGC`, `mtJavaHeap`) for that region. 
Therefore, there are two searches in the list of regions per reserve/commit/uncommit operation, one for the operation and another for setting the type of the region.
>> When the memory type is passed in during reserve/commit/uncommit operations, NMT can use it and avoid the extra search for setting the memory type.
>>
>> Tests: tiers1-5 passed on linux-x64, macosx-aarch64 and windows-x64 for debug and non-debug builds.
>
> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision:
>
>   alignment in coding style changed.

Here's a new set of comments.

src/hotspot/os/windows/os_windows.cpp line 5110:

> 5108:
> 5109:   // Record virtual memory allocation
> 5110:   MemTracker::record_virtual_memory_reserve_and_commit((address)addr, bytes, CALLER_PC, flag);

Should this really be called here? The POSIX version doesn't call this, so I don't understand why it is called here in the Windows code.

src/hotspot/share/cds/filemap.cpp line 1697:

> 1695: static char* map_memory(int fd, const char* file_name, size_t file_offset,
> 1696:                         char *addr, size_t bytes, bool read_only,
> 1697:                         bool allow_exec, MEMFLAGS flags) {

It is odd that `map_memory` and `os::map_memory` have different parameter orders. I understand that this is done because of default values, but I'd like to suggest that you get rid of these default values and fix the order.

(Side-note: Wouldn't it be better to rename this `map_memory` to something that clearly shows the difference between this function and `os::map_memory`?)

src/hotspot/share/classfile/compactHashtable.cpp line 243:

> 241:     quit("Unable to open hashtable dump file", filename);
> 242:   }
> 243:   _base = os::map_memory(_fd, filename, 0, nullptr, _size, mtInternal, true, false);

Isn't this CDS code? Should this be mtClassShared or something else that indicates that this is CDS code?
src/hotspot/share/nmt/virtualMemoryTracker.cpp line 460: > 458: assert(_reserved_regions != nullptr, "Sanity check"); > 459: > 460: ReservedMemoryRegion rgn(addr, size, flag); I'm not sure about this. `rgn` is just used to find the memory region we want to uncommit. The flag isn't used in the search, and passing it forces the callers to also pass in the flag. I understand that this happens after the request to remove the mtNone default value. Is there a way that allows us to skip using mtNone, but still don't have to unnecessarily provide a flag? Maybe we could create a helper function `ReservedMemoryRegion rgn = ReservedMemoryRegion::create_find_key(addr, size)`, which sets up a ReserveMemoryRegion with mtNone? src/hotspot/share/runtime/os.cpp line 1817: > 1815: > 1816: char* os::reserve_memory(size_t bytes, bool executable, MEMFLAGS flags) { > 1817: char* result = pd_reserve_memory(bytes, executable, flags); Doesn't it look weird that we pass in flags here and then still call MemTracker::record_ below? I think this is an artifact from mixing if we put the NMT calls in shared or in platform dependent code. I understand that you need this for this patch, but I also think there needs to be some RFE to figure out if this can be reworked. src/hotspot/share/runtime/os.cpp line 2187: > 2185: MEMFLAGS flags, > 2186: bool read_only, > 2187: bool allow_exec) { The function was written with multiple parameters per line here, and then you changed it so that some of the params where placed on individual lines. This should likely be reverted. src/hotspot/share/runtime/os.hpp line 233: > 231: char *addr, size_t bytes, > 232: MEMFLAGS flag, > 233: bool read_only = false, Mixes param layout style. (Plus earlier comment that the default values should probably be removed so that MEMFLAGS can be put last). src/hotspot/share/runtime/os.hpp line 471: > 469: // vm_exit_out_of_memory() with the specified mesg. 
> 470: static void commit_memory_or_exit(char* addr, size_t bytes, > 471: bool executable, const char* mesg, MEMFLAGS flag); I think that we should change the parameter order here, so that it is like `commit_memory` and then the extra mesg param goes with the `_or_exit` part (if that makes sense). Suggestion: bool executable, MEMFLAGS flag, const char* mesg); src/hotspot/share/runtime/os.hpp line 522: > 520: MEMFLAGS flag, > 521: bool read_only = false, > 522: bool allow_exec = false); params layout style. ------------- Changes requested by stefank (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18745#pullrequestreview-2005841897 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1568730671 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1568735525 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1568722870 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1568775270 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1568802391 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1568808811 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1568810492 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1568726202 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1568812091 From mbaesken at openjdk.org Wed Apr 17 13:16:09 2024 From: mbaesken at openjdk.org (Matthias Baesken) Date: Wed, 17 Apr 2024 13:16:09 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v7] In-Reply-To: References: Message-ID: On Wed, 20 Mar 2024 05:44:48 GMT, Julian Waters wrote: >> Compile the JDK as C++17, enabling the use of all C++17 language features > > Julian Waters has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 11 commits: > > - Merge branch 'master' into patch-7 > - Require clang 13 in toolchain.m4 > - Remove unnecessary -std=c++17 option in Lib.gmk > - Merge branch 'openjdk:master' into patch-7 > - Compiler versions in toolchain.m4 > - Merge branch 'openjdk:master' into patch-7 > - Merge branch 'openjdk:master' into patch-7 > - Revert vm_version_linux_riscv.cpp > - vm_version_linux_riscv.cpp > - allocation.cpp > - ... and 1 more: https://git.openjdk.org/jdk/compare/269163d5...9286a964 Seems we use already a little bit of C++17 coding in the Linux codebase . Just came across this little error when trying to build with clang on Linux jdk/src/hotspot/os/linux/os_linux.cpp:2975:65: error: 'static_assert' with no message is a C++17 extension [-Werror,-Wc++17-extensions] static_assert(MADV_POPULATE_WRITE == MADV_POPULATE_WRITE_value); So switching to C++17 would make our codebase compile :-) (at least in this case) ! (to be more serious, I guess I better file a JBS bug and post a PR/fix for this static_assert) ------------- PR Comment: https://git.openjdk.org/jdk/pull/14988#issuecomment-2061233028 From jsjolen at openjdk.org Wed Apr 17 14:07:02 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Wed, 17 Apr 2024 14:07:02 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v38] In-Reply-To: References: Message-ID: > Hi, > > This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix. 
> > ## `MemoryFileTracker` > > The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: > > ```c++ > static MemoryFile* make_device(const char* descriptive_name); > static void free_device(MemoryFile* device); > > static void allocate_memory(MemoryFile* device, size_t offset, size_t size, > MEMFLAGS flag, const NativeCallStack& stack); > static void free_memory(MemoryFile* device, size_t offset, size_t size); > ``` > > It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: > > ```c++ > void ZNMT::reserve(zaddress_unsafe start, size_t size) { > MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); > } > void ZNMT::commit(zoffset offset, size_t size) { > MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC); > } > void ZNMT::uncommit(zoffset offset, size_t size) { > MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); > } > > void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { > // NMT doesn't track mappings at the moment. > } > void ZNMT::unmap(zaddress_unsafe addr, size_t size) { > // NMT doesn't track mappings at the moment. > } > ``` > > As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. > > This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: > > 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it.
Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this... Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: Rename variable ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18289/files - new: https://git.openjdk.org/jdk/pull/18289/files/d29d4dc0..8e3fd751 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=37 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=36-37 Stats: 4 lines in 1 file changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/18289.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289 PR: https://git.openjdk.org/jdk/pull/18289 From kbarrett at openjdk.org Wed Apr 17 15:13:04 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 17 Apr 2024 15:13:04 GMT Subject: RFR: 8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: References: <-XeYeJ0OEmauTYsEoSXxzRmQXSKMOLw87GSpqDnEmug=.5cb7e71f-fea6-4a84-8260-5f515d3d3810@github.com> <18WjPZeDIWkxGIB0BJgyDg5VipCtY4EOlWmIGPWZGCw=.b50cf4a9-61a4-421e-97eb-3dbac94c14f9@github.com> <_xcaF7UUDHA11loD89Dz871vAQgRqMzCdPkahFDfKv8=.a2c6dcbe-5942-4fb7-9d8b-4239ea048e56@github.com> <76P7uKTuqo7IKYr5yBP4Vx1SS0AcEXC_6vDAU6LfIzo=.d939556f-6fab-4009-820b-821376bfdb7c@github.com> Message-ID: <6aR5nvKhz28A1CkxtaAD9CwTjILBjwZrrRwP3988oEc=.72203104-2ae5-40ff-bd87-168b684446e6@github.com> On Wed, 17 Apr 2024 13:02:33 GMT, Martin Doerr wrote: >> If there are just 3 files using alloca, I strongly prefer solution 3. I think solution 1 has already been rejected by Kim. >> >> (Also, for the record, it was me, not Julian, who expressed dislike about the `-Dalloca'(size)'=__builtin_alloca'(size)'` change) > > https://man7.org/linux/man-pages/man3/alloca.3.html sounds like solution 2 is the cleanest one ("standards conformance").
It is also the version with minimal code and which will even work with future alloca usages :-) > If solution 2 has any disadvantage, I'd prefer solution 3. I'm aware of this discussion and looking into the issues, but a personal matter has intervened and it will take me a while to respond properly. Maybe next week. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1569012102 From pchilanomate at openjdk.org Wed Apr 17 16:21:04 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 17 Apr 2024 16:21:04 GMT Subject: Integrated: 8325469: Freeze/Thaw code can crash in the presence of OSR frames In-Reply-To: References: Message-ID: On Thu, 4 Apr 2024 19:52:18 GMT, Patricio Chilano Mateo wrote: > Freeze/thaw code assumes that a compiled frame for a method where num_stack_arg_slots() > 0 will always have the arguments set up above the metadata at the bottom of the frame. But when converting an interpreter frame to a compiled frame during OSR we don't explicitly leave room for the stack arguments after popping the interpreter frame. All parameters needed will be read from the "buf" array and stored inside the frame before calling OSR_migration_end(). > > This mismatch in how the stack looks and what we assume can lead to different crashes. In particular the issue happens when the OSR conversion happens for the bottom-most frame in the stack. If the OSR frame has a caller in the stack then there is no issue on freezing/thawing. I added more details about this in the bug comments. > > When the OSR conversion happens for the bottom-most frame then a future freeze/thaw can lead to crashes for all cases: freeze_fast/thaw_fast, freeze_fast/thaw_slow, freeze_slow/thaw_slow. When freezing fast, either thawing fast or slow can lead to trying to read past the bottom of the stackChunk or writing below the allocated space in the stack.
The freeze slow case is almost okay, except that it uncovered an invalid assert that is triggered if the size of the OSR frame plus all the other frames we freeze takes less space than the size of locals minus parameters of the interpreter frame that was OSR. I also added more details about these in the bug comments. > > I tested different fixes, but I think the most straightforward one is to add _num_stack_arg_slots in the nmethod class and initialize it accordingly depending on whether the nmethod is an OSR one or not. > > The patch includes a new test that exercises all these possible combinations of OSR frame at bottom of stack or not, and then freezing fast/slow and thawing fast/slow. The bottom case where we freeze fast and thaw slow reproduces the originally reported crash. There are actually two different failure modes depending on whether this is a thaw top or return barrier case. The other bottom cases lead to the other crashes described in the bug comments. > The new test uncovers another bug besides the OSR issues, but since it's a different one I filed a separate JBS issue (JDK-8329665) and I made this a dependent PR. > > I tested the current patch with the new test and also ran it through mach5 tiers1-6. > > Thanks, > Patricio This pull request has now been integrated.
Changeset: fd331ff1 Author: Patricio Chilano Mateo URL: https://git.openjdk.org/jdk/commit/fd331ff17330329a656181cb58714f1bd1623fcb Stats: 270 lines in 16 files changed: 246 ins; 5 del; 19 mod 8325469: Freeze/Thaw code can crash in the presence of OSR frames Reviewed-by: rpressler, dlong ------------- PR: https://git.openjdk.org/jdk/pull/18637 From pchilanomate at openjdk.org Wed Apr 17 16:21:04 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 17 Apr 2024 16:21:04 GMT Subject: RFR: 8325469: Freeze/Thaw code can crash in the presence of OSR frames [v3] In-Reply-To: References: Message-ID: On Wed, 10 Apr 2024 18:06:33 GMT, Ron Pressler wrote: >> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: >> >> use WhiteBox to verify OSR compilation > > It may be hard to do a proper measurement because the number of methods in our microbenchmarks is small. We're also talking an extra branch, I think. This is code than can be called a million times per second per core. It's very performance sensitive. So I would prefer to first see if there's an impact on nmethod size, and only if there is consider whether the speed implications are acceptable. Thanks for the reviews @pron and @dean-long! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18637#issuecomment-2061695741 From syan at openjdk.org Wed Apr 17 16:22:02 2024 From: syan at openjdk.org (SendaoYan) Date: Wed, 17 Apr 2024 16:22:02 GMT Subject: RFR: 8327946: containers/docker/TestJFREvents.java fails when host kernel config vm.swappiness=0 after JDK-8325139 [v5] In-Reply-To: <5CfWlPKTQYn_C-qfExFXR94T9JT3jO78qvXZZm3vFYk=.c9f2152a-a45d-4b20-bb2c-eef314ed53eb@github.com> References: <5Q0X-rxAg9WKCnK-Qluu5hvyffsGwVgGJGRoA8XlBGs=.923c1bf8-e008-4af9-9929-6e5c1f2d5271@github.com> <5CfWlPKTQYn_C-qfExFXR94T9JT3jO78qvXZZm3vFYk=.c9f2152a-a45d-4b20-bb2c-eef314ed53eb@github.com> Message-ID: On Fri, 12 Apr 2024 15:27:52 GMT, SendaoYan wrote: >> Hi, >> >> According to the [docker document](https://docs.docker.com/config/containers/resource_constraints/#--memory-swappiness-details), the default value of --memory-swappiness is inherited from the host machine. So, when the kernel config vm.swappiness=0 is set on the host machine, this test case will fail: the docker container cannot use swap memory because the default value of --memory-swappiness is 0. >> >> When the host kernel config is "vm.swappiness = 0", there are three ways to make this test case pass: >> >> 1. Change `.shouldContain("totalSize = " + expectedTotalValue)` to `.shouldContain("totalSize = "`, which ignores the `expectedTotalValue`, because the `expectedTotalValue` could be 0 (swap memory is disabled when --memory-swappiness=0) or could be 104857600 (300MB-200MB=100MB), depending on the host machine config `vm.swappiness`. >> 2. Change the default `--memory-swappiness` from 0 to a non-zero value, such as 60. >> 3. Change the host kernel config `vm.swappiness=0` to `vm.swappiness=60`. I think it's not a good idea. >> >> The 2nd method seems more reasonable. >> >> >> Thanks, >> -sendao > > SendaoYan has updated the pull request incrementally with one additional commit since the last revision: > > 1.
if (isCgroupV1) only contains opts.addDockerOpts("--memory-swappiness=60"); 2. delete extra space at the beginning of the line in testSwapMemory > /sponsor Thanks. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18225#issuecomment-2061698474 From dcubed at openjdk.org Wed Apr 17 16:49:43 2024 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Wed, 17 Apr 2024 16:49:43 GMT Subject: RFR: 8330253: Skip verify_consistent_lock_order when deoptimizing from monitorenter bytecode. [v2] In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 05:41:26 GMT, Axel Boldt-Christmas wrote: >> The verification added in [JDK-8329757](https://bugs.openjdk.org/browse/JDK-8329757) will not work when deoptimization occurs on a monitorenter bytecode. The locking may be in a transitional state. This patch will skip the verification when this occurs. >> >> Currently have only seen this reproduce with JVMTI when deoptimization occurs while a java thread is waiting on a contended monitor. However this could potentially be triggered from a VM entry slow path, so simply checking `current_pending_monitor` could be flaky as well. So instead simply avoid verification. >> >> Running JVMTI reproducer. Starting full testing soon. > > Axel Boldt-Christmas has updated the pull request incrementally with two additional commits since the last revision: > > - Handle previous bc being monitorenter > - Remove implicit conditions Please clarify what pre-integration testing is being done. As far as I can tell, this failure only shows up in Tier8 so that should be part of your mix. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18782#issuecomment-2061745571 From dcubed at openjdk.org Wed Apr 17 17:02:04 2024 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Wed, 17 Apr 2024 17:02:04 GMT Subject: RFR: 8330253: Skip verify_consistent_lock_order when deoptimizing from monitorenter bytecode.
[v2] In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 05:41:26 GMT, Axel Boldt-Christmas wrote: >> The verification added in [JDK-8329757](https://bugs.openjdk.org/browse/JDK-8329757) will not work when deoptimization occurs on a monitorenter bytecode. The locking may be in a transitional state. This patch will skip the verification when this occurs. >> >> Currently have only seen this reproduce with JVMTI when deoptimization occurs while a java thread is waiting on a contended monitor. However this could potentially be triggered from a VM entry slow path, so simply checking `current_pending_monitor` could be flaky as well. So instead simply avoid verification. >> >> Running JVMTI reproducer. Starting full testing soon. > > Axel Boldt-Christmas has updated the pull request incrementally with two additional commits since the last revision: > > - Handle previous bc being monitorenter > - Remove implicit conditions Changes requested by dcubed (Reviewer). src/hotspot/share/runtime/deoptimization.cpp line 443: > 441: } > 442: #ifdef ASSERT > 443: if (LockingMode == LM_LIGHTWEIGHT && !realloc_failures) { In the new code, you are no longer accounting for `realloc_failures` being true. I'm not convinced that is okay here. src/hotspot/share/runtime/deoptimization.cpp line 449: > 447: : chunk->first()->method()->code_at(bci - 1) != Bytecodes::_monitorenter; > 448: const bool is_syncronized_entry = chunk->first()->method()->is_synchronized() && > 449: chunk->first()->raw_bci() == SynchronizationEntryBCI; nit typo: s/is_syncronized_entry/is_synchronized_entry/ src/hotspot/share/runtime/deoptimization.cpp line 450: > 448: const bool is_syncronized_entry = chunk->first()->method()->is_synchronized() && > 449: chunk->first()->raw_bci() == SynchronizationEntryBCI; > 450: // If deoptimizing from monitorenter bytecode we maybe in transitional state. Skip verification.
nit typo: s/we maybe/we may be/ nit typo: s/in transitional state/is a transitional state/ src/hotspot/share/runtime/deoptimization.cpp line 451: > 449: chunk->first()->raw_bci() == SynchronizationEntryBCI; > 450: // If deoptimizing from monitorenter bytecode we maybe in transitional state. Skip verification. > 451: // When reexecuting the current bc, the previous bc may not have finished yet. Should this: `... the previous bc may not have finished yet.` be: `... the previous monitorenter bc may not have finished yet.` ------------- PR Review: https://git.openjdk.org/jdk/pull/18782#pullrequestreview-2006612177 PR Review Comment: https://git.openjdk.org/jdk/pull/18782#discussion_r1569166375 PR Review Comment: https://git.openjdk.org/jdk/pull/18782#discussion_r1569159605 PR Review Comment: https://git.openjdk.org/jdk/pull/18782#discussion_r1569160377 PR Review Comment: https://git.openjdk.org/jdk/pull/18782#discussion_r1569168751 From duke at openjdk.org Wed Apr 17 17:04:11 2024 From: duke at openjdk.org (duke) Date: Wed, 17 Apr 2024 17:04:11 GMT Subject: Withdrawn: 8323497: On x64, use 32-bit immediate moves for narrow klass base if possible In-Reply-To: References: Message-ID: <9NZ2wa6ysxniyg-ZJlcRgsW4xM5C_3KTfBhFEoTw428=.7be68161-265a-4dbe-b859-6223b50320a8@github.com> On Wed, 10 Jan 2024 09:09:50 GMT, Thomas Stuefe wrote: > On x64, we always use the long form of mov immediate to load the klass base into a register. If the klass base fits into 32 bits, we could use the short form and save four instruction bytes. 
> > Before: mov uses 10 instruction bytes: > > > 35 ;; decode_klass_not_null > 36 0x00007f8b089e51c4: movabs $0x82000000,%r11 > 37 0x00007f8b089e51ce: add %r11,%r10 > > > Now: mov uses 6 instruction bytes: > > > 35 ;; decode_klass_not_null > 36 0x00007fbe609e51c4: mov $0x82000000,%r11d > 37 0x00007fbe609e51ca: add %r11,%r10 > > > Note that this optimization does not depend on zero-based addressing, and therefore we change class space reservation: we now always look in low-address regions first. > > ---------- > > Tests: tier1 (GHA), tier 2 on x64 linux This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/17340 From cslucas at openjdk.org Wed Apr 17 17:19:01 2024 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Wed, 17 Apr 2024 17:19:01 GMT Subject: RFR: 8329433: Reduce nmethod header size [v6] In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 00:56:33 GMT, Vladimir Kozlov wrote: >> This is part of changes which try to reduce size of `nmethod` and `codeblob` data vs code in CodeCache. >> These changes reduced size of `nmethod` header from 288 to 232 bytes. From 304 to 248 in optimized VM: >> >> Statistics for 1282 bytecoded nmethods for C2: >> total in heap = 5560352 (100%) >> header = 389728 (7.009053%) >> >> vs >> >> Statistics for 1322 bytecoded nmethods for C2: >> total in heap = 8307120 (100%) >> header = 327856 (3.946687%) >> >> >> Several unneeded fields in `nmethod` and `CodeBlob` were removed. Some fields were changed from `int` to `int16_t` with added corresponding asserts to make sure their values are fit into 16 bits. >> >> I did additional cleanup after recent `CompiledMethod` removal. >> >> Tested tier1-7,stress,xcomp and performance testing. 
> > Vladimir Kozlov has updated the pull request incrementally with two additional commits since the last revision: > > - remove trailing space > - Shuffle fields initialization src/hotspot/share/code/nmethod.hpp line 259: > 257: int _orig_pc_offset; > 258: > 259: int _compile_id; // which compilation made this nmethod NIT: are these fields always needed? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1569185473 From kvn at openjdk.org Wed Apr 17 17:53:08 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 17 Apr 2024 17:53:08 GMT Subject: RFR: 8329433: Reduce nmethod header size [v6] In-Reply-To: References: Message-ID: <9FJAaJ67TpV0_rGDkLEmyJf3I3K7yF61rapAIg6Znrk=.fbd4cb50-833e-4588-8463-69d2c38fb48e@github.com> On Wed, 17 Apr 2024 17:10:50 GMT, Cesar Soares Lucas wrote: >> Vladimir Kozlov has updated the pull request incrementally with two additional commits since the last revision: >> >> - remove trailing space >> - Shuffle fields initialization > > src/hotspot/share/code/nmethod.hpp line 259: > >> 257: int _orig_pc_offset; >> 258: >> 259: int _compile_id; // which compilation made this nmethod > > NIT: are these fields always needed? Yes, they are needed for debugging issues. They are important for error reporting, logs and events recording. And they do not take much space: CompLevel and CompilerType are one byte size. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1569237659 From cslucas at openjdk.org Wed Apr 17 18:06:02 2024 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Wed, 17 Apr 2024 18:06:02 GMT Subject: RFR: 8329728: Read long lines in ClassListParser [v5] In-Reply-To: References: Message-ID: On Wed, 10 Apr 2024 17:54:25 GMT, Ioi Lam wrote: >> Today the `ClassListParser` has a hard-coded limit of 4096 chars for each line in the CDS class list file. 
However, it's possible for a line to be much longer than that (64KB for the class name, plus extra information that can include path names, IDs, etc). >> >> I wrote a utility class `LineReader` that automatically allocates a buffer before calling `fgets()`. Hopefully this can be useful for other cases where we call `fgets()` with a fixed buffer size. >> >> Max line width is limited to 4M to simplify testing (and avoid running into corner cases when we approach INT_MAX). > > Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: > > - Merge branch 'master' into 8329728-read-arbitrary-long-lines-in-class-list-parser > - @dholmes-ora and @calvinccheung comments > - Check class name for valid UTF8 encoding > - @matias9927 and @calvinccheung comments - limit line to 4M. Added gtest cases. Test for class names > 64K > - 8329728: Read arbitrarily long lines in ClassListParser src/hotspot/share/utilities/lineReader.cpp line 44: > 42: } > 43: > 44: void LineReader::init(FILE* file) { NIT: why create this `init` method with the same parameters and behavior as one of the constructors? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18669#discussion_r1569256819 From heidinga at openjdk.org Wed Apr 17 18:46:51 2024 From: heidinga at openjdk.org (Dan Heidinga) Date: Wed, 17 Apr 2024 18:46:51 GMT Subject: RFR: 8320522: Remove code related to `RegisterFinalizersAtInit` Message-ID: Remove the code related to -XX:[+-]RegisterFinalizersAtInit in JDK23.
------------- Commit messages: - 8320522: Remove code related to `RegisterFinalizersAtInit` Changes: https://git.openjdk.org/jdk/pull/18823/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18823&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8320522 Stats: 16 lines in 8 files changed: 0 ins; 11 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/18823.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18823/head:pull/18823 PR: https://git.openjdk.org/jdk/pull/18823 From matsaave at openjdk.org Wed Apr 17 18:53:25 2024 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Wed, 17 Apr 2024 18:53:25 GMT Subject: RFR: 8330388: Remove invokedynamic cache index encoding Message-ID: Before [JDK-8307190](https://bugs.openjdk.org/browse/JDK-8307190), [JDK-8309673](https://bugs.openjdk.org/browse/JDK-8309673), and [JDK-8301995](https://bugs.openjdk.org/browse/JDK-8301995), invokedynamic operands needed to be rewritten to encoded values to better distinguish indy entries from other cp cache entries. The above changes now distinguish between entries with `to_cp_index()` using the bytecode, which is now propagated by the callers. The encoding flips the bits of the index so the encoded index is always negative, leading to access errors if there is no matching decode call. These calls are removed with some methods adjusted to distinguish between indices with the bytecode. Verified with tier 1-5 tests. 
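For readers who have not seen this kind of encoding, here is a minimal sketch of the bit-flip idea described above, assuming the encoding is a plain bitwise complement. The helper names are hypothetical and are not the HotSpot declarations; the sketch only shows why an encoded index is always negative and why a missing decode call produces an out-of-range access.

```c++
#include <cassert>

// Hypothetical helpers (not HotSpot code) illustrating a bit-flip encoding.
// Complementing a non-negative int always yields a negative int, so the sign
// alone tells an encoded index apart from a plain one.
inline int encode_indy_index(int index) {
  assert(index >= 0 && "only non-negative indices can be encoded");
  return ~index;
}

inline bool is_encoded_indy_index(int value) {
  return value < 0;
}

inline int decode_indy_index(int encoded) {
  assert(is_encoded_indy_index(encoded) && "not an encoded index");
  return ~encoded;
}
```

Using an encoded value directly as an array index without the matching decode step would read at a negative offset, which is the class of error that goes away once the encoding is removed.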
------------- Commit messages: - 8330388: Remove invokedynamic cache index encoding Changes: https://git.openjdk.org/jdk/pull/18819/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18819&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8330388 Stats: 220 lines in 37 files changed: 15 ins; 136 del; 69 mod Patch: https://git.openjdk.org/jdk/pull/18819.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18819/head:pull/18819 PR: https://git.openjdk.org/jdk/pull/18819 From ccheung at openjdk.org Wed Apr 17 18:58:09 2024 From: ccheung at openjdk.org (Calvin Cheung) Date: Wed, 17 Apr 2024 18:58:09 GMT Subject: RFR: 8330198: Add some class loading related perf counters to measure VM startup Message-ID: Adding a few perf counters related to class loading to measure VM startup. The counters are only active if the user specifies `-Xlog:init` in the command line. A diagnostic flag `ProfileClassLinkage` is added to control the new counters. The flag is set to false by default and will be enabled if `-Xlog:init` is specified. This change is already in the leyden/premain branch. There are more counters in the branch to measure other stuff. For now, just upstreaming class loader related counters. Refer to the [comment](https://bugs.openjdk.org/browse/JDK-8330198?focusedId=14665311&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14665311) in the bug report for an example output. Passed tiers 1 - 4 testing. 
------------- Commit messages: - fix build issues on macos-x64 and -aarch64 - Merge branch 'master' into xloginit-classloading - fix linux-x86 and minimal build issues - 8330198: Add some class loading related perf counters to measure VM startup Changes: https://git.openjdk.org/jdk/pull/18790/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18790&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8330198 Stats: 179 lines in 15 files changed: 158 ins; 6 del; 15 mod Patch: https://git.openjdk.org/jdk/pull/18790.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18790/head:pull/18790 PR: https://git.openjdk.org/jdk/pull/18790 From coleenp at openjdk.org Wed Apr 17 20:11:59 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 17 Apr 2024 20:11:59 GMT Subject: RFR: 8320522: Remove code related to `RegisterFinalizersAtInit` In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 18:42:29 GMT, Dan Heidinga wrote: > Remove the code related to -XX:[+-]RegisterFinalizersAtInit in JDK23. > > `make test-tier1` passed with this change src/hotspot/share/classfile/classFileParser.cpp line 4191: > 4189: // See documentation of InstanceKlass::can_be_fastpath_allocated(). > 4190: assert(ik->size_helper() > 0, "layout_helper is initialized"); > 4191: if (ik->is_abstract() || ik->is_interface() TIL what this bit was. There are some comments in the interpreters that claim that the bit will be set if there are finalizers, which is no longer true. The comments look like: // test to see if it has a finalizer or is malformed in some way src/hotspot/share/oops/instanceKlass.cpp line 1517: > 1515: instanceOop i; > 1516: > 1517: i = (instanceOop)Universe::heap()->obj_allocate(this, size, CHECK_NULL); You could change to return without 'i'. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18823#discussion_r1569459880 PR Review Comment: https://git.openjdk.org/jdk/pull/18823#discussion_r1569420317 From kvn at openjdk.org Wed Apr 17 20:21:49 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 17 Apr 2024 20:21:49 GMT Subject: RFR: 8325469: Freeze/Thaw code can crash in the presence of OSR frames [v3] In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 16:17:39 GMT, Patricio Chilano Mateo wrote: >> It may be hard to do a proper measurement because the number of methods in our microbenchmarks is small. We're also talking an extra branch, I think. This is code than can be called a million times per second per core. It's very performance sensitive. So I would prefer to first see if there's an impact on nmethod size, and only if there is consider whether the speed implications are acceptable. > > Thanks for the reviews @pron and @dean-long! Hi @pchilano This change did affect my PR which try to reduce `nmethod` header size [#18768](https://github.com/openjdk/jdk/pull/18768). I am fine with caching the value in `nmethod` but why you used `int` field for it? It is `u2` in [constMethod.hpp#L209](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/oops/constMethod.hpp#L209). I am currently resolving conflict in my PR with your changes and I am planning to use `u2` for it in `nmethod` too. Are you okay with that? ------------- PR Comment: https://git.openjdk.org/jdk/pull/18637#issuecomment-2062147495 From heidinga at openjdk.org Wed Apr 17 20:30:27 2024 From: heidinga at openjdk.org (Dan Heidinga) Date: Wed, 17 Apr 2024 20:30:27 GMT Subject: RFR: 8320522: Remove code related to `RegisterFinalizersAtInit` [v2] In-Reply-To: References: Message-ID: > Remove the code related to -XX:[+-]RegisterFinalizersAtInit in JDK23. 
> > `make test-tier1` passed with this change Dan Heidinga has updated the pull request incrementally with one additional commit since the last revision: Address review comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18823/files - new: https://git.openjdk.org/jdk/pull/18823/files/93e154e8..b370cd1f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18823&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18823&range=00-01 Stats: 11 lines in 5 files changed: 0 ins; 5 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/18823.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18823/head:pull/18823 PR: https://git.openjdk.org/jdk/pull/18823 From dlong at openjdk.org Wed Apr 17 20:31:03 2024 From: dlong at openjdk.org (Dean Long) Date: Wed, 17 Apr 2024 20:31:03 GMT Subject: RFR: 8329433: Reduce nmethod header size [v3] In-Reply-To: References: Message-ID: On Tue, 16 Apr 2024 16:09:21 GMT, Vladimir Kozlov wrote: >> src/hotspot/share/code/nmethod.cpp line 1441: >> >>> 1439: int deps_size = align_up((int)dependencies->size_in_bytes(), oopSize); >>> 1440: int sum_size = oops_size + metadata_size + deps_size; >>> 1441: assert((sum_size >> 16) == 0, "data size is bigger than 64Kb: %d", sum_size); >> >> I suggest using checked_cast for the assignment below, rather than special-purpose checks here. > > Okay. But I will put above code under `#ifdef ASSERT` then. The ASSERT block above looks unnecessary, now that field assignments below are using checked_cast. 
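As an illustration of why a checked cast at the assignment can replace a separate range assert, here is a minimal sketch of a checked narrowing cast. It is an assumption-laden stand-in: `checked_narrow` and `HeaderSketch` are hypothetical names, and this is not the HotSpot `checked_cast` definition.

```c++
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a checked narrowing cast: convert, then assert
// that converting back reproduces the original value. In a release build
// (NDEBUG) the check compiles away and only the static_cast remains.
template <typename To, typename From>
To checked_narrow(From value) {
  To result = static_cast<To>(value);
  assert(static_cast<From>(result) == value && "value does not fit in target type");
  return result;
}

// Hypothetical header with a 16-bit size field initialized from an int.
struct HeaderSketch {
  uint16_t data_size;
  explicit HeaderSketch(int size) : data_size(checked_narrow<uint16_t>(size)) {}
};
```

With this shape, each narrowed field carries its own check at the point of assignment, so a summed-size assert placed before the assignments adds little.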
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1569496753 From pchilanomate at openjdk.org Wed Apr 17 20:36:03 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 17 Apr 2024 20:36:03 GMT Subject: RFR: 8325469: Freeze/Thaw code can crash in the presence of OSR frames [v3] In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 16:17:39 GMT, Patricio Chilano Mateo wrote: >> It may be hard to do a proper measurement because the number of methods in our microbenchmarks is small. We're also talking an extra branch, I think. This is code than can be called a million times per second per core. It's very performance sensitive. So I would prefer to first see if there's an impact on nmethod size, and only if there is consider whether the speed implications are acceptable. > > Thanks for the reviews @pron and @dean-long! > Hi @pchilano > > This change did affect my PR which try to reduce `nmethod` header size [#18768](https://github.com/openjdk/jdk/pull/18768). > > I am fine with caching the value in `nmethod` but why you used `int` field for it? It is `u2` in [constMethod.hpp#L209](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/oops/constMethod.hpp#L209). > > I am currently resolving conflict in my PR with your changes and I am planning to use `u2` for it in `nmethod` too. Are you okay with that? > Yes. I just used int because that was the return value of num_stack_arg_slots() that I moved from method.hpp, but I missed the field can just be defined as a u2 instead. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18637#issuecomment-2062199420 From heidinga at openjdk.org Wed Apr 17 20:38:34 2024 From: heidinga at openjdk.org (Dan Heidinga) Date: Wed, 17 Apr 2024 20:38:34 GMT Subject: RFR: 8320522: Remove code related to `RegisterFinalizersAtInit` [v3] In-Reply-To: References: Message-ID: > Remove the code related to -XX:[+-]RegisterFinalizersAtInit in JDK23. 
> > `make test-tier1` passed with this change Dan Heidinga has updated the pull request incrementally with one additional commit since the last revision: Update ppc template comment too ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18823/files - new: https://git.openjdk.org/jdk/pull/18823/files/b370cd1f..eb0ba81f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18823&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18823&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18823.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18823/head:pull/18823 PR: https://git.openjdk.org/jdk/pull/18823 From heidinga at openjdk.org Wed Apr 17 20:38:34 2024 From: heidinga at openjdk.org (Dan Heidinga) Date: Wed, 17 Apr 2024 20:38:34 GMT Subject: RFR: 8320522: Remove code related to `RegisterFinalizersAtInit` [v3] In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 20:08:02 GMT, Coleen Phillimore wrote: >> Dan Heidinga has updated the pull request incrementally with one additional commit since the last revision: >> >> Update ppc template comment too > > src/hotspot/share/classfile/classFileParser.cpp line 4191: > >> 4189: // See documentation of InstanceKlass::can_be_fastpath_allocated(). >> 4190: assert(ik->size_helper() > 0, "layout_helper is initialized"); >> 4191: if (ik->is_abstract() || ik->is_interface() > > TIL what this bit was. There are some comments in the interpreters that claim that the bit will be set if there are finalizers, which is no longer true. The comments look like: > > // test to see if it has a finalizer or is malformed in some way Thanks. I never would have found that! 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18823#discussion_r1569506975 From ihse at openjdk.org Wed Apr 17 20:53:03 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Wed, 17 Apr 2024 20:53:03 GMT Subject: RFR: 8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: <6aR5nvKhz28A1CkxtaAD9CwTjILBjwZrrRwP3988oEc=.72203104-2ae5-40ff-bd87-168b684446e6@github.com> References: <-XeYeJ0OEmauTYsEoSXxzRmQXSKMOLw87GSpqDnEmug=.5cb7e71f-fea6-4a84-8260-5f515d3d3810@github.com> <18WjPZeDIWkxGIB0BJgyDg5VipCtY4EOlWmIGPWZGCw=.b50cf4a9-61a4-421e-97eb-3dbac94c14f9@github.com> <_xcaF7UUDHA11loD89Dz871vAQgRqMzCdPkahFDfKv8=.a2c6dcbe-5942-4fb7-9d8b-4239ea048e56@github.com> <76P7uKTuqo7IKYr5yBP4Vx1SS0AcEXC_6vDAU6LfIzo=.d939556f-6fab-4009-820b-821376bfdb7c@github.com> <6aR5nvKhz28A1CkxtaAD9CwTjILBjwZrrRwP3988oEc=.72203104-2ae5-40ff-bd87-168b684446e6@github.com> Message-ID: On Wed, 17 Apr 2024 15:09:50 GMT, Kim Barrett wrote: >> https://man7.org/linux/man-pages/man3/alloca.3.html sounds like solution 2 is the cleanest one ("standards conformance"). It is also the version with minimal code and which will even work with future alloca usages :-) >> If solution 2 has any disadvantage, I'd prefer solution 3. > > I'm aware of this discussion and looking into the issues, but a personal matter has intervened and it will take > me a while to respond properly. Maybe next week. I opened https://bugs.openjdk.org/browse/JDK-8330539 so we don't lose track of this, but we can keep the discussion/voting here.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1569528717 From kvn at openjdk.org Wed Apr 17 21:04:02 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 17 Apr 2024 21:04:02 GMT Subject: RFR: 8325469: Freeze/Thaw code can crash in the presence of OSR frames [v3] In-Reply-To: References: Message-ID: <-QkgHg1o08-E48-4UzcfzNfbJX41RNvgbncchkM90no=.069533f5-5bfd-4db0-b4f3-633567baf094@github.com> On Wed, 17 Apr 2024 20:32:37 GMT, Patricio Chilano Mateo wrote: > Yes. I just used int because that was the return value of num_stack_arg_slots() that I moved from method.hpp, but I missed the field can just be defined as a u2 instead. Okay. Thanks! ------------- PR Comment: https://git.openjdk.org/jdk/pull/18637#issuecomment-2062311988 From cjplummer at openjdk.org Wed Apr 17 21:16:11 2024 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 17 Apr 2024 21:16:11 GMT Subject: RFR: 8330388: Remove invokedynamic cache index encoding In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 15:26:52 GMT, Matias Saavedra Silva wrote: > Before [JDK-8307190](https://bugs.openjdk.org/browse/JDK-8307190), [JDK-8309673](https://bugs.openjdk.org/browse/JDK-8309673), and [JDK-8301995](https://bugs.openjdk.org/browse/JDK-8301995), invokedynamic operands needed to be rewritten to encoded values to better distinguish indy entries from other cp cache entries. The above changes now distinguish between entries with `to_cp_index()` using the bytecode, which is now propagated by the callers. > > The encoding flips the bits of the index so the encoded index is always negative, leading to access errors if there is no matching decode call. These calls are removed with some methods adjusted to distinguish between indices with the bytecode. Verified with tier 1-5 tests. SA changes look good. ------------- Marked as reviewed by cjplummer (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18819#pullrequestreview-2007176260 From kvn at openjdk.org Wed Apr 17 21:16:48 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 17 Apr 2024 21:16:48 GMT Subject: RFR: 8329433: Reduce nmethod header size [v3] In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 20:27:53 GMT, Dean Long wrote: >> Okay. But I will put above code under `#ifdef ASSERT` then. > > The ASSERT block above looks unnecessary, now that field assignments below are using checked_cast. Agree, but I need to change how I use checked_cast below to get the same check as above. I will do it in next update: _dependencies_offset = _metadata_offset + checked_cast(align_up(code_buffer->total_metadata_size(), wordSize)); _scopes_pcs_offset = _dependencies_offset + checked_cast(align_up((int)dependencies->size_in_bytes(), oopSize)); --- _dependencies_offset = checked_cast(_metadata_offset + align_up(code_buffer->total_metadata_size(), wordSize)); _scopes_pcs_offset = checked_cast(_dependencies_offset + align_up((int)dependencies->size_in_bytes(), oopSize)); ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18768#discussion_r1569563003 From iklam at openjdk.org Wed Apr 17 21:35:22 2024 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 17 Apr 2024 21:35:22 GMT Subject: RFR: 8330540: Rename the enum type CompileCommand to CompileCommandEnum Message-ID: `CompileCommand` is used both as a enum type ([compilerOracle.hpp](https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/compiler/compilerOracle.hpp#L104)), and a global variable ([compiler_globals.hpp](https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/compiler/compiler_globals.hpp#L304)). 
This makes the enum type very awkward to use -- we are forced to use `enum CompileCommand` in the source code whenever a type is needed: This simple C++ file illustrates the problem: enum class CompileCommand { a, b, c }; void foo(CompileCommand x) {} char* CompileCommand; // can no longer use "CompileCommand" as a type void good(enum CompileCommand x) {} void bad(CompileCommand x) {} $ g++ -c ~/enum.cpp /home/iklam/enum.cpp:5:6: error: variable or field ‘bad’ declared void 5 | void bad(CompileCommand x) {} The fix is to rename the enum type to `CompileCommandEnum`. This also makes it possible to forward-declare `CompileCommandEnum` (see vmEnum.hpp) without including compilerOracle.hpp. This improves HotSpot build time by reducing the number of .o files that include compilerOracle.hpp from 456 to 16. ------------- Commit messages: - 8330540: Rename the enum type CompileCommand to CompileCommandEnum Changes: https://git.openjdk.org/jdk/pull/18829/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18829&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8330540 Stats: 138 lines in 19 files changed: 11 ins; 4 del; 123 mod Patch: https://git.openjdk.org/jdk/pull/18829.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18829/head:pull/18829 PR: https://git.openjdk.org/jdk/pull/18829 From kvn at openjdk.org Wed Apr 17 21:52:56 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 17 Apr 2024 21:52:56 GMT Subject: RFR: 8330540: Rename the enum type CompileCommand to CompileCommandEnum In-Reply-To: References: Message-ID: <2Ua-oEbPXPKGGaenm51PzUhknCYhXkGcPoI-w_tGQqQ=.5377a48a-0e44-4f28-9ce9-c2209ada4f87@github.com> On Wed, 17 Apr 2024 21:26:06 GMT, Ioi Lam wrote: > `CompileCommand` is used both as an enum type ([compilerOracle.hpp](https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/compiler/compilerOracle.hpp#L104)), and a global variable
([compiler_globals.hpp](https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/compiler/compiler_globals.hpp#L304)). > > This makes very awkward to the enum type -- we are forced to use `enum CompileCommand` in the source code whenever a type is needed: > > This simple c++ file illustrates the problem: > > enum class CompileCommand { a, b, c }; > void foo(CompileCommand x) {} > char* CompileCommand; // can no longer use "CompileCommand" as a type > void good(enum CompileCommand x) {} > void bad(CompileCommand x) {} > > $ g++ -c ~/enum.cpp > /home/iklam/enum.cpp:5:6: error: variable or field ?bad? declared void > 5 | void bad(CompileCommand x) {} > > > The fix is to rename the enum type to `CompileCommandEnum`. > > This also makes it possible to forward-declare `CompileCommandEnum` (see vmEnum.hpp) without including compilerOracle.hpp. This improves HotSpot build time by reducing the number of .o files that include compilerOracle.hpp from 456 to 16. src/hotspot/share/jvmci/jvmciCompilerToVM.cpp line 34: > 32: #include "compiler/compileBroker.hpp" > 33: #include "compiler/compilerEvent.hpp" > 34: #include "compiler/compilerOracle.hpp" Why you need it here? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18829#discussion_r1569584842 From coleenp at openjdk.org Wed Apr 17 22:20:58 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 17 Apr 2024 22:20:58 GMT Subject: RFR: 8320522: Remove code related to `RegisterFinalizersAtInit` [v3] In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 20:38:34 GMT, Dan Heidinga wrote: >> Remove the code related to -XX:[+-]RegisterFinalizersAtInit in JDK23. >> >> `make test-tier1` passed with this change > > Dan Heidinga has updated the pull request incrementally with one additional commit since the last revision: > > Update ppc template comment too Looks good! abstractInterpreter.cpp needs a copyright update. 
------------- Marked as reviewed by coleenp (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18823#pullrequestreview-2007385274 From kvn at openjdk.org Wed Apr 17 22:23:47 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 17 Apr 2024 22:23:47 GMT Subject: RFR: 8329433: Reduce nmethod header size [v7] In-Reply-To: References: Message-ID: > This is part of changes which try to reduce size of `nmethod` and `codeblob` data vs code in CodeCache. > These changes reduced size of `nmethod` header from 288 to 232 bytes. From 304 to 248 in optimized VM: > > Statistics for 1282 bytecoded nmethods for C2: > total in heap = 5560352 (100%) > header = 389728 (7.009053%) > > vs > > Statistics for 1322 bytecoded nmethods for C2: > total in heap = 8307120 (100%) > header = 327856 (3.946687%) > > > Several unneeded fields in `nmethod` and `CodeBlob` were removed. Some fields were changed from `int` to `int16_t` with added corresponding asserts to make sure their values are fit into 16 bits. > > I did additional cleanup after recent `CompiledMethod` removal. > > Tested tier1-7,stress,xcomp and performance testing. Vladimir Kozlov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains eight commits: - Merge master - remove trailing space - Shuffle fields initialization - Address comments. Used checked_cast. 
- Use 16-bits types for header_size and frame_complete_offset arguments - Union fields which usages do not overlap - Moved some fields initialization into init_defaults() - 8329433: Reduce nmethod header size ------------- Changes: https://git.openjdk.org/jdk/pull/18768/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18768&range=06 Stats: 528 lines in 15 files changed: 140 ins; 178 del; 210 mod Patch: https://git.openjdk.org/jdk/pull/18768.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18768/head:pull/18768 PR: https://git.openjdk.org/jdk/pull/18768 From iklam at openjdk.org Wed Apr 17 22:37:00 2024 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 17 Apr 2024 22:37:00 GMT Subject: RFR: 8330540: Rename the enum type CompileCommand to CompileCommandEnum In-Reply-To: <2Ua-oEbPXPKGGaenm51PzUhknCYhXkGcPoI-w_tGQqQ=.5377a48a-0e44-4f28-9ce9-c2209ada4f87@github.com> References: <2Ua-oEbPXPKGGaenm51PzUhknCYhXkGcPoI-w_tGQqQ=.5377a48a-0e44-4f28-9ce9-c2209ada4f87@github.com> Message-ID: On Wed, 17 Apr 2024 21:40:49 GMT, Vladimir Kozlov wrote: >> `CompileCommand` is used both as a enum type ([compilerOracle.hpp](https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/compiler/compilerOracle.hpp#L104)), and a global variable ([compiler_globals.hpp](https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/compiler/compiler_globals.hpp#L304)). >> >> This makes very awkward to the enum type -- we are forced to use `enum CompileCommand` in the source code whenever a type is needed: >> >> This simple c++ file illustrates the problem: >> >> enum class CompileCommand { a, b, c }; >> void foo(CompileCommand x) {} >> char* CompileCommand; // can no longer use "CompileCommand" as a type >> void good(enum CompileCommand x) {} >> void bad(CompileCommand x) {} >> >> $ g++ -c ~/enum.cpp >> /home/iklam/enum.cpp:5:6: error: variable or field ?bad? 
declared void >> 5 | void bad(CompileCommand x) {} >> >> >> The fix is to rename the enum type to `CompileCommandEnum`. >> >> This also makes it possible to forward-declare `CompileCommandEnum` (see vmEnum.hpp) without including compilerOracle.hpp. This improves HotSpot build time by reducing the number of .o files that include compilerOracle.hpp from 456 to 16. > > src/hotspot/share/jvmci/jvmciCompilerToVM.cpp line 34: > >> 32: #include "compiler/compileBroker.hpp" >> 33: #include "compiler/compilerEvent.hpp" >> 34: #include "compiler/compilerOracle.hpp" > > Why you need it here? It's needed by this: jvmciCompilerToVM.cpp:571:21: error: 'CompilerOracle' has not been declared 571 | return !Inline || CompilerOracle::should_not_inline(method) || method->dont_inline(); ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18829#discussion_r1569621139 From dlong at openjdk.org Wed Apr 17 22:47:56 2024 From: dlong at openjdk.org (Dean Long) Date: Wed, 17 Apr 2024 22:47:56 GMT Subject: RFR: 8330388: Remove invokedynamic cache index encoding In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 15:26:52 GMT, Matias Saavedra Silva wrote: > Before [JDK-8307190](https://bugs.openjdk.org/browse/JDK-8307190), [JDK-8309673](https://bugs.openjdk.org/browse/JDK-8309673), and [JDK-8301995](https://bugs.openjdk.org/browse/JDK-8301995), invokedynamic operands needed to be rewritten to encoded values to better distinguish indy entries from other cp cache entries. The above changes now distinguish between entries with `to_cp_index()` using the bytecode, which is now propagated by the callers. > > The encoding flips the bits of the index so the encoded index is always negative, leading to access errors if there is no matching decode call. These calls are removed with some methods adjusted to distinguish between indices with the bytecode. Verified with tier 1-5 tests. 
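As a side note on the encoding described above: flipping every bit of a non-negative index always yields a negative value, which is why an encoded index that is used without a matching decode call leads to an access error. The sketch below is only a minimal illustration of that arithmetic, not the HotSpot code; the helper names `encode_index`, `decode_index`, and `is_encoded` are hypothetical and chosen for this example.

```cpp
// Hypothetical helper names, for illustration only (not the HotSpot API).

// Encoding flips every bit, so any non-negative index becomes negative.
constexpr int encode_index(int i) { return ~i; }

// Flipping the bits a second time recovers the original index.
constexpr int decode_index(int i) { return ~i; }

// An index counts as "encoded" in this scheme when it is negative.
constexpr bool is_encoded(int i) { return i < 0; }

// Compile-time checks of the properties described above.
static_assert(encode_index(5) == -6, "bitwise NOT of 5 is -6");
static_assert(is_encoded(encode_index(0)), "even index 0 encodes to a negative value");
static_assert(decode_index(encode_index(5)) == 5, "decoding undoes encoding");
```

Because the sign alone marks an index as encoded, every consumer has to remember to decode before using the value as an array index. Distinguishing entries by bytecode instead, as the description says, removes that requirement.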
src/hotspot/share/ci/ciEnv.cpp line 1513: > 1511: // process the BSM > 1512: int pool_index = indy_info->constant_pool_index(); > 1513: BootstrapInfo bootstrap_specifier(cp, pool_index, indy_index); Why not just change the incoming parameter name to `index`? src/hotspot/share/classfile/resolutionErrors.hpp line 60: > 58: > 59: // This function is used to encode an invokedynamic index to differentiate it from a > 60: // constant pool index. It assumes it is being called with a index that is less than 0 Is this comment still correct? src/hotspot/share/interpreter/bootstrapInfo.cpp line 77: > 75: return true; > 76: } else if (indy_entry->resolution_failed()) { > 77: int encoded_index = ResolutionErrorTable::encode_indy_index(_indy_index); That's an improvement, from two levels of encoding to only one! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18819#discussion_r1569625123 PR Review Comment: https://git.openjdk.org/jdk/pull/18819#discussion_r1569626519 PR Review Comment: https://git.openjdk.org/jdk/pull/18819#discussion_r1569627219 From dlong at openjdk.org Wed Apr 17 22:51:07 2024 From: dlong at openjdk.org (Dean Long) Date: Wed, 17 Apr 2024 22:51:07 GMT Subject: RFR: 8330388: Remove invokedynamic cache index encoding In-Reply-To: References: Message-ID: <-5i_BDguO1qWOP0GnYK4pTeMMW4IhlV3LkqLPFs4vAw=.060c849a-de1a-4888-943e-80b9ed4eecf2@github.com> On Wed, 17 Apr 2024 15:26:52 GMT, Matias Saavedra Silva wrote: > Before [JDK-8307190](https://bugs.openjdk.org/browse/JDK-8307190), [JDK-8309673](https://bugs.openjdk.org/browse/JDK-8309673), and [JDK-8301995](https://bugs.openjdk.org/browse/JDK-8301995), invokedynamic operands needed to be rewritten to encoded values to better distinguish indy entries from other cp cache entries. The above changes now distinguish between entries with `to_cp_index()` using the bytecode, which is now propagated by the callers. 
> > The encoding flips the bits of the index so the encoded index is always negative, leading to access errors if there is no matching decode call. These calls are removed with some methods adjusted to distinguish between indices with the bytecode. Verified with tier 1-5 tests. Did you consider minimizing changes by leaving decode_invokedynamic_index/encode_invokedynamic_index calls in place, but having the implementations not change the value? ------------- PR Comment: https://git.openjdk.org/jdk/pull/18819#issuecomment-2062609288 From kvn at openjdk.org Wed Apr 17 22:53:01 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 17 Apr 2024 22:53:01 GMT Subject: RFR: 8330540: Rename the enum type CompileCommand to CompileCommandEnum In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 21:26:06 GMT, Ioi Lam wrote: > `CompileCommand` is used both as a enum type ([compilerOracle.hpp](https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/compiler/compilerOracle.hpp#L104)), and a global variable ([compiler_globals.hpp](https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/compiler/compiler_globals.hpp#L304)). > > This makes very awkward to the enum type -- we are forced to use `enum CompileCommand` in the source code whenever a type is needed: > > This simple c++ file illustrates the problem: > > enum class CompileCommand { a, b, c }; > void foo(CompileCommand x) {} > char* CompileCommand; // can no longer use "CompileCommand" as a type > void good(enum CompileCommand x) {} > void bad(CompileCommand x) {} > > $ g++ -c ~/enum.cpp > /home/iklam/enum.cpp:5:6: error: variable or field ?bad? declared void > 5 | void bad(CompileCommand x) {} > > > The fix is to rename the enum type to `CompileCommandEnum`. > > This also makes it possible to forward-declare `CompileCommandEnum` (see vmEnum.hpp) without including compilerOracle.hpp. 
This improves HotSpot build time by reducing the number of .o files that include compilerOracle.hpp from 456 to 16. Good. ------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18829#pullrequestreview-2007474309 From dlong at openjdk.org Wed Apr 17 23:00:56 2024 From: dlong at openjdk.org (Dean Long) Date: Wed, 17 Apr 2024 23:00:56 GMT Subject: RFR: 8330388: Remove invokedynamic cache index encoding In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 15:26:52 GMT, Matias Saavedra Silva wrote: > Before [JDK-8307190](https://bugs.openjdk.org/browse/JDK-8307190), [JDK-8309673](https://bugs.openjdk.org/browse/JDK-8309673), and [JDK-8301995](https://bugs.openjdk.org/browse/JDK-8301995), invokedynamic operands needed to be rewritten to encoded values to better distinguish indy entries from other cp cache entries. The above changes now distinguish between entries with `to_cp_index()` using the bytecode, which is now propagated by the callers. > > The encoding flips the bits of the index so the encoded index is always negative, leading to access errors if there is no matching decode call. These calls are removed with some methods adjusted to distinguish between indices with the bytecode. Verified with tier 1-5 tests. @dougxc should check JVMCI changes. ------------- Marked as reviewed by dlong (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18819#pullrequestreview-2007498087 From kvn at openjdk.org Thu Apr 18 00:41:03 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 18 Apr 2024 00:41:03 GMT Subject: RFR: 8329433: Reduce nmethod header size [v8] In-Reply-To: References: Message-ID: > This is part of changes which try to reduce size of `nmethod` and `codeblob` data vs code in CodeCache. > These changes reduced size of `nmethod` header from 288 to 232 bytes. 
From 304 to 248 in optimized VM: > > Statistics for 1282 bytecoded nmethods for C2: > total in heap = 5560352 (100%) > header = 389728 (7.009053%) > > vs > > Statistics for 1322 bytecoded nmethods for C2: > total in heap = 8307120 (100%) > header = 327856 (3.946687%) > > > Several unneeded fields in `nmethod` and `CodeBlob` were removed. Some fields were changed from `int` to `int16_t` with added corresponding asserts to make sure their values are fit into 16 bits. > > I did additional cleanup after recent `CompiledMethod` removal. > > Tested tier1-7,stress,xcomp and performance testing. Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: Address comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18768/files - new: https://git.openjdk.org/jdk/pull/18768/files/adc17594..5f5f30de Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18768&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18768&range=06-07 Stats: 16 lines in 1 file changed: 0 ins; 13 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/18768.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18768/head:pull/18768 PR: https://git.openjdk.org/jdk/pull/18768 From kvn at openjdk.org Thu Apr 18 00:41:05 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 18 Apr 2024 00:41:05 GMT Subject: RFR: 8329433: Reduce nmethod header size [v7] In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 22:23:47 GMT, Vladimir Kozlov wrote: >> This is part of changes which try to reduce size of `nmethod` and `codeblob` data vs code in CodeCache. >> These changes reduced size of `nmethod` header from 288 to 232 bytes. 
From 304 to 248 in optimized VM: >> >> Statistics for 1282 bytecoded nmethods for C2: >> total in heap = 5560352 (100%) >> header = 389728 (7.009053%) >> >> vs >> >> Statistics for 1322 bytecoded nmethods for C2: >> total in heap = 8307120 (100%) >> header = 327856 (3.946687%) >> >> >> Several unneeded fields in `nmethod` and `CodeBlob` were removed. Some fields were changed from `int` to `int16_t` with added corresponding asserts to make sure their values are fit into 16 bits. >> >> I did additional cleanup after recent `CompiledMethod` removal. >> >> Tested tier1-7,stress,xcomp and performance testing. > > Vladimir Kozlov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains eight commits: > > - Merge master > - remove trailing space > - Shuffle fields initialization > - Address comments. Used checked_cast. > - Use 16-bits types for header_size and frame_complete_offset arguments > - Union fields which usages do not overlap > - Moved some fields initialization into init_defaults() > - 8329433: Reduce nmethod header size Merge [#18637](https://github.com/openjdk/jdk/pull/18637) added an other `short` field `_num_stack_arg_slots` which pushed `nmethod` size back to 240 bytes in product VM. I will not do changes in **this** PR to compensate it. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18768#issuecomment-2062779812 From kvn at openjdk.org Thu Apr 18 00:43:00 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 18 Apr 2024 00:43:00 GMT Subject: RFR: 8329433: Reduce nmethod header size [v7] In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 22:23:47 GMT, Vladimir Kozlov wrote: >> This is part of changes which try to reduce size of `nmethod` and `codeblob` data vs code in CodeCache. >> These changes reduced size of `nmethod` header from 288 to 232 bytes. 
From 304 to 248 in optimized VM: >> >> Statistics for 1282 bytecoded nmethods for C2: >> total in heap = 5560352 (100%) >> header = 389728 (7.009053%) >> >> vs >> >> Statistics for 1322 bytecoded nmethods for C2: >> total in heap = 8307120 (100%) >> header = 327856 (3.946687%) >> >> >> Several unneeded fields in `nmethod` and `CodeBlob` were removed. Some fields were changed from `int` to `int16_t` with added corresponding asserts to make sure their values are fit into 16 bits. >> >> I did additional cleanup after recent `CompiledMethod` removal. >> >> Tested tier1-7,stress,xcomp and performance testing. > > Vladimir Kozlov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains eight commits: > > - Merge master > - remove trailing space > - Shuffle fields initialization > - Address comments. Used checked_cast. > - Use 16-bits types for header_size and frame_complete_offset arguments > - Union fields which usages do not overlap > - Moved some fields initialization into init_defaults() > - 8329433: Reduce nmethod header size I remove `ASSERT` blocks to address the last @dean-long comment. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18768#issuecomment-2062784402 From dlong at openjdk.org Thu Apr 18 01:16:59 2024 From: dlong at openjdk.org (Dean Long) Date: Thu, 18 Apr 2024 01:16:59 GMT Subject: RFR: 8329433: Reduce nmethod header size [v8] In-Reply-To: References: Message-ID: <1HEEQR0o6XO22qTFgrqriq89NEmYyxw3khht6PWlu8U=.245c3378-cdcd-4085-a307-aab5186e6d6d@github.com> On Thu, 18 Apr 2024 00:41:03 GMT, Vladimir Kozlov wrote: >> This is part of changes which try to reduce size of `nmethod` and `codeblob` data vs code in CodeCache. >> These changes reduced size of `nmethod` header from 288 to 232 bytes. 
From 304 to 248 in optimized VM: >> >> Statistics for 1282 bytecoded nmethods for C2: >> total in heap = 5560352 (100%) >> header = 389728 (7.009053%) >> >> vs >> >> Statistics for 1322 bytecoded nmethods for C2: >> total in heap = 8307120 (100%) >> header = 327856 (3.946687%) >> >> >> Several unneeded fields in `nmethod` and `CodeBlob` were removed. Some fields were changed from `int` to `int16_t` with added corresponding asserts to make sure their values are fit into 16 bits. >> >> I did additional cleanup after recent `CompiledMethod` removal. >> >> Tested tier1-7,stress,xcomp and performance testing. > > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Address comment Marked as reviewed by dlong (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18768#pullrequestreview-2007632424 From jinguojie.jgj at alibaba-inc.com Thu Apr 18 02:29:40 2024 From: jinguojie.jgj at alibaba-inc.com (Jin Guojie) Date: Thu, 18 Apr 2024 10:29:40 +0800 Subject: Re: Aarch64: CPU_Model support for Neoverse N1/N2/V1/V2 In-Reply-To: <45ADE631-EFCF-4319-94B6-130E324E5907@amazon.co.uk> References: <45ADE631-EFCF-4319-94B6-130E324E5907@amazon.co.uk> Message-ID: <56ae2a53-02d3-4c29-8b13-37172654b5a1.jinguojie.jgj@alibaba-inc.com> Hi Andrew, We wrote a patch to improve the definition of CPU models for Arm Neoverse. Evgeny thinks it's better to continue the review process. I submitted my OCA application 10 days ago, but it is still under review. Could you please create an issue in the JDK Bug System (JBS), so that I can submit this PR after the OCA is signed? Jin Guojie (Alibaba, hotspot developer) 2024/4/18 02:32. Astigeevich, Evgeny wrote: > I agree using enums will improve readability. > It's not been done to simplify backporting. > Could you please create a JBS issue and submit a PR?
> Evgeny > On 12/04/2024, 09:22, "Jin Guojie" > wrote: > Hi Evgeny, > Thanks for your great work in "8321025: Enable Neoverse N1 optimizations for Neoverse V2?. > I am currently optimizing the Aarch64 branch of hotspot. I found that there are also some constant numbers in this file vm_version_aarch64.cpp. > In order to make the programming style better, wouldn't it be better if we define these constants as macros? > Below is the code patch I wrote. Thank you for your opinion. > > Jin Guojie > > > From 2dd99c9851b0efbb3c9a8bdc95973f4646ad77c2 Mon Sep 17 00:00:00 2001From: Jin Guojie > > Date: Tue, 2 Apr 2024 09:06:04 +0800 > Subject: CPU_Model support for Neoverse N1/N2/V1/V2 > --- > src/hotspot/cpu/aarch64/vm_version_aarch64.cpp | 12 +++--------- > src/hotspot/cpu/aarch64/vm_version_aarch64.hpp | 7 +++++++ > 2 files changed, 10 insertions(+), 9 deletions(-) > diff --git a/src/hotspot/cpu/aarch64/vm_version_aarch64.cpp b/src/hotspot/cpu/aarch64/vm_version_aarch64.cpp > index 18f310c746c..732020a420f 100644 > --- a/src/hotspot/cpu/aarch64/vm_version_aarch64.cpp > +++ b/src/hotspot/cpu/aarch64/vm_version_aarch64.cpp > @@ -213,12 +213,8 @@ void VM_Version::initialize() { > } > > > // Neoverse > - // N1: 0xd0c > - // N2: 0xd49 > - // V1: 0xd40 > - // V2: 0xd4f > - if (_cpu == CPU_ARM && (model_is(0xd0c) || model_is(0xd49) || > - model_is(0xd40) || model_is(0xd4f))) { > + if (_cpu == CPU_ARM && (model_is(CPU_MODEL_NEOVERSE_N1) || model_is(CPU_MODEL_NEOVERSE_N2) || > + model_is(CPU_MODEL_NEOVERSE_V1) || model_is(CPU_MODEL_NEOVERSE_V2))) { > if (FLAG_IS_DEFAULT(UseSIMDForMemoryOps)) { > FLAG_SET_DEFAULT(UseSIMDForMemoryOps, true); > } > @@ -248,9 +244,7 @@ void VM_Version::initialize() { > } > > // Neoverse > - // V1: 0xd40 > - // V2: 0xd4f > - if (_cpu == CPU_ARM && (model_is(0xd40) || model_is(0xd4f))) { > + if (_cpu == CPU_ARM && (model_is(CPU_MODEL_NEOVERSE_V1) || model_is(CPU_MODEL_NEOVERSE_V2))) { > if (FLAG_IS_DEFAULT(UseCryptoPmullForCRC32)) { > 
FLAG_SET_DEFAULT(UseCryptoPmullForCRC32, true); > } > diff --git a/src/hotspot/cpu/aarch64/vm_version_aarch64.hpp b/src/hotspot/cpu/aarch64/vm_version_aarch64.hpp > index 6883dc0d93e..a9821ea50c4 100644 > --- a/src/hotspot/cpu/aarch64/vm_version_aarch64.hpp > +++ b/src/hotspot/cpu/aarch64/vm_version_aarch64.hpp > @@ -114,6 +114,13 @@ enum Ampere_CPU_Model { > CPU_MODEL_AMPERE_1B = 0xac5 /* AMPERE_1B core Implements ARMv8.7 with CSSC, MTE, SM3/SM4 extensions */ > }; > > +enum Neoverse_CPU_Model { > + CPU_MODEL_NEOVERSE_N1 = 0xd0c, > + CPU_MODEL_NEOVERSE_N2 = 0xd49, > + CPU_MODEL_NEOVERSE_V1 = 0xd40, > + CPU_MODEL_NEOVERSE_V2 = 0xd4f, > +}; > + > #define CPU_FEATURE_FLAGS(decl) \ > decl(FP, fp, 0) \ > decl(ASIMD, asimd, 1) \ > -- > 2.39.3 Amazon Development Centre (London) Ltd. Registered in England and Wales with registration number 04543232 with its registered office at 1 Principal Place, Worship Street, London EC2A 2FA, United Kingdom. From fyang at openjdk.org Thu Apr 18 02:42:06 2024 From: fyang at openjdk.org (Fei Yang) Date: Thu, 18 Apr 2024 02:42:06 GMT Subject: RFR: 8330094: RISC-V: Save and restore FRM in the call stub [v4] In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 07:08:27 GMT, Hamlin Li wrote: >> Hi, >> Can you help to review this patch? >> As discussed at https://github.com/openjdk/jdk/pull/17745#discussion_r1558783467, we should do the similar thing as [JDK-8319973](https://bugs.openjdk.org/browse/JDK-8319973) on aarch64. >> Thanks! >> >> Tests running ... > > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > fix comment; minor refinement Marked as reviewed by fyang (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/18758#pullrequestreview-2007722718 From fyang at openjdk.org Thu Apr 18 02:43:00 2024 From: fyang at openjdk.org (Fei Yang) Date: Thu, 18 Apr 2024 02:43:00 GMT Subject: RFR: 8330266: RISC-V: Restore frm to RoundingMode::rne after JNI [v3] In-Reply-To: References: <6e_QPv6LVVN19HkQrQ2DyB_sXxqGqgwnclI2StdkeaY=.0537b78c-c2c1-4341-a914-f40f7117d72e@github.com> Message-ID: On Wed, 17 Apr 2024 07:14:10 GMT, Hamlin Li wrote: >> Hi, >> Can you help to review this patch? >> As discussed at: https://github.com/openjdk/jdk/pull/18758#pullrequestreview-1999982333, we'd better to do it. >> Similar thing is done on aarch64, https://bugs.openjdk.org/browse/JDK-8320892 >> >> Thanks > > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > minor refinement Marked as reviewed by fyang (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18785#pullrequestreview-2007723758 From jinguojie.jgj at alibaba-inc.com Thu Apr 18 02:43:32 2024 From: jinguojie.jgj at alibaba-inc.com (Jin Guojie) Date: Thu, 18 Apr 2024 10:43:32 +0800 Subject: Re: Aarch64: optimation for doing remainder on AArch64 In-Reply-To: References: Message-ID: <4d4d046f-cee5-4427-bcc4-3318dd687599.jinguojie.jgj@alibaba-inc.com> On 2024/4/ 23:42 Andrew Haley wrote: > If you can get a Github account and an OpenJDK account we can start to do that. > The first thing for you to do is clone the OpenJDK repo into your own tree, > then create a local branch, then create a PR. > See the section https://openjdk.org/guide/#i-have-a-patch-what-do-i-do According to this guide, a sponsor needs to first create an issue on JBS before submitting a PR. Could you please create an issue in the JDK Bug System (JBS)? I have submitted an OpenJDK account application, but Oracle has not approved it yet.
I will submit this PR after my OCA is signed and the issue in JBS is created. Thanks very much. Jin Guojie (Alibaba, hotspot developer). diff --git a/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp b/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp index af744f39fef..39e91ea3bdb 100644 --- a/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp +++ b/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp @@ -2075,16 +2075,7 @@ int MacroAssembler::corrected_idivl(Register result, Register ra, Register rb, sdivw(result, ra, rb); } else { sdivw(scratch, ra, rb); - Assembler::msubw(result, scratch, rb, ra); + msubw(result, scratch, rb, ra); } return idivl_offset; @@ -2114,16 +2105,7 @@ int MacroAssembler::corrected_idivq(Register result, Register ra, Register rb, sdiv(result, ra, rb); } else { sdiv(scratch, ra, rb); - Assembler::msub(result, scratch, rb, ra); + msub(result, scratch, rb, ra); } return idivq_offset; diff --git a/src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp b/src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp index dad7ec4d497..7266b5d92b0 100644 --- a/src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp +++ b/src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp @@ -437,11 +437,39 @@ class MacroAssembler: public Assembler { Assembler::INSN(Rd, Rn, Rm, Ra); \ } - WRAP(madd) WRAP(msub) WRAP(maddw) WRAP(msubw) + WRAP(madd) WRAP(maddw) WRAP(smaddl) WRAP(smsubl) WRAP(umaddl) WRAP(umsubl) #undef WRAP + inline void msub(Register Rd, Register Rn, Register Rm, Register Ra) { + if (VM_Version::supports_a53mac() && Ra != zr) + nop(); + if (VM_Version::model_is(VM_Version::CPU_MODEL_NEOVERSE_N1) + || VM_Version::model_is(VM_Version::CPU_MODEL_NEOVERSE_N2)) { + /* On Neoverse N series, MSUB uses the same ALU with SDIV. + * The combination of MUL/SUB can utilize multiple ALUS, + * and is much faster than MSUB.
*/ + mul(rscratch1, Rn, Rm); + sub(Rd, Ra, rscratch1); + } else { + Assembler::msub(Rd, Rn, Rm, Ra); + } + } + inline void msubw(Register Rd, Register Rn, Register Rm, Register Ra) { + if (VM_Version::supports_a53mac() && Ra != zr) + nop(); + if (VM_Version::model_is(VM_Version::CPU_MODEL_NEOVERSE_N1) + || VM_Version::model_is(VM_Version::CPU_MODEL_NEOVERSE_N2)) { + /* On Neoverse N series, MSUB uses the same ALU with SDIV. + * The combination of MUL/SUB can utilize multiple ALUS, + * and is much faster than MSUB. */ + mulw(rscratch1, Rn, Rm); + subw(Rd, Ra, rscratch1); + } else { + Assembler::msubw(Rd, Rn, Rm, Ra); + } + } From iklam at openjdk.org Thu Apr 18 03:56:17 2024 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 18 Apr 2024 03:56:17 GMT Subject: RFR: 8330532: Improve line-oriented text parsing in HotSpot Message-ID: (This PR is an alternative to https://github.com/openjdk/jdk/pull/18669 with a better API for reading lines of text) HotSpot has a few cases where information is parsed from a file, or from a memory buffer, one line at a time. Example: - https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/cds/classListParser.cpp#L169 - https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/compiler/compilerOracle.cpp#L1059-L1066 Common problems: - They use a fixed buffer for reading a line, so long (but valid) lines will cause errors. - There's ad-hoc code that deals with `FILE*` differently than from memory. This RFE implements a common utility, `inputStream`, for reading lines from different sources of input (see `FileInput` and `MemoryInput`). We fixed only `ClassListParser` and `CompilerOracle` in this RFE, but we can fix other readers in follow-up RFEs. The API allows other source of input to be implemented. For example, one could implement a `SocketInput` if there's a use case for it. 
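The idea of one line reader sitting on top of interchangeable input sources can be pictured with a small standalone sketch. This is not the `inputStream` API from the PR: `LineSource`, `MemorySource`, `LineReader` and `read_all_lines` are hypothetical names, and `std::string` stands in for whatever buffer management HotSpot actually uses. It only shows that a growable pending buffer removes the fixed line-length limit, and that a new source (a file, a socket) would only need to implement `read()`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for a source of bytes; not the HotSpot interface.
class LineSource {
 public:
  virtual ~LineSource() {}
  // Copies up to 'cap' bytes into 'dst'; returns the count, 0 at end of input.
  virtual std::size_t read(char* dst, std::size_t cap) = 0;
};

// Serves bytes from a fixed block of memory.
class MemorySource : public LineSource {
  const char* _p;
  std::size_t _left;
 public:
  MemorySource(const char* p, std::size_t len) : _p(p), _left(len) {
    assert(p != nullptr || len == 0);
  }
  std::size_t read(char* dst, std::size_t cap) override {
    std::size_t n = cap < _left ? cap : _left;
    std::memcpy(dst, _p, n);
    _p += n;
    _left -= n;
    return n;
  }
};

// Splits whatever the source delivers into lines of arbitrary length.
class LineReader {
  LineSource& _src;
  std::string _pending;  // bytes read from the source but not yet returned
  bool _eof = false;
 public:
  explicit LineReader(LineSource& src) : _src(src) {}

  // Stores the next line (without '\n') in 'line'; false when input is exhausted.
  bool next(std::string& line) {
    for (;;) {
      std::size_t nl = _pending.find('\n');
      if (nl != std::string::npos) {
        line.assign(_pending, 0, nl);
        _pending.erase(0, nl + 1);
        return true;
      }
      if (_eof) {
        if (_pending.empty()) return false;
        line.swap(_pending);  // final line without a trailing newline
        _pending.clear();
        return true;
      }
      char chunk[8];  // deliberately tiny, so lines span several reads
      std::size_t n = _src.read(chunk, sizeof(chunk));
      if (n == 0) {
        _eof = true;
      } else {
        _pending.append(chunk, n);
      }
    }
  }
};

// Usage example: collect every line of a NUL-terminated text.
std::vector<std::string> read_all_lines(const char* text) {
  MemorySource src(text, std::strlen(text));
  LineReader reader(src);
  std::vector<std::string> out;
  std::string line;
  while (reader.next(line)) out.push_back(line);
  return out;
}
```

The reader never inspects where the bytes come from, which is the property that lets file and memory input share one code path.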
In the future, `inputStream` can be extended (or encapsulated in a higher-level reader class) to read typed input tokens (for example, integers, strings, etc.). Credit: The `inputStream` class and friends are contributed by @rose00 . See https://mail.openjdk.org/pipermail/hotspot-dev/2024-April/087077.html . John's original version is in the draft PR https://github.com/openjdk/jdk/pull/18773. In order to minimize the size of this PR, I have kept only the functionality for reading a line at a time. Other features, such as pushing back contents into the `inputStream`, could be added in follow-up PRs. (These removed features can be found in the commit history of this PR). ------------- Commit messages: - removed more unused code from istream.hpp - Merged ClassFileParser changes from https://github.com/openjdk/jdk/pull/18669 - Removed gtest cases for features removed in the previous commit - Reverted xmlstream.cpp/hpp and removed unused functions from inputStream - fixed builds - Imported @jrose00 changes https://github.com/openjdk/jdk/pull/18773 Changes: https://git.openjdk.org/jdk/pull/18833/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18833&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8330532 Stats: 1447 lines in 10 files changed: 1293 ins; 76 del; 78 mod Patch: https://git.openjdk.org/jdk/pull/18833.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18833/head:pull/18833 PR: https://git.openjdk.org/jdk/pull/18833 From iklam at openjdk.org Thu Apr 18 03:56:58 2024 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 18 Apr 2024 03:56:58 GMT Subject: RFR: 8329728: Read long lines in ClassListParser [v5] In-Reply-To: References: Message-ID: On Wed, 10 Apr 2024 17:54:25 GMT, Ioi Lam wrote: >> Today the `ClassListParser` has a hard-coded limit of 4096 chars for each line in the CDS class list file. However, it's possible for a line to be much longer than that (64KB for the class name, plus extra information that can include path names, IDs, etc).
>> >> I wrote a utility class `LineReader` that automatically allocates a buffer before calling `fgets()`. Hopefully this can be useful for other cases where we call `fgets()` with a fixed buffer size. >> >> Max line width is limited to 4M to simplify testing (and avoid running into corner cases when we approach INT_MAX). > > Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: > > - Merge branch 'master' into 8329728-read-arbitrary-long-lines-in-class-list-parser > - @dholmes-ora and @calvinccheung comments > - Check class name for valid UTF8 encoding > - @matias9927 and @calvinccheung comments - limit line to 4M. Added gtest cases. Test for class names > 64K > - 8329728: Read arbitrarily long lines in ClassListParser > _Mailing list message from [ioi.lam at oracle.com](mailto:ioi.lam at oracle.com) on [hotspot-dev](mailto:hotspot-dev at mail.openjdk.org):_ > > Hi John, > > Thanks for posting the code. Let me try to rebase your code onto mainline, and then apply my ClassListParser changes on top of that. I will probably open a new PR that combines this PR (#18669) and yours (#18773). Let see how that looks and we can decide how to proceed. I have integrated John's code and created an alternative PR: https://github.com/openjdk/jdk/pull/18833 . I think it's a much better foundation for going forward and cleaning up the technical debt in HotSpot. Please take a look. 
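For readers who have not seen the earlier `LineReader` patch, the general technique of growing a buffer around `fgets()` looks roughly like the sketch below. It is not the code from PR 18669: `read_long_line`, `count_lines_via_tmpfile` and `kMaxLineLen` are hypothetical names, plain `malloc`/`realloc` replace HotSpot's allocation macros, and the 4M cap is taken from the description quoted above. Lines containing embedded NUL bytes are not handled, since the length is recovered with `strlen`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Assumed cap, mirroring the 4M limit mentioned in the PR description.
static const std::size_t kMaxLineLen = 4 * 1024 * 1024;

// Reads one line from 'f' into a heap buffer that grows on demand.
// '*buf' and '*cap' persist across calls; the caller frees '*buf' at the end.
// Returns the line length without the newline, or -1 at end of file,
// on allocation failure, or when the line would exceed kMaxLineLen.
long read_long_line(std::FILE* f, char** buf, std::size_t* cap) {
  assert(f != nullptr && buf != nullptr && cap != nullptr);
  if (*buf == nullptr) {
    *cap = 16;  // small on purpose, so that growth is exercised
    *buf = static_cast<char*>(std::malloc(*cap));
    if (*buf == nullptr) return -1;
  }
  std::size_t len = 0;
  for (;;) {
    if (std::fgets(*buf + len, static_cast<int>(*cap - len), f) == nullptr) {
      if (len == 0) return -1;  // nothing read at all: end of file
      break;                    // end of file after a partial line
    }
    len += std::strlen(*buf + len);
    if (len > 0 && (*buf)[len - 1] == '\n') {
      (*buf)[--len] = '\0';     // strip the newline
      break;
    }
    if (len + 1 < *cap) break;  // buffer not full: file ended without newline
    if (*cap >= kMaxLineLen) return -1;
    std::size_t new_cap = *cap * 2;
    char* grown = static_cast<char*>(std::realloc(*buf, new_cap));
    if (grown == nullptr) return -1;
    *buf = grown;
    *cap = new_cap;
  }
  return static_cast<long>(len);
}

// Usage example: writes 'text' to a temporary file, reads it back line by
// line, and reports the number of lines and the longest line length.
int count_lines_via_tmpfile(const char* text, long* longest) {
  std::FILE* f = std::tmpfile();
  if (f == nullptr) return -1;
  std::fputs(text, f);
  std::rewind(f);
  char* buf = nullptr;
  std::size_t cap = 0;
  int lines = 0;
  *longest = 0;
  long n;
  while ((n = read_long_line(f, &buf, &cap)) >= 0) {
    lines++;
    if (n > *longest) *longest = n;
  }
  std::free(buf);
  std::fclose(f);
  return lines;
}
```

Reusing one buffer across calls keeps allocation cost proportional to the longest line seen rather than to the number of lines.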
------------- PR Comment: https://git.openjdk.org/jdk/pull/18669#issuecomment-2062942771 From kbarrett at openjdk.org Thu Apr 18 04:29:02 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 18 Apr 2024 04:29:02 GMT Subject: RFR: 8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: References: <-XeYeJ0OEmauTYsEoSXxzRmQXSKMOLw87GSpqDnEmug=.5cb7e71f-fea6-4a84-8260-5f515d3d3810@github.com> <18WjPZeDIWkxGIB0BJgyDg5VipCtY4EOlWmIGPWZGCw=.b50cf4a9-61a4-421e-97eb-3dbac94c14f9@github.com> <_xcaF7UUDHA11loD89Dz871vAQgRqMzCdPkahFDfKv8=.a2c6dcbe-5942-4fb7-9d8b-4239ea048e56@github.com> <76P7uKTuqo7IKYr5yBP4Vx1SS0AcEXC_6vDAU6LfIzo=.d939556f-6fab-4009-820b-821376bfdb7c@github.com> <6aR5nvKhz28A1CkxtaAD9CwTjILBjwZrrRwP3988oEc=.72203104-2ae5-40ff-bd87-168b684446e6@ github.com> Message-ID: On Wed, 17 Apr 2024 20:49:37 GMT, Magnus Ihse Bursie wrote: >> I'm aware of this discussion and looking into the issues, but a personal matter has intervened and it will take >> me a while to respond properly. Maybe next week. > > I opened https://bugs.openjdk.org/browse/JDK-8330539 so we don't lose track of this, but we can keep the discussion/voting here. For the impatient, I suggest adopting mechanism 2, i.e. unconditionally include in globalDefinitions_gcc.hpp. We can't include in shared code, and there is a use in shared code (in the relatively recently added JavaThread::pretouch_stack). When I questioned whether we needed to include at all, I referred to a Linux man page I'd found on the internet (the same page mdoerr linked to), which says (in part) "By default, modern compilers automatically translate all uses of alloca() into the built-in ..." Apparently I should have kept digging, because it seems that page is old/incorrect. A seemingly more recent Linux man page describes a different way of handling it that is closer to what we're seeing, but still not quite correct. glibc's includes if __USE_MISC is defined. 
One of the ways __USE_MISC can become defined is if _GNU_SOURCE is defined, and we define that for both gcc and clang toolchains. We include in globalDefinitions_gcc.hpp. So when building with gcc, globalDefinitions.hpp implicitly includes . The glibc definition of alloca is #ifdef __GNUC__ # define alloca(size) __builtin_alloca (size) #endif /* GCC. */ So that explains why we don't need any explicit include of when building with gcc. I expect there's something similar going on with Visual Studio and Xcode/clang. But apparently not with Open XLC clang. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1569930781 From aboldtch at openjdk.org Thu Apr 18 05:31:00 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 18 Apr 2024 05:31:00 GMT Subject: RFR: 8330253: Skip verify_consistent_lock_order when deoptimizing from monitorenter bytecode. [v2] In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 16:58:56 GMT, Daniel D. Daugherty wrote: >> Axel Boldt-Christmas has updated the pull request incrementally with two additional commits since the last revision: >> >> - Handle previous bc being monitorenter >> - Remove implicit conditions > > src/hotspot/share/runtime/deoptimization.cpp line 451: > >> 449: chunk->first()->raw_bci() == SynchronizationEntryBCI; >> 450: // If deoptimizing from monitorenter bytecode we maybe in transitional state. Skip verification. >> 451: // When reexecuting the current bc, the previous bc may not have finished yet. > > Should this: > `... the previous bc may not have finished yet.` > be: > `... the previous monitorenter bc may not have finished yet.` Yeah, while both statements are true, the later is more specific to this context. Which is probably better and more clear. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18782#discussion_r1570002854 From aboldtch at openjdk.org Thu Apr 18 05:38:24 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 18 Apr 2024 05:38:24 GMT Subject: RFR: 8330253: Skip verify_consistent_lock_order when deoptimizing from monitorenter bytecode. [v3] In-Reply-To: References: Message-ID: <6iJeqGCmcJpeoNq0w2XZ-uJwzmTrMGLDKZd_1JLkrfc=.91b9c9ae-d03d-4179-9e88-0ab2e409904a@github.com> > The verification added in [JDK-8329757](https://bugs.openjdk.org/browse/JDK-8329757) will not work when deoptimization occurs on a monitorenter bytecode. The locking may be in a transitional state. This patch will skip the verification when this occurs. > > So far I have only seen this reproduce with JVMTI when deoptimization occurs while a Java thread is waiting on a contended monitor. However this could potentially be triggered from a VM entry slow path, so simply checking `current_pending_monitor` could be flaky as well. So instead simply avoid verification. > > Running JVMTI reproducer. Starting full testing soon.
Axel Boldt-Christmas has updated the pull request incrementally with two additional commits since the last revision: - Whitespace - Spelling and typos ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18782/files - new: https://git.openjdk.org/jdk/pull/18782/files/03a4e045..bf49a93f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18782&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18782&range=01-02 Stats: 5 lines in 1 file changed: 0 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/18782.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18782/head:pull/18782 PR: https://git.openjdk.org/jdk/pull/18782 From aboldtch at openjdk.org Thu Apr 18 05:38:24 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 18 Apr 2024 05:38:24 GMT Subject: RFR: 8330253: Skip verify_consistent_lock_order when deoptimizing from monitorenter bytecode. [v3] In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 16:56:46 GMT, Daniel D. Daugherty wrote: >> Axel Boldt-Christmas has updated the pull request incrementally with two additional commits since the last revision: >> >> - Whitespace >> - Spelling and typos > > src/hotspot/share/runtime/deoptimization.cpp line 443: > >> 441: } >> 442: #ifdef ASSERT >> 443: if (LockingMode == LM_LIGHTWEIGHT && !realloc_failures) { > > In the new code, you are no longer account for `realloc_failures` being true. > I'm not convinced that is okay here. Originally I had those conditions in as well (to make it more clear and explicit). But Vladimir thought them superfluous. They are implicitly true from `lock_order.is_nonempty() -> LockingMode == LM_LIGHTWEIGHT && !realloc_failures` because we only ever add elements to `lock_order` if `LockingMode == LM_LIGHTWEIGHT && !realloc_failures` is true. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18782#discussion_r1570006419 From amitkumar at openjdk.org Thu Apr 18 06:10:57 2024 From: amitkumar at openjdk.org (Amit Kumar) Date: Thu, 18 Apr 2024 06:10:57 GMT Subject: RFR: 8330008: [s390x] Test bit "in-memory" in case of DiagnoseSyncOnValueBasedClasses In-Reply-To: References: Message-ID: On Wed, 10 Apr 2024 09:58:55 GMT, Amit Kumar wrote: > It's trivial update to use `testbit` method to test the bit "in-memory" I have done testing with `DiagnoseSyncOnValueBasedClasses=1`. This test failure I got, but it's failing on master branch as well if we are using same argument: STDOUT: # # A fatal error has been detected by the Java Runtime Environment: # # Internal Error (/home/amit/mr/jdk/src/hotspot/share/runtime/synchronizer.cpp:485), pid=3628970, tid=3628992 # fatal error: Synchronizing on object 0x00000000ffe583c8 of klass java.lang.Integer at SplitIfSharedFastLockBehindCastPP.test2(SplitIfSharedFastLockBehindCastPP.java:92) # # JRE version: OpenJDK Runtime Environment (23.0) (fastdebug build 23-internal-adhoc.amit.jdk) # Java VM: OpenJDK 64-Bit Server VM (fastdebug 23-internal-adhoc.amit.jdk, mixed mode, sharing, compressed oops, compressed class ptrs, g1 gc, linux-s390x) # Problematic frame: # V [libjvm.so+0x12b4b8c] ObjectSynchronizer::handle_sync_on_value_based_class(Handle, JavaThread*)+0x944 # # Core dump will be written. 
Default location: Core dumps may be processed with "/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E" (or dumping to /home/amit/mr/jdk/build/linux-s390x-server-fastdebug/test-support/jtreg_test_hotspot_jtreg_compiler_loopopts_SplitIfSharedFastLockBehindCastPP_java/scratch/0/core.3628970) # # An error report file with more information is saved as: # /home/amit/mr/jdk/build/linux-s390x-server-fastdebug/test-support/jtreg_test_hotspot_jtreg_compiler_loopopts_SplitIfSharedFastLockBehindCastPP_java/scratch/0/hs_err_pid3628970.log # # If you would like to submit a bug report, please visit: # https://bugreport.java.com/bugreport/crash.jsp # ------------- PR Comment: https://git.openjdk.org/jdk/pull/18709#issuecomment-2063072273 From aph-open at littlepinkcloud.com Thu Apr 18 07:20:56 2024 From: aph-open at littlepinkcloud.com (Andrew Haley) Date: Thu, 18 Apr 2024 08:20:56 +0100 Subject: Aarch64: CPU_Model support for Neoverse N1/N2/V1/V2 In-Reply-To: <56ae2a53-02d3-4c29-8b13-37172654b5a1.jinguojie.jgj@alibaba-inc.com> References: <45ADE631-EFCF-4319-94B6-130E324E5907@amazon.co.uk> <56ae2a53-02d3-4c29-8b13-37172654b5a1.jinguojie.jgj@alibaba-inc.com> Message-ID: On 4/18/24 03:29, Jin Guojie wrote: > We wrote a patch to improve the definition of CPU models for Arm Neoverse. > Evgeny thinks it?s better to continue the review process. Sure. My immediate reaction is that having separate categories for the Neoverse CPUs is getting to be rather cumbersome. Clearly they have a lot in common, and it would be nicer to be able to say things like "if CPU is Arm.Neoverse" or "is Arm.Neoverse.V2" but right now I can't think of a nice way to do that. Maybe a nested class hierarchy? > I submitted my OCA application 10 days ago, but it is still under review. > Could you please create an issue in the JDK Bug System (JBS), > so that I can submit this PR after the OCA is signed? I will, but let's have some ideas about what the result should be. 
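One possible shape for the "is Arm.Neoverse" / "is Arm.Neoverse.V2" queries, offered only as a sketch to make the discussion concrete: a small table that maps part numbers to a family tag. The part numbers are the ones from the proposed patch; `ArmFamily`, `ModelInfo`, `family_of`, `is_neoverse` and `is_neoverse_n` are hypothetical names and do not exist in `vm_version_aarch64.hpp`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Part numbers as listed in the proposed patch.
enum Neoverse_CPU_Model : uint32_t {
  CPU_MODEL_NEOVERSE_N1 = 0xd0c,
  CPU_MODEL_NEOVERSE_N2 = 0xd49,
  CPU_MODEL_NEOVERSE_V1 = 0xd40,
  CPU_MODEL_NEOVERSE_V2 = 0xd4f,
};

// Hypothetical family tags used only by this sketch.
enum class ArmFamily { Other, NeoverseN, NeoverseV };

struct ModelInfo {
  uint32_t part;
  ArmFamily family;
};

// Adding a core means adding one row; the predicates below stay unchanged.
static const ModelInfo kKnownModels[] = {
  { CPU_MODEL_NEOVERSE_N1, ArmFamily::NeoverseN },
  { CPU_MODEL_NEOVERSE_N2, ArmFamily::NeoverseN },
  { CPU_MODEL_NEOVERSE_V1, ArmFamily::NeoverseV },
  { CPU_MODEL_NEOVERSE_V2, ArmFamily::NeoverseV },
};

// Returns the family for a part number, or Other if it is not in the table.
inline ArmFamily family_of(uint32_t part) {
  for (std::size_t i = 0; i < sizeof(kKnownModels) / sizeof(kKnownModels[0]); i++) {
    if (kKnownModels[i].part == part) return kKnownModels[i].family;
  }
  return ArmFamily::Other;
}

// "Is this any Neoverse core?"
inline bool is_neoverse(uint32_t part) {
  return family_of(part) != ArmFamily::Other;
}

// "Is this a Neoverse N-series core?" (the condition the msub patch tests by
// comparing against N1 and N2 individually)
inline bool is_neoverse_n(uint32_t part) {
  return family_of(part) == ArmFamily::NeoverseN;
}
```

A nested class hierarchy would give the same queries with more type structure; the table form is shown only because it is the smallest thing that answers both levels of question.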
-- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From gdub at openjdk.org Thu Apr 18 07:53:58 2024 From: gdub at openjdk.org (Gilles Duboscq) Date: Thu, 18 Apr 2024 07:53:58 GMT Subject: RFR: 8330388: Remove invokedynamic cache index encoding In-Reply-To: References: Message-ID: <1XlY7bXwXvhA6OwpvJtYLwKcnEmoNRB7aG1JO1ArEMs=.b32f15a7-296f-4145-9533-fa3ebd6c9aa2@github.com> On Wed, 17 Apr 2024 15:26:52 GMT, Matias Saavedra Silva wrote: > Before [JDK-8307190](https://bugs.openjdk.org/browse/JDK-8307190), [JDK-8309673](https://bugs.openjdk.org/browse/JDK-8309673), and [JDK-8301995](https://bugs.openjdk.org/browse/JDK-8301995), invokedynamic operands needed to be rewritten to encoded values to better distinguish indy entries from other cp cache entries. The above changes now distinguish between entries with `to_cp_index()` using the bytecode, which is now propagated by the callers. > > The encoding flips the bits of the index so the encoded index is always negative, leading to access errors if there is no matching decode call. These calls are removed with some methods adjusted to distinguish between indices with the bytecode. Verified with tier 1-5 tests. src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/HotSpotConstantPool.java line 720: > 718: @Override > 719: public JavaMethod lookupMethod(int rawIndex, int opcode, ResolvedJavaMethod caller) { > 720: int which = rawIndex; We could get rid of that intermediate variable now and just use `rawIndex` below. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18819#discussion_r1570192972 From mli at openjdk.org Thu Apr 18 07:55:58 2024 From: mli at openjdk.org (Hamlin Li) Date: Thu, 18 Apr 2024 07:55:58 GMT Subject: RFR: 8330156: RISC-V: Range check auipc + signed 12 imm instruction [v3] In-Reply-To: References: Message-ID: On Mon, 15 Apr 2024 10:08:59 GMT, Robbin Ehn wrote: >> Hi please consider! 
>> >> Today we check if the distance is a signed 32. >> As the second instruction have sign bit + 11 bits the, max of such pair is shorter. >> >> Sanity tested > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > Non inclusive positive side Looks good. Just one minor suggestion, the name *in_range_auipc_`s12`* seems could be improved a bit, maybe something like `is_valid_auipc_offset` could be better? ------------- Marked as reviewed by mli (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18755#pullrequestreview-2008121357 From luhenry at openjdk.org Thu Apr 18 08:02:58 2024 From: luhenry at openjdk.org (Ludovic Henry) Date: Thu, 18 Apr 2024 08:02:58 GMT Subject: RFR: 8330266: RISC-V: Restore frm to RoundingMode::rne after JNI [v3] In-Reply-To: References: <6e_QPv6LVVN19HkQrQ2DyB_sXxqGqgwnclI2StdkeaY=.0537b78c-c2c1-4341-a914-f40f7117d72e@github.com> Message-ID: On Wed, 17 Apr 2024 07:14:10 GMT, Hamlin Li wrote: >> Hi, >> Can you help to review this patch? >> As discussed at: https://github.com/openjdk/jdk/pull/18758#pullrequestreview-1999982333, we'd better to do it. >> Similar thing is done on aarch64, https://bugs.openjdk.org/browse/JDK-8320892 >> >> Thanks > > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > minor refinement Marked as reviewed by luhenry (Committer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18785#pullrequestreview-2008137032 From luhenry at openjdk.org Thu Apr 18 08:06:09 2024 From: luhenry at openjdk.org (Ludovic Henry) Date: Thu, 18 Apr 2024 08:06:09 GMT Subject: RFR: 8330094: RISC-V: Save and restore FRM in the call stub [v4] In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 07:08:27 GMT, Hamlin Li wrote: >> Hi, >> Can you help to review this patch? 
>> As discussed at https://github.com/openjdk/jdk/pull/17745#discussion_r1558783467, we should do the similar thing as [JDK-8319973](https://bugs.openjdk.org/browse/JDK-8319973) on aarch64. >> Thanks! >> >> Tests running ... > > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > fix comment; minor refinement Marked as reviewed by luhenry (Committer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18758#pullrequestreview-2008143678 From tschatzl at openjdk.org Thu Apr 18 08:14:00 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 18 Apr 2024 08:14:00 GMT Subject: RFR: 8330475: Remove unused default value for ModRefBarrierSet::write_ref_array_pre In-Reply-To: References: Message-ID: <1f-Kqu42mqeJQOPePceb3hrm5aqIevphwSnVVGKRs3U=.4df28f8a-484e-43b5-b73c-2e66fa5bf120@github.com> On Wed, 17 Apr 2024 09:46:54 GMT, Albert Mingkun Yang wrote: > Trivial removing unnecessary code. trivial ------------- Marked as reviewed by tschatzl (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18812#pullrequestreview-2008160690 From dnsimon at openjdk.org Thu Apr 18 08:26:58 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Thu, 18 Apr 2024 08:26:58 GMT Subject: RFR: 8330388: Remove invokedynamic cache index encoding In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 22:58:21 GMT, Dean Long wrote: > @dougxc should check JVMCI changes. Thanks for the heads up. I've asked @matias9927 to double check these changes against libgraal which should address any concerns about how this change impacts Graal. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18819#issuecomment-2063308834 From ayang at openjdk.org Thu Apr 18 08:28:13 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 18 Apr 2024 08:28:13 GMT Subject: RFR: 8330475: Remove unused default value for ModRefBarrierSet::write_ref_array_pre In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 09:46:54 GMT, Albert Mingkun Yang wrote: > Trivial removing unnecessary code. Thanks for review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18812#issuecomment-2063310062 From ayang at openjdk.org Thu Apr 18 08:28:13 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 18 Apr 2024 08:28:13 GMT Subject: Integrated: 8330475: Remove unused default value for ModRefBarrierSet::write_ref_array_pre In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 09:46:54 GMT, Albert Mingkun Yang wrote: > Trivial removing unnecessary code. This pull request has now been integrated. Changeset: 5eb2c596 Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/5eb2c596e2ca38025dfb9f8e37703036d0bcda19 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod 8330475: Remove unused default value for ModRefBarrierSet::write_ref_array_pre Reviewed-by: gli, tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/18812 From azafari at openjdk.org Thu Apr 18 08:42:10 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Thu, 18 Apr 2024 08:42:10 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v7] In-Reply-To: <4p0uq_t37Fkj9fxqD1QC8TOkgAyyW1PVmTknURCquG4=.22b762b8-dea4-4fe3-a19f-d6a3f26c9f27@github.com> References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> <7TW9a7Vmnz0nIKq83rYx_VN13PXM9_9nD5iSMzGDfNw=.127fd0ff-ee60-40cf-9994-9a1e81bb5b27@github.com> <4p0uq_t37Fkj9fxqD1QC8TOkgAyyW1PVmTknURCquG4=.22b762b8-dea4-4fe3-a19f-d6a3f26c9f27@github.com> Message-ID: On 
Wed, 17 Apr 2024 12:02:50 GMT, Stefan Karlsson wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> alignment in coding style changed. > > src/hotspot/os/windows/os_windows.cpp line 5110: > >> 5108: >> 5109: // Record virtual memory allocation >> 5110: MemTracker::record_virtual_memory_reserve_and_commit((address)addr, bytes, CALLER_PC, flag); > > Should this really be called here? The posix version don't call this, so I don't understand why it is called here in the Windows code. It uses the requested address rather than the result of allocation that seems a bug. Removed. > src/hotspot/share/cds/filemap.cpp line 1697: > >> 1695: static char* map_memory(int fd, const char* file_name, size_t file_offset, >> 1696: char *addr, size_t bytes, bool read_only, >> 1697: bool allow_exec, MEMFLAGS flags) { > > It is odd that `map_memory` and `os::map_memory` has different parameter order. I understand that this is done because of default values, but I'd like to suggest that you get rid of these default values and fix the order. > > (Side-note: Wouldn't it be better to rename this `map_memory` to something that clearly shows the difference between this function and `os::map_memory`) `map_memory` is renamed to `map_and_pretouch_memory`. argument orders match with `os::map_memory`. > src/hotspot/share/classfile/compactHashtable.cpp line 243: > >> 241: quit("Unable to open hashtable dump file", filename); >> 242: } >> 243: _base = os::map_memory(_fd, filename, 0, nullptr, _size, mtInternal, true, false); > > Isn't this CDS code. Should ths be mtClassShared or something else that indicates that this is CDS code? Fixed. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1570270206 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1570272820 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1570273489 From azafari at openjdk.org Thu Apr 18 08:59:59 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Thu, 18 Apr 2024 08:59:59 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v7] In-Reply-To: <4p0uq_t37Fkj9fxqD1QC8TOkgAyyW1PVmTknURCquG4=.22b762b8-dea4-4fe3-a19f-d6a3f26c9f27@github.com> References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> <7TW9a7Vmnz0nIKq83rYx_VN13PXM9_9nD5iSMzGDfNw=.127fd0ff-ee60-40cf-9994-9a1e81bb5b27@github.com> <4p0uq_t37Fkj9fxqD1QC8TOkgAyyW1PVmTknURCquG4=.22b762b8-dea4-4fe3-a19f-d6a3f26c9f27@github.com> Message-ID: On Wed, 17 Apr 2024 12:57:43 GMT, Stefan Karlsson wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> alignment in coding style changed. > > src/hotspot/share/runtime/os.cpp line 1817: > >> 1815: >> 1816: char* os::reserve_memory(size_t bytes, bool executable, MEMFLAGS flags) { >> 1817: char* result = pd_reserve_memory(bytes, executable, flags); > > Doesn't it look weird that we pass in flags here and then still call MemTracker::record_ below? I think this is an artifact from mixing if we put the NMT calls in shared or in platform dependent code. I understand that you need this for this patch, but I also think there needs to be some RFE to figure out if this can be reworked. The flag is needed on Windows for this call hierarchy: reserve_memory pd_reserve_memory pd_attempt_reserve_memory_at allocate_pages_individually(..., MEMFLAGS flag) Other platforms ignore the flag. Agreed on a new RFE for handling this. 
> src/hotspot/share/runtime/os.cpp line 2187: > >> 2185: MEMFLAGS flags, >> 2186: bool read_only, >> 2187: bool allow_exec) { > > The function was written with multiple parameters per line here, and then you changed it so that some of the params where placed on individual lines. This should likely be reverted. Fixed. > src/hotspot/share/runtime/os.hpp line 233: > >> 231: char *addr, size_t bytes, >> 232: MEMFLAGS flag, >> 233: bool read_only = false, > > Mixes param layout style. (Plus earlier comment that the default values should probably be removed so that MEMFLAGS can be put last). Fixed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1570302144 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1570305926 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1570306322 From azafari at openjdk.org Thu Apr 18 09:03:00 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Thu, 18 Apr 2024 09:03:00 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v7] In-Reply-To: <4p0uq_t37Fkj9fxqD1QC8TOkgAyyW1PVmTknURCquG4=.22b762b8-dea4-4fe3-a19f-d6a3f26c9f27@github.com> References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> <7TW9a7Vmnz0nIKq83rYx_VN13PXM9_9nD5iSMzGDfNw=.127fd0ff-ee60-40cf-9994-9a1e81bb5b27@github.com> <4p0uq_t37Fkj9fxqD1QC8TOkgAyyW1PVmTknURCquG4=.22b762b8-dea4-4fe3-a19f-d6a3f26c9f27@github.com> Message-ID: On Wed, 17 Apr 2024 11:59:01 GMT, Stefan Karlsson wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> alignment in coding style changed. > > src/hotspot/share/runtime/os.hpp line 471: > >> 469: // vm_exit_out_of_memory() with the specified mesg. 
>> 470: static void commit_memory_or_exit(char* addr, size_t bytes, >> 471: bool executable, const char* mesg, MEMFLAGS flag); > > I think that we should change the parameter order here, so that it is like `commit_memory` and then the extra mesg param goes with the `_or_exit` part (if that makes sense). > Suggestion: > > bool executable, MEMFLAGS flag, const char* mesg); Changed the order. Note that, now it is only the `commit_memory_or_exit` in `os.hpp` that the MEMFLAGS is not the last param. > src/hotspot/share/runtime/os.hpp line 522: > >> 520: MEMFLAGS flag, >> 521: bool read_only = false, >> 522: bool allow_exec = false); > > params layout style. Fixed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1570311465 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1570311935 From azafari at openjdk.org Thu Apr 18 09:22:34 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Thu, 18 Apr 2024 09:22:34 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v8] In-Reply-To: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: > `MEMFLAGS flag` is used to hold/show the type of the memory regions in NMT. Each call of NMT API requires a search through the list of memory regions. > The Hotspot code reserves/commits/uncommits memory regions and later calls explicitly NMT API with a specific memory type (e.g., `mtGC`, `mtJavaHeap`) for that region. Therefore, there are two search in the list of regions per reserve/commit/uncommit operations, one for the operation and another for setting the type of the region. > When the memory type is passed in during reserve/commit/uncommit operations, NMT can use it and avoid the extra search for setting the memory type. 
> > Tests: tiers1-5 passed on linux-x64, macosx-aarch64 and windows-x64 for debug and non-debug builds. Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: order of params are consistent now. style is corrected. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18745/files - new: https://git.openjdk.org/jdk/pull/18745/files/abcfcccd..229bc890 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18745&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18745&range=06-07 Stats: 49 lines in 14 files changed: 2 ins; 10 del; 37 mod Patch: https://git.openjdk.org/jdk/pull/18745.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18745/head:pull/18745 PR: https://git.openjdk.org/jdk/pull/18745 From azafari at openjdk.org Thu Apr 18 09:22:34 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Thu, 18 Apr 2024 09:22:34 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v8] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: On Thu, 18 Apr 2024 09:19:08 GMT, Afshin Zafari wrote: >> `MEMFLAGS flag` is used to hold/show the type of the memory regions in NMT. Each call of NMT API requires a search through the list of memory regions. >> The Hotspot code reserves/commits/uncommits memory regions and later calls explicitly NMT API with a specific memory type (e.g., `mtGC`, `mtJavaHeap`) for that region. Therefore, there are two search in the list of regions per reserve/commit/uncommit operations, one for the operation and another for setting the type of the region. >> When the memory type is passed in during reserve/commit/uncommit operations, NMT can use it and avoid the extra search for setting the memory type. >> >> Tests: tiers1-5 passed on linux-x64, macosx-aarch64 and windows-x64 for debug and non-debug builds. 
> > Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: > > order of params are consistent now. style is corrected. Comments applied. ------------- PR Review: https://git.openjdk.org/jdk/pull/18745#pullrequestreview-2008324970 From azafari at openjdk.org Thu Apr 18 09:34:40 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Thu, 18 Apr 2024 09:34:40 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v9] In-Reply-To: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: <2q_nxOZzJ8y0s4otI6yWw90csbVGgSNaWyswAjlxly0=.644c8479-0061-4e0d-982b-e97fe7b802a2@github.com> > `MEMFLAGS flag` is used to hold/show the type of the memory regions in NMT. Each call of NMT API requires a search through the list of memory regions. > The Hotspot code reserves/commits/uncommits memory regions and later calls explicitly NMT API with a specific memory type (e.g., `mtGC`, `mtJavaHeap`) for that region. Therefore, there are two search in the list of regions per reserve/commit/uncommit operations, one for the operation and another for setting the type of the region. > When the memory type is passed in during reserve/commit/uncommit operations, NMT can use it and avoid the extra search for setting the memory type. > > Tests: tiers1-5 passed on linux-x64, macosx-aarch64 and windows-x64 for debug and non-debug builds. Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: fixed a missed file from param reordering. 
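The saving described in the PR summary above (one region-list search instead of two) can be sketched with a small stand-alone model. This is not HotSpot code: `RegionTracker`, `MemTag`, `Region` and all method names are hypothetical stand-ins for NMT's region list and `MEMFLAGS`, and the linear scan is an assumption made only so that lookups can be counted.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for HotSpot's MEMFLAGS; not the real enum.
enum class MemTag { None, GC, JavaHeap };

struct Region {
  std::uintptr_t base;
  std::size_t    size;
  MemTag         tag;
};

// Toy tracker that counts how often the region list is searched.
class RegionTracker {
 public:
  // Records a reservation; costs one search to check for an existing entry.
  void reserve(std::uintptr_t base, std::size_t size, MemTag tag) {
    if (find(base) == nullptr) {
      _regions.push_back(Region{base, size, tag});
    }
  }

  // Tags an already recorded region; costs another search.
  void set_tag(std::uintptr_t base, MemTag tag) {
    if (Region* r = find(base)) {
      r->tag = tag;
    }
  }

  std::size_t searches() const { return _searches; }

 private:
  Region* find(std::uintptr_t base) {
    ++_searches;
    for (Region& r : _regions) {
      if (r.base == base) return &r;
    }
    return nullptr;
  }

  std::vector<Region> _regions;
  std::size_t _searches = 0;
};

// Pattern before the change: reserve untagged, then tag in a second call.
std::size_t searches_with_separate_tagging() {
  RegionTracker t;
  t.reserve(0x1000, 4096, MemTag::None);
  t.set_tag(0x1000, MemTag::GC);
  return t.searches();
}

// Pattern after the change: the tag travels with the reserve call.
std::size_t searches_with_tag_at_reserve() {
  RegionTracker t;
  t.reserve(0x1000, 4096, MemTag::GC);
  return t.searches();
}
```

In this model the first helper performs two searches and the second performs one, which is the effect the PR aims for by making the `MEMFLAGS` argument mandatory on the reserve/commit/uncommit calls.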
------------- Changes: - all: https://git.openjdk.org/jdk/pull/18745/files - new: https://git.openjdk.org/jdk/pull/18745/files/229bc890..769166f8 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18745&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18745&range=07-08 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/18745.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18745/head:pull/18745 PR: https://git.openjdk.org/jdk/pull/18745 From stefank at openjdk.org Thu Apr 18 09:41:03 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 18 Apr 2024 09:41:03 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v8] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: On Thu, 18 Apr 2024 09:22:34 GMT, Afshin Zafari wrote: >> `MEMFLAGS flag` is used to hold/show the type of the memory regions in NMT. Each call of NMT API requires a search through the list of memory regions. >> The Hotspot code reserves/commits/uncommits memory regions and later calls explicitly NMT API with a specific memory type (e.g., `mtGC`, `mtJavaHeap`) for that region. Therefore, there are two search in the list of regions per reserve/commit/uncommit operations, one for the operation and another for setting the type of the region. >> When the memory type is passed in during reserve/commit/uncommit operations, NMT can use it and avoid the extra search for setting the memory type. >> >> Tests: tiers1-5 passed on linux-x64, macosx-aarch64 and windows-x64 for debug and non-debug builds. > > Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: > > order of params are consistent now. style is corrected. More comments. 
src/hotspot/os/windows/os_windows.cpp line 5068: > 5066: char* os::pd_map_memory(int fd, const char* file_name, size_t file_offset, > 5067: char *addr, size_t bytes, > 5068: bool read_only, This should be reverted. src/hotspot/share/cds/filemap.cpp line 1701: > 1699: AlwaysPreTouch ? false : read_only, > 1700: allow_exec, > 1701: flags); Revert this change src/hotspot/share/cds/filemap.cpp line 1729: > 1727: false /* !read_only */, > 1728: r->allow_exec(), > 1729: mtClassShared); This mixes styles between multiple arguments per line vs one argument per line. src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 217: > 215: if (!_heap_region_special) { > 216: os::commit_memory_or_exit(sh_rs.base(), _initial_size, heap_alignment, !ExecMem, > 217: "Cannot commit heap memory", mtGC); The argument order needs to be changed here and below. src/hotspot/share/runtime/os.cpp line 2185: > 2183: char* os::map_memory(int fd, const char* file_name, size_t file_offset, > 2184: char *addr, size_t bytes, > 2185: bool read_only, bool allow_exec, MEMFLAGS flags) { You should probably move back read_only to the line below. src/hotspot/share/runtime/os.hpp line 520: > 518: bool read_only, > 519: bool allow_exec, > 520: MEMFLAGS flag); Style inconsistency ------------- Changes requested by stefank (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18745#pullrequestreview-2008344764 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1570357660 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1570358795 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1570360021 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1570361901 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1570371965 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1570373261 From stefank at openjdk.org Thu Apr 18 09:41:04 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 18 Apr 2024 09:41:04 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v7] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> <7TW9a7Vmnz0nIKq83rYx_VN13PXM9_9nD5iSMzGDfNw=.127fd0ff-ee60-40cf-9994-9a1e81bb5b27@github.com> <4p0uq_t37Fkj9fxqD1QC8TOkgAyyW1PVmTknURCquG4=.22b762b8-dea4-4fe3-a19f-d6a3f26c9f27@github.com> Message-ID: On Thu, 18 Apr 2024 08:39:25 GMT, Afshin Zafari wrote: >> src/hotspot/share/cds/filemap.cpp line 1697: >> >>> 1695: static char* map_memory(int fd, const char* file_name, size_t file_offset, >>> 1696: char *addr, size_t bytes, bool read_only, >>> 1697: bool allow_exec, MEMFLAGS flags) { >> >> It is odd that `map_memory` and `os::map_memory` have different parameter order. I understand that this is done because of default values, but I'd like to suggest that you get rid of these default values and fix the order. >> >> (Side-note: Wouldn't it be better to rename this `map_memory` to something that clearly shows the difference between this function and `os::map_memory`) > > `map_memory` is renamed to `map_and_pretouch_memory`. > argument orders match with `os::map_memory`. Thanks. The params need to be adjusted after the rename.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1570358350 From stefank at openjdk.org Thu Apr 18 09:41:05 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 18 Apr 2024 09:41:05 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v4] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: On Mon, 15 Apr 2024 16:02:01 GMT, Afshin Zafari wrote: >> src/hotspot/share/memory/virtualspace.hpp line 199: >> >>> 197: size_t _upper_alignment; >>> 198: >>> 199: MEMFLAGS _nmt_flag; >> >> The VirtualSpace::initialize functions used to initialize these members in the order that they are specified here. That is now messed up by adding the _nmt_flag at the end here, but in the beginning in the initialize function. I would propose that you move it to after _executable, both here and in the initialize function. > > Fixed. I'd like to see some consistency between ReservedHeap and VirtualSpace. Could you change the variable layout and initialization to be: _special _executable _nmt_flags in both classes? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1570369399 From sgehwolf at openjdk.org Thu Apr 18 09:54:56 2024 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Thu, 18 Apr 2024 09:54:56 GMT Subject: RFR: 8261242: [Linux] OSContainer::is_containerized() returns true when run outside a container [v2] In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 01:07:04 GMT, Jan Kratochvil wrote: >>> IMHO `is_containerized()` is OK to return `false` even when running in a container but with no limitations set. >> >> The idea here is to use this property to tune OpenJDK for in-container, specifically k8s, use. In such a setup it's custom to run a single process within set resource constraints.
In order to do this, we need a reliable way to distinguish that vs. non-containerized setup. If somebody really wants to run OpenJDK in a container expecting it to run like a physical OpenJDK deployment, that's when `-XX:-UseContainerSupport` should be used. > >> The idea here is to use this property to tune OpenJDK for in-container, specifically k8s, use. In such a setup it's custom to run a single process within set resource constraints. > > The in-container tuning means to use all the available resources. Containers in the real world have some memory limits set which is where my modified patch still correctly identifies it as a container to use all the available resources of the node which is the whole goal of the container detection code. > >> In order to do this, we need a reliable way to distinguish that vs. non-containerized setup. > > I expect it should have been written "We need a reliable way to distinguish real world in-container vs. non-containerized setup. We do not mind behavior for artificial containers on OpenJDK development machines.". Which is what my patch does in an easier and less error-prone way. > >> If somebody really wants to run OpenJDK in a container expecting it to run like a physical OpenJDK deployment, that's when `-XX:-UseContainerSupport` should be used. > > That behaves still the same with my patch. > > Could you give a countercase where my patch behaves wrongly? @jankratochvil I believe this boils down to what we actually want. Should `OSContainer::is_containerized()` return `false` when run *inside* a container? If so, when is it OK to do that? Should `OSContainer::is_containerized()` return `true` on a physical Linux deployment? IMO, the read-only property of the mount points was something that fit naturally, since we already scan those anyway (for cgv1 vs cgv2 detection). Therefore it was something to consider to make the heuristics more accurate.
The truth table of the patch in this PR looks like this:

| `OSContainer::is_containerized()` value | Actual deployment scenario |
| ------------- | ------------- |
| `true` | OpenJDK runs in an unprivileged container **without** a cpu/memory limit |
| `true` | OpenJDK runs in an unprivileged container **with** a cpu/memory limit |
| `true` | OpenJDK runs in a privileged container **with** a cpu/memory limit |
| `false` | OpenJDK runs in a privileged container **without** a cpu/memory limit |
| `false` | OpenJDK runs in a systemd slice **without** a cpu/memory limit |
| `true` | OpenJDK runs in a systemd slice **with** a cpu/memory limit |
| `false` | OpenJDK runs on a physical Linux system (VM or bare metal) |

As you can see, the case of "OpenJDK runs in a privileged container *without* a cpu/memory limit" gives the wrong result. However, I consider this a fairly uncommon setup and there isn't really anything we can do to detect this kind of setup. Even if we did manage to detect it, why would we care? It's a question of probability.

> Could you give a countercase where my patch behaves wrongly?

Again, it's not a case of right or wrong IMO. Since we are in the land of heuristics, they will be wrong in some cases. We should make them so that we cover the common cases and, perhaps, are able to use those in serviceability tools to help guide diagnosis and/or further tuning. So far the existing capabilities were OK, but prevent further out-of-the-box tuning and/or accurate data collection.

Your proposed patch (it's one I had in a previous iteration too, fwiw) would also return `false` for the case of "OpenJDK runs in an unprivileged container **without** a cpu/memory limit", which seems counterintuitive if OpenJDK actually runs in a container! What's more, it seems a fairly common case. Also, there is a chance of the OpenJDK heuristics to fail memory/cpu limit detection because of bugs and new developments.
It seems the safer option to know that OpenJDK is containerized (using other heuristics) in that case. Maybe that's just me. Let's have that discussion more broadly and see if we can reach consensus! ------------- PR Comment: https://git.openjdk.org/jdk/pull/18201#issuecomment-2063477204 From rehn at openjdk.org Thu Apr 18 10:52:20 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 18 Apr 2024 10:52:20 GMT Subject: RFR: 8330156: RISC-V: Range check auipc + signed 12 imm instruction [v4] In-Reply-To: References: Message-ID: > Hi please consider! > > Today we check if the distance is a signed 32. > As the second instruction have sign bit + 11 bits the, max of such pair is shorter. > > Sanity tested Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: Rename ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18755/files - new: https://git.openjdk.org/jdk/pull/18755/files/80d9088d..b331b8af Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18755&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18755&range=02-03 Stats: 6 lines in 2 files changed: 0 ins; 0 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/18755.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18755/head:pull/18755 PR: https://git.openjdk.org/jdk/pull/18755 From mli at openjdk.org Thu Apr 18 11:15:57 2024 From: mli at openjdk.org (Hamlin Li) Date: Thu, 18 Apr 2024 11:15:57 GMT Subject: RFR: 8330266: RISC-V: Restore frm to RoundingMode::rne after JNI [v3] In-Reply-To: References: <6e_QPv6LVVN19HkQrQ2DyB_sXxqGqgwnclI2StdkeaY=.0537b78c-c2c1-4341-a914-f40f7117d72e@github.com> Message-ID: On Thu, 18 Apr 2024 08:00:11 GMT, Ludovic Henry wrote: >> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: >> >> minor refinement > > Marked as reviewed by luhenry (Committer). Thanks @luhenry @VladimirKempik @RealFYang for your reviewing! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18785#issuecomment-2063618280 From mli at openjdk.org Thu Apr 18 11:16:02 2024 From: mli at openjdk.org (Hamlin Li) Date: Thu, 18 Apr 2024 11:16:02 GMT Subject: RFR: 8330094: RISC-V: Save and restore FRM in the call stub [v4] In-Reply-To: References: Message-ID: On Thu, 18 Apr 2024 08:03:15 GMT, Ludovic Henry wrote: >> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: >> >> fix comment; minor refinement > > Marked as reviewed by luhenry (Committer). Thanks @luhenry @RealFYang for your reviewing. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18758#issuecomment-2063616448 From mli at openjdk.org Thu Apr 18 11:16:03 2024 From: mli at openjdk.org (Hamlin Li) Date: Thu, 18 Apr 2024 11:16:03 GMT Subject: Integrated: 8330094: RISC-V: Save and restore FRM in the call stub In-Reply-To: References: Message-ID: <4rXVTV4adGIcleMsegWxhAcl7uJfqAsgoRHi5Anpk3Y=.54f5854a-04c8-49d6-8056-c3e5457b7f41@github.com> On Fri, 12 Apr 2024 12:16:44 GMT, Hamlin Li wrote: > Hi, > Can you help to review this patch? > As discussed at https://github.com/openjdk/jdk/pull/17745#discussion_r1558783467, we should do the similar thing as [JDK-8319973](https://bugs.openjdk.org/browse/JDK-8319973) on aarch64. > Thanks! > > Tests running ... This pull request has now been integrated. 
Changeset: b0496096 Author: Hamlin Li URL: https://git.openjdk.org/jdk/commit/b0496096dc8d7dc7acf28aa006141a3ecea446de Stats: 34 lines in 3 files changed: 21 ins; 9 del; 4 mod 8330094: RISC-V: Save and restore FRM in the call stub Reviewed-by: fyang, luhenry ------------- PR: https://git.openjdk.org/jdk/pull/18758 From coleenp at openjdk.org Thu Apr 18 11:22:00 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 18 Apr 2024 11:22:00 GMT Subject: RFR: 8330388: Remove invokedynamic cache index encoding In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 15:26:52 GMT, Matias Saavedra Silva wrote: > Before [JDK-8307190](https://bugs.openjdk.org/browse/JDK-8307190), [JDK-8309673](https://bugs.openjdk.org/browse/JDK-8309673), and [JDK-8301995](https://bugs.openjdk.org/browse/JDK-8301995), invokedynamic operands needed to be rewritten to encoded values to better distinguish indy entries from other cp cache entries. The above changes now distinguish between entries with `to_cp_index()` using the bytecode, which is now propagated by the callers. > > The encoding flips the bits of the index so the encoded index is always negative, leading to access errors if there is no matching decode call. These calls are removed with some methods adjusted to distinguish between indices with the bytecode. Verified with tier 1-5 tests. To borrow @shipilev's comment from a different PR, Good Riddance! ------------- Marked as reviewed by coleenp (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18819#pullrequestreview-2008602360 From mli at openjdk.org Thu Apr 18 11:23:06 2024 From: mli at openjdk.org (Hamlin Li) Date: Thu, 18 Apr 2024 11:23:06 GMT Subject: Withdrawn: 8330266: RISC-V: Restore frm to RoundingMode::rne after JNI In-Reply-To: <6e_QPv6LVVN19HkQrQ2DyB_sXxqGqgwnclI2StdkeaY=.0537b78c-c2c1-4341-a914-f40f7117d72e@github.com> References: <6e_QPv6LVVN19HkQrQ2DyB_sXxqGqgwnclI2StdkeaY=.0537b78c-c2c1-4341-a914-f40f7117d72e@github.com> Message-ID: On Mon, 15 Apr 2024 15:37:06 GMT, Hamlin Li wrote: > Hi, > Can you help to review this patch? > As discussed at: https://github.com/openjdk/jdk/pull/18758#pullrequestreview-1999982333, we'd better to do it. > Similar thing is done on aarch64, https://bugs.openjdk.org/browse/JDK-8320892 > > Thanks This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/18785 From coleenp at openjdk.org Thu Apr 18 12:11:57 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 18 Apr 2024 12:11:57 GMT Subject: RFR: 8330532: Improve line-oriented text parsing in HotSpot In-Reply-To: References: Message-ID: On Thu, 18 Apr 2024 03:51:06 GMT, Ioi Lam wrote: > (This PR is an alternative to https://github.com/openjdk/jdk/pull/18669 with a better API for reading lines of text) > > HotSpot has a few cases where information is parsed from a file, or from a memory buffer, one line at a time. Example: > > - https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/cds/classListParser.cpp#L169 > - https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/compiler/compilerOracle.cpp#L1059-L1066 > > Common problems: > - They use a fixed buffer for reading a line, so long (but valid) lines will cause errors. > - There's ad-hoc code that deals with `FILE*` differently than from memory. 
> > This RFE implements a common utility, `inputStream`, for reading lines from different sources of input (see `FileInput` and `MemoryInput`). We fixed only `ClassListParser` and `CompilerOracle` in this RFE, but we can fix other readers in follow-up RFEs. > > The API allows other sources of input to be implemented. For example, one could implement a `SocketInput` if there's a use case for it. > > In the future, `inputStream` can be extended (or encapsulated in a higher-level reader class) to read typed input tokens (for example, integers, strings, etc.) > > Credit: > The `inputStream` class and friends are contributed by @rose00 . See https://mail.openjdk.org/pipermail/hotspot-dev/2024-April/087077.html . > > John's original version is in the draft PR https://github.com/openjdk/jdk/pull/18773. In order to minimize the size of this PR, I have kept only the functionality for reading a line at a time. Other features, such as pushing back contents into the `inputStream`, could be added in follow-up PRs. (These removed features can be found in the commit history of this PR). Just drive-by comments. src/hotspot/share/cds/classListParser.cpp line 436: > 434: } > 435: > 436: void ClassListParser::check_class_name(const char* class_name) { Did we not already have code to check the length of the class name for the class list parser? There's similar code in systemDictionary. src/hotspot/share/utilities/istream.cpp line 2: > 1: /* > 2: * Copyright (c) 2023, 2024, Oracle and/or its affiliates. All rights reserved. These new files should only say 2024 in the copyright. src/hotspot/share/utilities/istream.hpp line 234: > 232: _line_ending = 0; > 233: _input_state = NTR_STATE; > 234: } This should be an initialization list.
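The last remark refers to C++ member initializer lists. A minimal sketch of the difference follows, assuming the quoted lines are the tail of a constructor body. The class names, the `_capacity` member, the `EOF_STATE` enumerator and all member types are hypothetical; only `_line_ending`, `_input_state` and `NTR_STATE` appear in the quoted snippet.

```cpp
#include <cstddef>

// Only NTR_STATE appears in the review; EOF_STATE is an assumed placeholder.
enum InputState { NTR_STATE, EOF_STATE };

// Style flagged in the review: members are default-initialized first,
// then assigned inside the constructor body.
class LineReaderAssign {
 public:
  LineReaderAssign() {
    _line_ending = 0;
    _input_state = NTR_STATE;
  }
  char line_ending() const { return _line_ending; }
  InputState input_state() const { return _input_state; }

 private:
  char       _line_ending;
  InputState _input_state;
};

// Suggested style: members are initialized directly, in declaration order.
// This is also the only way to set const or reference members.
class LineReaderInit {
 public:
  explicit LineReaderInit(std::size_t capacity)
    : _capacity(capacity),
      _line_ending(0),
      _input_state(NTR_STATE) {}

  std::size_t capacity() const { return _capacity; }
  char line_ending() const { return _line_ending; }
  InputState input_state() const { return _input_state; }

 private:
  const std::size_t _capacity;  // hypothetical member; could not be assigned in the body
  char              _line_ending;
  InputState        _input_state;
};
```

Both classes end up in the same state for the two shared members; the initializer-list form avoids a window in which the members are uninitialized and keeps the initialization order visible next to the declarations.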
------------- PR Review: https://git.openjdk.org/jdk/pull/18833#pullrequestreview-2008679215 PR Review Comment: https://git.openjdk.org/jdk/pull/18833#discussion_r1570587600 PR Review Comment: https://git.openjdk.org/jdk/pull/18833#discussion_r1570581054 PR Review Comment: https://git.openjdk.org/jdk/pull/18833#discussion_r1570585826 From fyang at openjdk.org Thu Apr 18 12:17:02 2024 From: fyang at openjdk.org (Fei Yang) Date: Thu, 18 Apr 2024 12:17:02 GMT Subject: RFR: 8330156: RISC-V: Range check auipc + signed 12 imm instruction [v4] In-Reply-To: References: Message-ID: On Thu, 18 Apr 2024 10:52:20 GMT, Robbin Ehn wrote: >> Hi please consider! >> >> Today we check if the distance is a signed 32. >> As the second instruction have sign bit + 11 bits the, max of such pair is shorter. >> >> Sanity tested > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > Rename TBH, I don't really like this new name `is_valid_auipc_offset` which I think could be misleading. Because it's not an offset for solely `auipc`. I prefer names like `is_valid_32bit_offset` or `riscv_insn_valid_32bit_offset` which is similar to the one in linux. But I will let you guys decide :-) ------------- PR Comment: https://git.openjdk.org/jdk/pull/18755#issuecomment-2063720818 From heidinga at openjdk.org Thu Apr 18 12:44:19 2024 From: heidinga at openjdk.org (Dan Heidinga) Date: Thu, 18 Apr 2024 12:44:19 GMT Subject: RFR: 8320522: Remove code related to `RegisterFinalizersAtInit` [v4] In-Reply-To: References: Message-ID: > Remove the code related to -XX:[+-]RegisterFinalizersAtInit in JDK23. 
> > `make test-tier1` passed with this change Dan Heidinga has updated the pull request incrementally with one additional commit since the last revision: Update copyright ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18823/files - new: https://git.openjdk.org/jdk/pull/18823/files/eb0ba81f..0a34511d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18823&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18823&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18823.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18823/head:pull/18823 PR: https://git.openjdk.org/jdk/pull/18823 From mli at openjdk.org Thu Apr 18 13:19:01 2024 From: mli at openjdk.org (Hamlin Li) Date: Thu, 18 Apr 2024 13:19:01 GMT Subject: RFR: 8330156: RISC-V: Range check auipc + signed 12 imm instruction [v4] In-Reply-To: References: Message-ID: <29pQfcKbIGi1OUrxqp_QGGRqr_mX3nWscznYF9qYfjk=.b413d73f-8bc0-4b9b-ba6e-14dc3f4961a3@github.com> On Thu, 18 Apr 2024 12:11:42 GMT, Fei Yang wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> Rename > > TBH, I don't really like this new name `is_valid_auipc_offset` which I think could be misleading. Because it's not an offset for solely `auipc`. I prefer names like `is_valid_32bit_offset` or `riscv_insn_valid_32bit_offset` which is similar to the one in linux. But I will let you guys decide :-) Seems @RealFYang 's suggestion is better, but I'm fine with either of these versions, as long as no `s12`. 
:) ------------- PR Comment: https://git.openjdk.org/jdk/pull/18755#issuecomment-2063844960 From jkratochvil at openjdk.org Thu Apr 18 13:30:08 2024 From: jkratochvil at openjdk.org (Jan Kratochvil) Date: Thu, 18 Apr 2024 13:30:08 GMT Subject: RFR: 8261242: [Linux] OSContainer::is_containerized() returns true when run outside a container [v2] In-Reply-To: References: Message-ID: <0gUVzigzVfEv1IWV9irog8S3hPme-Aux9fDUWjPO2wc=.fa1648a1-4714-4f74-acc0-22c4250490af@github.com> On Thu, 11 Apr 2024 12:08:02 GMT, Severin Gehwolf wrote: >> Please review this enhancement to the container detection code which allows it to figure out whether the JVM is actually running inside a container (`podman`, `docker`, `crio`), or with some other means that enforces memory/cpu limits by means of the cgroup filesystem. If neither of those conditions hold, the JVM runs in not containerized mode, addressing the issue described in the JBS tracker. For example, on my Linux system `is_containerized() == false" is being indicated with the following trace log line: >> >> >> [0.001s][debug][os,container] OSContainer::init: is_containerized() = false because no cpu or memory limit is present >> >> >> This state is being exposed by the Java `Metrics` API class using the new (still JDK internal) `isContainerized()` method. Example: >> >> >> java -XshowSettings:system --version >> Operating System Metrics: >> Provider: cgroupv1 >> System not containerized. >> openjdk 23-internal 2024-09-17 >> OpenJDK Runtime Environment (fastdebug build 23-internal-adhoc.sgehwolf.jdk-jdk) >> OpenJDK 64-Bit Server VM (fastdebug build 23-internal-adhoc.sgehwolf.jdk-jdk, mixed mode, sharing) >> >> >> The basic property this is being built on is the observation that the cgroup controllers typically get mounted read only into containers. Note that the current container tests assert that `OSContainer::is_containerized() == true` in various tests. 
Therefore, using the heuristic of "is any memory or cpu limit present" isn't sufficient. I had considered that in an earlier iteration, but many container tests failed. >> >> Overall, I think, with this patch we improve the current situation of claiming a containerized system being present when it's actually just a regular Linux system. >> >> Testing: >> >> - [x] GHA (risc-v failure seems infra related) >> - [x] Container tests on Linux x86_64 of cgroups v1 and cgroups v2 (including gtests) >> - [x] Some manual testing using cri-o >> >> Thoughts? > > Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains ten additional commits since the last revision: > > - Merge branch 'master' into jdk-8261242-is-containerized-fix > - jcheck fixes > - Fix tests > - Implement Metrics.isContainerized() > - Some clean-up > - Drop cgroups testing on plain Linux > - Implement fall-back logic for non-ro controller mounts > - Make find_ro static and local to compilation unit > - 8261242: [Linux] OSContainer::is_containerized() returns true Could we not rename `is_containerized()` to `use_container_limit()`, as that is currently the only purpose of `is_containerized()`?
I did not test it but I expect the values will be:

| your patch | my trivial patch | Actual deployment scenario |
|--------|--------|--------|
| `true` | `false` | OpenJDK runs in an unprivileged container **without** a cpu/memory limit |
| `true` | `true` | OpenJDK runs in an unprivileged container **with** a cpu/memory limit |
| `false` | `false` | OpenJDK runs in a privileged container **without** a cpu/memory limit |
| `true` | `true` | OpenJDK runs in a privileged container **with** a cpu/memory limit |
| `false` | `false` | OpenJDK runs in a systemd slice **without** a cpu/memory limit |
| `true` | `true` | OpenJDK runs in a systemd slice **with** a cpu/memory limit |
| `false` | `false` | OpenJDK runs on a physical Linux system (VM or bare metal) |

------------- PR Comment: https://git.openjdk.org/jdk/pull/18201#issuecomment-2063868908 From mli at openjdk.org Thu Apr 18 13:32:21 2024 From: mli at openjdk.org (Hamlin Li) Date: Thu, 18 Apr 2024 13:32:21 GMT Subject: RFR: 8330266: RISC-V: Restore frm to RoundingMode::rne after JNI Message-ID: <7xjdc7l8n2qGfTUsAQVn54ApxGzqwnZGhsaXi1-6Ce8=.6bb2efe2-ee25-4e60-921e-0523154b4761@github.com> Hi, Can you help to review this patch? It's exactly the same as https://github.com/openjdk/jdk/pull/18785, which was withdrawn. The reason could be that I deleted the branch `restore-frm-after-jni` after I issued `integrate` on the PR, but before it was actually integrated by GitHub. Sorry for the inconvenience.
Thanks ------------- Commit messages: - minor refinement - refine code - fix typo - Initial commit Changes: https://git.openjdk.org/jdk/pull/18839/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18839&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8330266 Stats: 25 lines in 5 files changed: 25 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18839.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18839/head:pull/18839 PR: https://git.openjdk.org/jdk/pull/18839 From ayang at openjdk.org Thu Apr 18 13:32:59 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 18 Apr 2024 13:32:59 GMT Subject: RFR: 8320522: Remove code related to `RegisterFinalizersAtInit` [v4] In-Reply-To: References: Message-ID: On Thu, 18 Apr 2024 12:44:19 GMT, Dan Heidinga wrote: >> Remove the code related to -XX:[+-]RegisterFinalizersAtInit in JDK23. >> >> `make test-tier1` passed with this change > > Dan Heidinga has updated the pull request incrementally with one additional commit since the last revision: > > Update copyright src/hotspot/share/oops/instanceKlass.cpp line 1512: > 1510: > 1511: instanceOop InstanceKlass::allocate_instance(TRAPS) { > 1512: bool has_finalizer_flag = has_finalizer(); // Query before possible GC This local var is unused. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18823#discussion_r1570758524 From heidinga at openjdk.org Thu Apr 18 13:49:19 2024 From: heidinga at openjdk.org (Dan Heidinga) Date: Thu, 18 Apr 2024 13:49:19 GMT Subject: RFR: 8320522: Remove code related to `RegisterFinalizersAtInit` [v5] In-Reply-To: References: Message-ID: > Remove the code related to -XX:[+-]RegisterFinalizersAtInit in JDK23. 
> > `make test-tier1` passed with this change Dan Heidinga has updated the pull request incrementally with one additional commit since the last revision: Remove unused variable ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18823/files - new: https://git.openjdk.org/jdk/pull/18823/files/0a34511d..741f1250 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18823&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18823&range=03-04 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18823.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18823/head:pull/18823 PR: https://git.openjdk.org/jdk/pull/18823 From heidinga at openjdk.org Thu Apr 18 13:49:19 2024 From: heidinga at openjdk.org (Dan Heidinga) Date: Thu, 18 Apr 2024 13:49:19 GMT Subject: RFR: 8320522: Remove code related to `RegisterFinalizersAtInit` [v4] In-Reply-To: References: Message-ID: On Thu, 18 Apr 2024 13:29:55 GMT, Albert Mingkun Yang wrote: >> Dan Heidinga has updated the pull request incrementally with one additional commit since the last revision: >> >> Update copyright > > src/hotspot/share/oops/instanceKlass.cpp line 1512: > >> 1510: >> 1511: instanceOop InstanceKlass::allocate_instance(TRAPS) { >> 1512: bool has_finalizer_flag = has_finalizer(); // Query before possible GC > > This local var is unused. Thanks. 
Pushed an update ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18823#discussion_r1570791654 From azafari at openjdk.org Thu Apr 18 13:55:06 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Thu, 18 Apr 2024 13:55:06 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v8] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: On Thu, 18 Apr 2024 09:36:59 GMT, Stefan Karlsson wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> order of params are consistent now. style is corrected. > > src/hotspot/share/runtime/os.cpp line 2185: > >> 2183: char* os::map_memory(int fd, const char* file_name, size_t file_offset, >> 2184: char *addr, size_t bytes, >> 2185: bool read_only, bool allow_exec, MEMFLAGS flags) { > > You should probably move back read_only to the line below. move back means line above? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1570806464 From tonyp at openjdk.org Thu Apr 18 14:07:07 2024 From: tonyp at openjdk.org (Antonios Printezis) Date: Thu, 18 Apr 2024 14:07:07 GMT Subject: RFR: 8330156: RISC-V: Range check auipc + signed 12 imm instruction [v4] In-Reply-To: References: Message-ID: On Thu, 18 Apr 2024 10:52:20 GMT, Robbin Ehn wrote: >> Hi please consider! >> >> Today we check if the distance is a signed 32. >> As the second instruction have sign bit + 11 bits the, max of such pair is shorter. >> >> Sanity tested > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > Rename LGTM ------------- Marked as reviewed by tonyp (Reviewer). 
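The range argument in the 8330156 description (an `auipc` followed by an instruction with a sign-extended 12-bit immediate reaches a slightly different window than a plain signed 32-bit offset) can be sketched as below. This is a minimal illustration of the arithmetic only; the names `kAuipcPairMin`, `kAuipcPairMax`, `fits_signed_32` and `fits_auipc_pair` are hypothetical and are not taken from the patch.

```cpp
#include <cstdint>

// auipc contributes sext(imm20 << 12), i.e. a multiple of 4096 in
// [-2^31, 2^31 - 2^12]. The second instruction adds a signed 12-bit
// immediate in [-2^11, 2^11 - 1]. The sum therefore lies in
// [-2^31 - 2^11, 2^31 - 2^11 - 1] on a 64-bit target.
constexpr int64_t kAuipcPairMin = -(INT64_C(1) << 31) - (INT64_C(1) << 11);
constexpr int64_t kAuipcPairMax =  (INT64_C(1) << 31) - (INT64_C(1) << 11) - 1;

// The check described as "today": any signed 32-bit distance is accepted.
constexpr bool fits_signed_32(int64_t offset) {
  return offset >= INT32_MIN && offset <= INT32_MAX;
}

// The tighter check for the auipc + 12-bit immediate pair (hypothetical helper).
constexpr bool fits_auipc_pair(int64_t offset) {
  return offset >= kAuipcPairMin && offset <= kAuipcPairMax;
}
```

Under these assumptions, the top 2^11 values of the positive signed 32-bit range pass `fits_signed_32` but are not encodable by the pair, which seems to be the case the PR wants to catch.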
PR Review: https://git.openjdk.org/jdk/pull/18755#pullrequestreview-2009008988 From rehn at openjdk.org Thu Apr 18 14:19:29 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 18 Apr 2024 14:19:29 GMT Subject: RFR: 8330156: RISC-V: Range check auipc + signed 12 imm instruction [v5] In-Reply-To: References: Message-ID: > Hi please consider! > > Today we check if the distance is a signed 32. > As the second instruction have sign bit + 11 bits the, max of such pair is shorter. > > Sanity tested Robbin Ehn has updated the pull request incrementally with two additional commits since the last revision: - Added comment - Rename ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18755/files - new: https://git.openjdk.org/jdk/pull/18755/files/b331b8af..de72bf9a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18755&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18755&range=03-04 Stats: 8 lines in 2 files changed: 2 ins; 0 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/18755.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18755/head:pull/18755 PR: https://git.openjdk.org/jdk/pull/18755 From azafari at openjdk.org Thu Apr 18 14:27:37 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Thu, 18 Apr 2024 14:27:37 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v10] In-Reply-To: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: > `MEMFLAGS flag` is used to hold/show the type of the memory regions in NMT. Each call of NMT API requires a search through the list of memory regions. > The Hotspot code reserves/commits/uncommits memory regions and later calls explicitly NMT API with a specific memory type (e.g., `mtGC`, `mtJavaHeap`) for that region. 
Therefore, there are two search in the list of regions per reserve/commit/uncommit operations, one for the operation and another for setting the type of the region. > When the memory type is passed in during reserve/commit/uncommit operations, NMT can use it and avoid the extra search for setting the memory type. > > Tests: tiers1-5 passed on linux-x64, macosx-aarch64 and windows-x64 for debug and non-debug builds. Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: more improvements in style/alignments/adjustments. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18745/files - new: https://git.openjdk.org/jdk/pull/18745/files/769166f8..897b4b30 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18745&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18745&range=08-09 Stats: 26 lines in 6 files changed: 3 ins; 9 del; 14 mod Patch: https://git.openjdk.org/jdk/pull/18745.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18745/head:pull/18745 PR: https://git.openjdk.org/jdk/pull/18745 From azafari at openjdk.org Thu Apr 18 14:27:38 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Thu, 18 Apr 2024 14:27:38 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v8] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: On Thu, 18 Apr 2024 09:28:05 GMT, Stefan Karlsson wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> order of params are consistent now. style is corrected. > > src/hotspot/os/windows/os_windows.cpp line 5068: > >> 5066: char* os::pd_map_memory(int fd, const char* file_name, size_t file_offset, >> 5067: char *addr, size_t bytes, >> 5068: bool read_only, > > This should be reverted. Done. 
> src/hotspot/share/cds/filemap.cpp line 1701: > >> 1699: AlwaysPreTouch ? false : read_only, >> 1700: allow_exec, >> 1701: flags); > > Revert this change Done. > src/hotspot/share/cds/filemap.cpp line 1729: > >> 1727: false /* !read_only */, >> 1728: r->allow_exec(), >> 1729: mtClassShared); > > This mixes styles between multiple arguments per line vs one argument per line. related params (file, memory, info) are in the same line. > src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 217: > >> 215: if (!_heap_region_special) { >> 216: os::commit_memory_or_exit(sh_rs.base(), _initial_size, heap_alignment, !ExecMem, >> 217: "Cannot commit heap memory", mtGC); > > The argument order needs to be changed here and below. Done. > src/hotspot/share/runtime/os.hpp line 520: > >> 518: bool read_only, >> 519: bool allow_exec, >> 520: MEMFLAGS flag); > > Style inconsistency related params (file, memory, flags) are in the same line. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1570872004 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1570871096 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1570870825 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1570870421 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1570868039 From christian.hagedorn at oracle.com Thu Apr 18 14:31:20 2024 From: christian.hagedorn at oracle.com (Christian Hagedorn) Date: Thu, 18 Apr 2024 16:31:20 +0200 Subject: CFV: New HotSpot Group Member: Andrew Dinn In-Reply-To: References: Message-ID: Vote: yes Best regards, Christian On 11.04.24 15:24, Thomas Stuefe wrote: > Hi, > > I hereby nominate Andrew Dinn (adinn) to Membership in the HotSpot Group. > > Andrew is a well-known and respected member of the OpenJDK community. He has > been a contributor since the early days of OpenJDK.
> > The history of his contributions has been mangled by various SCM moves and repo > consolidations over the years [1], but he was one of the original authors of the > arm64 port ([2] shows 359 changes in the mercurial hotspot sub repository > alone), contributed JEP 352 (support for NVM devices under byte buffers), and > more recently has been active in the Graal and the Leyden projects. > > Votes are due by April 25, 2024. > > Only current Members of the HotSpot Group [3] are eligible to vote on this > nomination. Votes must be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [4]. > > Cheers, Thomas > > [1] https://github.com/openjdk/jdk/commits/master/?author=adinn > > [2] https://hg.openjdk.org/aarch64-port/jdk7u/hotspot > > [3] https://openjdk.org/census#members > [4] https://openjdk.org/groups/#member-vote > > From dcubed at openjdk.org Thu Apr 18 14:57:03 2024 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Thu, 18 Apr 2024 14:57:03 GMT Subject: RFR: 8324776: runtime/os/TestTransparentHugePageUsage.java fails with The usage of THP is not enough In-Reply-To: References: Message-ID: On Tue, 16 Apr 2024 08:57:48 GMT, Liming Liu wrote: > This PR remove the testcase introduced in JDK-8315923, as we could not find a reliable way to evaluate the usage of THP. We have tried the following methods: > > 1. transverse /proc/self/smaps rather than looking up the first map covered by the heap, as we found there can be multiple sections in /proc/self/smaps for the heap; (https://github.com/limingliu-ampere/jdk/commit/c5b0c4cdf9fa42988faa9fee6ee004ebb599d40a) > 2. take the mode of de-fragment and the enabling of khugepaged into account rather than just THP mode, as THP may not be available immediately when the de-fragment mode is neither "always" nor "madvise", or khugepaged does not collapse pages; (https://github.com/limingliu-ampere/jdk/commit/9c70e9384325b44e074a9e8973846343b27fd2cc) > 3.
call madvise with MADV_HUGEPAGE unconditionally rather than calling it only when THP mode is not "always", and adjust the sizes of young and old generations to ensure the parameters are aligned with THP; (https://github.com/limingliu-ampere/jdk/commit/de9607ff64cc526bca9968b72a7065888c2f944d) > 4. check the changes of system-wide counters like thp_* in /proc/vmstat before and after pretouch via gtest. (https://github.com/limingliu-ampere/jdk/commit/bc83e19a682156ee7d09bf939c2b18f3d8c79e22) > > But none of them helps. The amount of THP keeps zero on Oracle CI, although the THP mode is "always", the de-fragment mode is "madvise" and khugepaged is enabled. Furthermore, none of thp counters changed around pretouch. However, we tried the same kernel (5.15-UEK) as Oracle CI on our machine, and found that these methods do help. Thus, we decided to remove this testcase. Thanks for the reminder about how `pretouch_thp_and_use_concurrent` is run. I'm good with the mechanics of how you are removing the unreliable test. However, I'm not a Linux THP guy so I'm not the right person to mumble about that mechanism or how it works. Looking thru the integration history for JDK23, it looks like @tstuefe and/or @MBaesken are the folks that most often show up when THP changesets are integrated. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18792#issuecomment-2064102959 From dcubed at openjdk.org Thu Apr 18 15:01:04 2024 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Thu, 18 Apr 2024 15:01:04 GMT Subject: RFR: 8330253: Skip verify_consistent_lock_order when deoptimizing from monitorenter bytecode. 
[v3] In-Reply-To: <6iJeqGCmcJpeoNq0w2XZ-uJwzmTrMGLDKZd_1JLkrfc=.91b9c9ae-d03d-4179-9e88-0ab2e409904a@github.com> References: <6iJeqGCmcJpeoNq0w2XZ-uJwzmTrMGLDKZd_1JLkrfc=.91b9c9ae-d03d-4179-9e88-0ab2e409904a@github.com> Message-ID: <9ZSpg74Yg5oF2NN-0YBFR-Ognpg-QkC-UhTyLteey5E=.ef07b041-f9f1-460a-8c3b-3ef8ac7e3636@github.com> On Thu, 18 Apr 2024 05:38:24 GMT, Axel Boldt-Christmas wrote: >> The verification added in [JDK-8329757](https://bugs.openjdk.org/browse/JDK-8329757) will not work then deoptimization occurs on a monitorenter bytecode. The locking may be in a transitional state. This patch will skip the verification when this occurs. >> >> Currently have only seen this reproduce with JVMTI when deoptimization occurs while a java thread is waiting on a contended monitor. However this could potentially be triggered from a VM entry slow path, so simply checking `current_pending_monitor` could be flaky as well. So instead simply avoid verification. >> >> Running JVMTI reproducer. Starting full testing soon. > > Axel Boldt-Christmas has updated the pull request incrementally with two additional commits since the last revision: > > - Whitespace > - Spelling and typos Thumbs up on the changes themselves. Unless I missed it, I haven't seen updated info on the pre-integration testing in general and specifically about whether Tier8 was executed. ------------- Marked as reviewed by dcubed (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18782#pullrequestreview-2009157370 From dcubed at openjdk.org Thu Apr 18 15:01:05 2024 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Thu, 18 Apr 2024 15:01:05 GMT Subject: RFR: 8330253: Skip verify_consistent_lock_order when deoptimizing from monitorenter bytecode. 
[v3] In-Reply-To: References: Message-ID: On Thu, 18 Apr 2024 05:31:55 GMT, Axel Boldt-Christmas wrote: >> src/hotspot/share/runtime/deoptimization.cpp line 443: >> >>> 441: } >>> 442: #ifdef ASSERT >>> 443: if (LockingMode == LM_LIGHTWEIGHT && !realloc_failures) { >> >> In the new code, you are no longer account for `realloc_failures` being true. >> I'm not convinced that is okay here. > > Originally I had those conditions in as well (to make it more clear and explicit). But Vladimir thought them superfluous. They are implicitly true from `lock_order.is_nonempty() -> LockingMode == LM_LIGHTWEIGHT && !realloc_failures` because we only ever add elements to `lock_order` if `LockingMode == LM_LIGHTWEIGHT && !realloc_failures` is true. Thanks for the clarification. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18782#discussion_r1570927939 From kvn at openjdk.org Thu Apr 18 15:05:11 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 18 Apr 2024 15:05:11 GMT Subject: RFR: 8329433: Reduce nmethod header size [v8] In-Reply-To: References: Message-ID: On Thu, 18 Apr 2024 00:41:03 GMT, Vladimir Kozlov wrote: >> This is part of changes which try to reduce size of `nmethod` and `codeblob` data vs code in CodeCache. >> These changes reduced size of `nmethod` header from 288 to 232 bytes. From 304 to 248 in optimized VM: >> >> Statistics for 1282 bytecoded nmethods for C2: >> total in heap = 5560352 (100%) >> header = 389728 (7.009053%) >> >> vs >> >> Statistics for 1322 bytecoded nmethods for C2: >> total in heap = 8307120 (100%) >> header = 327856 (3.946687%) >> >> >> Several unneeded fields in `nmethod` and `CodeBlob` were removed. Some fields were changed from `int` to `int16_t` with added corresponding asserts to make sure their values are fit into 16 bits. >> >> I did additional cleanup after recent `CompiledMethod` removal. >> >> Tested tier1-7,stress,xcomp and performance testing. 
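The 8329433 description mentions narrowing some fields from `int` to `int16_t` "with added corresponding asserts". A minimal sketch of that pattern follows; `checked_narrow_to_int16` and `CompactHeader` are hypothetical names invented for illustration and do not reflect the actual `nmethod` or `CodeBlob` layout.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper (not a HotSpot API): narrow an int to int16_t and
// assert, in debug builds, that the value is representable in 16 bits.
inline int16_t checked_narrow_to_int16(int value) {
  assert(value >= std::numeric_limits<int16_t>::min() &&
         value <= std::numeric_limits<int16_t>::max() &&
         "value does not fit into 16 bits");
  return static_cast<int16_t>(value);
}

// Hypothetical header: the 16-bit fields are declared next to each other
// so that they share one word instead of each being padded out.
struct CompactHeader {
  int     total_size;      // stays a full int
  int16_t stub_offset;     // narrowed from int
  int16_t handler_offset;  // narrowed from int

  CompactHeader(int size, int stub, int handler)
      : total_size(size),
        stub_offset(checked_narrow_to_int16(stub)),
        handler_offset(checked_narrow_to_int16(handler)) {}
};
```

The assert only guards debug builds, so this style relies on testing to show that real values stay within range.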
> > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Address comment Waiting second review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18768#issuecomment-2064136498 From matsaave at openjdk.org Thu Apr 18 15:27:02 2024 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Thu, 18 Apr 2024 15:27:02 GMT Subject: RFR: 8330388: Remove invokedynamic cache index encoding In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 22:43:38 GMT, Dean Long wrote: >> Before [JDK-8307190](https://bugs.openjdk.org/browse/JDK-8307190), [JDK-8309673](https://bugs.openjdk.org/browse/JDK-8309673), and [JDK-8301995](https://bugs.openjdk.org/browse/JDK-8301995), invokedynamic operands needed to be rewritten to encoded values to better distinguish indy entries from other cp cache entries. The above changes now distinguish between entries with `to_cp_index()` using the bytecode, which is now propagated by the callers. >> >> The encoding flips the bits of the index so the encoded index is always negative, leading to access errors if there is no matching decode call. These calls are removed with some methods adjusted to distinguish between indices with the bytecode. Verified with tier 1-5 tests. > > src/hotspot/share/classfile/resolutionErrors.hpp line 60: > >> 58: >> 59: // This function is used to encode an invokedynamic index to differentiate it from a >> 60: // constant pool index. It assumes it is being called with a index that is less than 0 > > Is this comment still correct? The last sentence is no longer valid, thanks for catching this! 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18819#discussion_r1570969645 From matsaave at openjdk.org Thu Apr 18 15:37:58 2024 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Thu, 18 Apr 2024 15:37:58 GMT Subject: RFR: 8330388: Remove invokedynamic cache index encoding In-Reply-To: <-5i_BDguO1qWOP0GnYK4pTeMMW4IhlV3LkqLPFs4vAw=.060c849a-de1a-4888-943e-80b9ed4eecf2@github.com> References: <-5i_BDguO1qWOP0GnYK4pTeMMW4IhlV3LkqLPFs4vAw=.060c849a-de1a-4888-943e-80b9ed4eecf2@github.com> Message-ID: On Wed, 17 Apr 2024 22:48:16 GMT, Dean Long wrote: > Did you consider minimizing changes by leaving decode_invokedynamic_index/encode_invokedynamic_index calls in place, but having the implementations not change the value? The intention from the start was to remove the encode/decode methods because they have been made unnecessary thanks to the changes mentioned in the description. As the author of the previously mentioned changes that overhauled the cpcache, this change should have been included in one of those PRs, but I must have forgotten! Leaving the calls in even if they did nothing would just make the code confusing and might become a red herring if other issues in the area come up. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18819#issuecomment-2064263944 From duke at openjdk.org Thu Apr 18 16:13:22 2024 From: duke at openjdk.org (Mikhail Ablakatov) Date: Thu, 18 Apr 2024 16:13:22 GMT Subject: RFR: 8322770: Implement C2 VectorizedHashCode on AArch64 In-Reply-To: References: <2VKOC-rT0vOyMcXUX2gs3sOrbZ5H79KBIo50sOOVmyI=.1936f78e-794c-4f54-af3c-b1b97e5fafa8@github.com> Message-ID: On Tue, 16 Apr 2024 12:28:14 GMT, Andrew Haley wrote: >> I can re-check and post the performance numbers here per a request. > Please do. Please also post the code. 
@theRealAph , you may find the performance numbers and the code in https://github.com/mikabl-arm/jdk/commit/f844b116f1a01653f127238d3a258cd2da4e1aca ------------- PR Comment: https://git.openjdk.org/jdk/pull/18487#issuecomment-2064390547 From matsaave at openjdk.org Thu Apr 18 16:22:31 2024 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Thu, 18 Apr 2024 16:22:31 GMT Subject: RFR: 8330388: Remove invokedynamic cache index encoding [v2] In-Reply-To: References: Message-ID: > Before [JDK-8307190](https://bugs.openjdk.org/browse/JDK-8307190), [JDK-8309673](https://bugs.openjdk.org/browse/JDK-8309673), and [JDK-8301995](https://bugs.openjdk.org/browse/JDK-8301995), invokedynamic operands needed to be rewritten to encoded values to better distinguish indy entries from other cp cache entries. The above changes now distinguish between entries with `to_cp_index()` using the bytecode, which is now propagated by the callers. > > The encoding flips the bits of the index so the encoded index is always negative, leading to access errors if there is no matching decode call. These calls are removed with some methods adjusted to distinguish between indices with the bytecode. Verified with tier 1-5 tests. 
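The bit-flip encoding that the 8330388 description refers to can be illustrated as below. This is a sketch under the assumption that the encoding is a plain bitwise complement, as the phrase "flips the bits of the index" suggests; the function names are hypothetical stand-ins rather than the HotSpot declarations being removed.

```cpp
#include <cassert>

// Hypothetical encode step: complementing a non-negative index always
// yields a negative value (0 -> -1, 1 -> -2, ...).
inline int encode_indy_index(int index) {
  assert(index >= 0 && "only non-negative indices are encoded");
  return ~index;
}

// Hypothetical decode step: complementing again restores the original.
inline int decode_indy_index(int encoded) {
  assert(encoded < 0 && "not an encoded index");
  return ~encoded;
}

// An encoded value can be told apart from a plain index by its sign.
inline bool is_encoded_indy_index(int value) {
  return value < 0;
}
```

Using an encoded (negative) value directly as an array index without the matching decode is the kind of access error the description warns about, which is one reason to drop the encoding once the bytecode is available to distinguish the cases.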
Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: Dean and Gilles comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18819/files - new: https://git.openjdk.org/jdk/pull/18819/files/87926aee..3ef92512 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18819&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18819&range=00-01 Stats: 6 lines in 2 files changed: 0 ins; 1 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/18819.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18819/head:pull/18819 PR: https://git.openjdk.org/jdk/pull/18819 From matsaave at openjdk.org Thu Apr 18 16:22:31 2024 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Thu, 18 Apr 2024 16:22:31 GMT Subject: RFR: 8330388: Remove invokedynamic cache index encoding [v2] In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 22:41:08 GMT, Dean Long wrote: >> Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: >> >> Dean and Gilles comments > > src/hotspot/share/ci/ciEnv.cpp line 1513: > >> 1511: // process the BSM >> 1512: int pool_index = indy_info->constant_pool_index(); >> 1513: BootstrapInfo bootstrap_specifier(cp, pool_index, indy_index); > > Why not just change the incoming parameter name to `index`? `indy_index` is used frequently even in functions that are only used in the context of invokedynamic. I think it helps with clarity. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18819#discussion_r1571043673 From sgehwolf at openjdk.org Thu Apr 18 16:38:56 2024 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Thu, 18 Apr 2024 16:38:56 GMT Subject: RFR: 8261242: [Linux] OSContainer::is_containerized() returns true when run outside a container [v2] In-Reply-To: <0gUVzigzVfEv1IWV9irog8S3hPme-Aux9fDUWjPO2wc=.fa1648a1-4714-4f74-acc0-22c4250490af@github.com> References: <0gUVzigzVfEv1IWV9irog8S3hPme-Aux9fDUWjPO2wc=.fa1648a1-4714-4f74-acc0-22c4250490af@github.com> Message-ID: On Thu, 18 Apr 2024 13:27:38 GMT, Jan Kratochvil wrote: > Could not we rename `is_containerized()` to `use_container_limit()` ? As that is the current only purpose of `is_containerized()`. I'm not sure. There is value to have `is_containerized()` like it would behave after this patch. Specifically the first table row difference in [your comment](https://github.com/openjdk/jdk/pull/18201#issuecomment-2063868908) concerns me. JVMs running in a container without limit wouldn't be detected as "containerized". That seems a large share of deployments to miss. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18201#issuecomment-2064487567 From ayang at openjdk.org Thu Apr 18 17:03:05 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 18 Apr 2024 17:03:05 GMT Subject: RFR: 8320522: Remove code related to `RegisterFinalizersAtInit` [v5] In-Reply-To: References: Message-ID: <7XQt7JXCo32enHNq_WaxNgR6zzzZeEzMtULEwnjKhB4=.0a708fda-f81b-4be1-a377-10c60ce3bea2@github.com> On Thu, 18 Apr 2024 13:49:19 GMT, Dan Heidinga wrote: >> Remove the code related to -XX:[+-]RegisterFinalizersAtInit in JDK23. >> >> `make test-tier1` passed with this change > > Dan Heidinga has updated the pull request incrementally with one additional commit since the last revision: > > Remove unused variable Marked as reviewed by ayang (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/18823#pullrequestreview-2009435625 From kbarrett at openjdk.org Thu Apr 18 18:15:57 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 18 Apr 2024 18:15:57 GMT Subject: RFR: 8320522: Remove code related to `RegisterFinalizersAtInit` [v5] In-Reply-To: References: Message-ID: <70Y7stfAlmMLDU5B0aFx8IA429g_yflCWbKYC1Wg4DU=.1bf5bbcb-3059-4357-a78c-40e62a8a18ee@github.com> On Thu, 18 Apr 2024 13:49:19 GMT, Dan Heidinga wrote: >> Remove the code related to -XX:[+-]RegisterFinalizersAtInit in JDK23. >> >> `make test-tier1` passed with this change > > Dan Heidinga has updated the pull request incrementally with one additional commit since the last revision: > > Remove unused variable Looks good. ------------- Marked as reviewed by kbarrett (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18823#pullrequestreview-2009573968 From sviswanathan at openjdk.org Thu Apr 18 18:29:00 2024 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Thu, 18 Apr 2024 18:29:00 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v21] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: On Tue, 16 Apr 2024 00:04:15 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. >> >> Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). 
>> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Add enter() and leave(); remove Windows-specific register stuff @vnkozlov Could you please review this PR from @asgibbons? Looking forward to your inputs. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18555#issuecomment-2064852401 From pchilanomate at openjdk.org Thu Apr 18 19:19:01 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Thu, 18 Apr 2024 19:19:01 GMT Subject: RFR: 8330253: Skip verify_consistent_lock_order when deoptimizing from monitorenter bytecode. [v3] In-Reply-To: <6iJeqGCmcJpeoNq0w2XZ-uJwzmTrMGLDKZd_1JLkrfc=.91b9c9ae-d03d-4179-9e88-0ab2e409904a@github.com> References: <6iJeqGCmcJpeoNq0w2XZ-uJwzmTrMGLDKZd_1JLkrfc=.91b9c9ae-d03d-4179-9e88-0ab2e409904a@github.com> Message-ID: On Thu, 18 Apr 2024 05:38:24 GMT, Axel Boldt-Christmas wrote: >> The verification added in [JDK-8329757](https://bugs.openjdk.org/browse/JDK-8329757) will not work then deoptimization occurs on a monitorenter bytecode. The locking may be in a transitional state. This patch will skip the verification when this occurs. >> >> Currently have only seen this reproduce with JVMTI when deoptimization occurs while a java thread is waiting on a contended monitor. However this could potentially be triggered from a VM entry slow path, so simply checking `current_pending_monitor` could be flaky as well. So instead simply avoid verification. >> >> Running JVMTI reproducer. Starting full testing soon. 
> > Axel Boldt-Christmas has updated the pull request incrementally with two additional commits since the last revision: > > - Whitespace > - Spelling and typos I was playing with a reproducer, maybe it would be good to add it: https://github.com/pchilano/jdk/commit/77a85a0ea0dd650929799d8546322d31b69f92e6 ------------- PR Comment: https://git.openjdk.org/jdk/pull/18782#issuecomment-2065011998 From pchilanomate at openjdk.org Thu Apr 18 19:19:02 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Thu, 18 Apr 2024 19:19:02 GMT Subject: RFR: 8330253: Skip verify_consistent_lock_order when deoptimizing from monitorenter bytecode. [v3] In-Reply-To: References: Message-ID: <2QpelVQltaWXS_Yf-d0Uuu2j2mtiXoLhb8TJRliA3pk=.282244ed-1907-4362-a19d-4cdf6895af49@github.com> On Tue, 16 Apr 2024 03:04:49 GMT, Dean Long wrote: >> Axel Boldt-Christmas has updated the pull request incrementally with two additional commits since the last revision: >> >> - Whitespace >> - Spelling and typos > > src/hotspot/share/runtime/deoptimization.cpp line 451: > >> 449: if (!is_syncronized_entry && bc != Bytecodes::Code::_monitorenter) { >> 450: deoptee_thread->lock_stack().verify_consistent_lock_order(lock_order, exec_mode != Deoptimization::Unpack_none); >> 451: } > > The above checks would also hit the follow false positives: > 1. deopt in counter overflow in prologue, not in monitorenter > 2. monitorenter at bci 0 when raw_bci is -1 (assuming it got past the verifier) > but seems mostly harmless to skip checks in those cases. I thought the original check was fine. Could you elaborate on these 2 cases, I didn't really get them. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18782#discussion_r1571250149 From dlong at openjdk.org Thu Apr 18 21:52:59 2024 From: dlong at openjdk.org (Dean Long) Date: Thu, 18 Apr 2024 21:52:59 GMT Subject: RFR: 8330540: Rename the enum type CompileCommand to CompileCommandEnum In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 21:26:06 GMT, Ioi Lam wrote: > `CompileCommand` is used both as an enum type ([compilerOracle.hpp](https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/compiler/compilerOracle.hpp#L104)), and a global variable ([compiler_globals.hpp](https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/compiler/compiler_globals.hpp#L304)). > > This makes the enum type very awkward to use -- we are forced to use `enum CompileCommand` in the source code whenever a type is needed: > > This simple c++ file illustrates the problem: > > enum class CompileCommand { a, b, c }; > void foo(CompileCommand x) {} > char* CompileCommand; // can no longer use "CompileCommand" as a type > void good(enum CompileCommand x) {} > void bad(CompileCommand x) {} > > $ g++ -c ~/enum.cpp > /home/iklam/enum.cpp:5:6: error: variable or field ‘bad’ declared void > 5 | void bad(CompileCommand x) {} > > > The fix is to rename the enum type to `CompileCommandEnum`. > > This also makes it possible to forward-declare `CompileCommandEnum` (see vmEnum.hpp) without including compilerOracle.hpp. This improves HotSpot build time by reducing the number of .o files that include compilerOracle.hpp from 456 to 16. Marked as reviewed by dlong (Reviewer). The char* CompileCommand for the flag seems to be used much less than the enum (I could only find it used once). So if we could rename that variable, that would be a simpler solution. But it looks like that would require some advanced macro tricks in DECLARE_FLAGS to allow customization on a per-flag basis.
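For comparison with the failing snippet quoted above, here is a minimal sketch of how the rename removes the name clash and permits a forward declaration. Only the names `CompileCommandEnum` and `CompileCommand` come from the thread; `command_ordinal` and the enumerators `a, b, c` are placeholders, and the explicit `: int` underlying type is an assumption made here for the opaque declaration.

```cpp
// Opaque declaration: enough for headers that only pass the enum around.
enum class CompileCommandEnum : int;

// A declaration that needs nothing more than the forward-declared type.
int command_ordinal(CompileCommandEnum cmd);

// Full definition, which would live in a single header.
enum class CompileCommandEnum : int { a, b, c };

// Stand-in for the flag variable, which keeps the name CompileCommand.
// It no longer hides a type of the same name.
const char* CompileCommand = nullptr;

// The type can be named without the elaborated `enum` keyword.
int command_ordinal(CompileCommandEnum cmd) {
  return static_cast<int>(cmd);
}
```

Renaming the variable instead, as suggested in the review comment, would leave the enum name untouched, at the cost of the macro changes the comment describes.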
------------- PR Review: https://git.openjdk.org/jdk/pull/18829#pullrequestreview-2009977919 PR Comment: https://git.openjdk.org/jdk/pull/18829#issuecomment-2065381959 From coleenp at openjdk.org Thu Apr 18 22:00:09 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 18 Apr 2024 22:00:09 GMT Subject: RFR: 8330578: The VM creates instance of abstract class VirtualMachineError Message-ID: It's a bug that the VM creates an instance of the abstract class VirtualMachineError. In the cases where we throw VME, we should throw OOM or StackOverflowError instead. Tested with tier1-4. ------------- Commit messages: - 8330578: The VM creates instance of abstract class VirtualMachineError Changes: https://git.openjdk.org/jdk/pull/18847/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18847&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8330578 Stats: 22 lines in 9 files changed: 3 ins; 9 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/18847.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18847/head:pull/18847 PR: https://git.openjdk.org/jdk/pull/18847 From iklam at openjdk.org Thu Apr 18 22:23:04 2024 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 18 Apr 2024 22:23:04 GMT Subject: RFR: 8314846: Do not store Klass::_secondary_super_cache in CDS archive Message-ID: This bug was found during Leyden development. CDS's `ArchiveBuilder` expects the class metadata to stop mutating while we're inside the CDS dumping safepoint. However, `Klass::_secondary_super_cache` can be updated as a side effect of `Klass::is_subtype_of()`. Currently, we don't call `Klass::is_subtype_of()`inside the CDS safepoint. However, it's likely that future optimizations will make such calls (as being done in the Leyden prototype). When that happens, the CDS dump will fail with a hard-to-debug failure (some class is found inside `_secondary_super_cache` that `ArchiveBuilder` doesn't know about. 
There's no benefit in storing `Klass::_secondary_super_cache` in the CDS archive. So the safest thing to do is to stop scanning it during CDS dumping, and clear it to `nullptr` when the `Klass` is stored in the CDS archive. ------------- Commit messages: - 8314846: Do not store Klass::_secondary_super_cache in CDS archive Changes: https://git.openjdk.org/jdk/pull/18848/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18848&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8314846 Stats: 6 lines in 1 file changed: 5 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18848.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18848/head:pull/18848 PR: https://git.openjdk.org/jdk/pull/18848 From iklam at openjdk.org Fri Apr 19 00:12:00 2024 From: iklam at openjdk.org (Ioi Lam) Date: Fri, 19 Apr 2024 00:12:00 GMT Subject: RFR: 8330578: The VM creates instance of abstract class VirtualMachineError In-Reply-To: References: Message-ID: <8xnrMXls7WBNDuj3VnnzXVaCMXks3uYcDiXWnaIB4Rc=.51c7e158-c3c2-4dfc-8c93-26e171605590@github.com> On Thu, 18 Apr 2024 21:46:36 GMT, Coleen Phillimore wrote: > It's a bug that the VM creates an instance of the abstract class VirtualMachineError. In the cases where we throw VME, we should throw OOM or StackOverflowError instead. > > Tested with tier1-4. LGTM src/hotspot/share/classfile/verifier.cpp line 156: > 154: } > 155: > 156: Inadvertent new line? ------------- Marked as reviewed by iklam (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18847#pullrequestreview-2010166787 PR Review Comment: https://git.openjdk.org/jdk/pull/18847#discussion_r1571539678 From iklam at openjdk.org Fri Apr 19 00:15:17 2024 From: iklam at openjdk.org (Ioi Lam) Date: Fri, 19 Apr 2024 00:15:17 GMT Subject: RFR: 8330540: Rename the enum type CompileCommand to CompileCommandEnum [v2] In-Reply-To: References: Message-ID: <1AIE8cAUAE2tbkBIeFxMpAWw3YrGjpWWislL_vHyfDY=.d815a856-4ffa-44b1-b9d4-ba73a3ece795@github.com> > `CompileCommand` is used both as a enum type ([compilerOracle.hpp](https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/compiler/compilerOracle.hpp#L104)), and a global variable ([compiler_globals.hpp](https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/compiler/compiler_globals.hpp#L304)). > > This makes very awkward to the enum type -- we are forced to use `enum CompileCommand` in the source code whenever a type is needed: > > This simple c++ file illustrates the problem: > > enum class CompileCommand { a, b, c }; > void foo(CompileCommand x) {} > char* CompileCommand; // can no longer use "CompileCommand" as a type > void good(enum CompileCommand x) {} > void bad(CompileCommand x) {} > > $ g++ -c ~/enum.cpp > /home/iklam/enum.cpp:5:6: error: variable or field ‘bad’ declared void > 5 | void bad(CompileCommand x) {} > > > The fix is to rename the enum type to `CompileCommandEnum`. > > This also makes it possible to forward-declare `CompileCommandEnum` (see vmEnum.hpp) without including compilerOracle.hpp. This improves HotSpot build time by reducing the number of .o files that include compilerOracle.hpp from 456 to 16. Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains two additional commits since the last revision: - Merge branch 'master' into 8330540-rename-CompileCommand-to-CompileCommandEnum - 8330540: Rename the enum type CompileCommand to CompileCommandEnum ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18829/files - new: https://git.openjdk.org/jdk/pull/18829/files/b60a498f..9d1191b8 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18829&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18829&range=00-01 Stats: 15582 lines in 205 files changed: 4213 ins; 10559 del; 810 mod Patch: https://git.openjdk.org/jdk/pull/18829.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18829/head:pull/18829 PR: https://git.openjdk.org/jdk/pull/18829 From dlong at openjdk.org Fri Apr 19 00:36:02 2024 From: dlong at openjdk.org (Dean Long) Date: Fri, 19 Apr 2024 00:36:02 GMT Subject: RFR: 8330578: The VM creates instance of abstract class VirtualMachineError In-Reply-To: References: Message-ID: On Thu, 18 Apr 2024 21:46:36 GMT, Coleen Phillimore wrote: > It's a bug that the VM creates an instance of the abstract class VirtualMachineError. In the cases where we throw VME, we should throw OOM or StackOverflowError instead. > > Tested with tier1-4. Marked as reviewed by dlong (Reviewer). src/hotspot/share/memory/universe.cpp line 1095: > 1093: tty->print_cr("Unable to link/verify VirtualMachineError class"); > 1094: return false; // initialization failed > 1095: } Do we still need to link VirtualMachineError above? 
------------- PR Review: https://git.openjdk.org/jdk/pull/18847#pullrequestreview-2010235137 PR Review Comment: https://git.openjdk.org/jdk/pull/18847#discussion_r1571554370 From fyang at openjdk.org Fri Apr 19 00:42:56 2024 From: fyang at openjdk.org (Fei Yang) Date: Fri, 19 Apr 2024 00:42:56 GMT Subject: RFR: 8330266: RISC-V: Restore frm to RoundingMode::rne after JNI In-Reply-To: <7xjdc7l8n2qGfTUsAQVn54ApxGzqwnZGhsaXi1-6Ce8=.6bb2efe2-ee25-4e60-921e-0523154b4761@github.com> References: <7xjdc7l8n2qGfTUsAQVn54ApxGzqwnZGhsaXi1-6Ce8=.6bb2efe2-ee25-4e60-921e-0523154b4761@github.com> Message-ID: <-asV_LnZo_fwbBDtiNCAgBg7Oky_CC82pCvsQnUfNCY=.319d35aa-c357-47f4-ab50-a79bdbe3b352@github.com> On Thu, 18 Apr 2024 13:26:32 GMT, Hamlin Li wrote: > Hi, > Can you help to review this patch? > It's exactly the same as https://github.com/openjdk/jdk/pull/18785 which is withdrawn, the reason could be that I deleted the branch `restore-frm-after-jni` after I `/integrate` the pr, but in fact the deletion happened before it's indeed integrated by github, so it caused the withdraw. > Sorry for the inconvenience. > > Thanks Marked as reviewed by fyang (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18839#pullrequestreview-2010254669 From jwaters at openjdk.org Fri Apr 19 00:43:55 2024 From: jwaters at openjdk.org (Julian Waters) Date: Fri, 19 Apr 2024 00:43:55 GMT Subject: RFR: 8330578: The VM creates instance of abstract class VirtualMachineError In-Reply-To: References: Message-ID: On Thu, 18 Apr 2024 21:46:36 GMT, Coleen Phillimore wrote: > It's a bug that the VM creates an instance of the abstract class VirtualMachineError. In the cases where we throw VME, we should throw OOM or StackOverflowError instead. > > Tested with tier1-4. Looks good! ------------- Marked as reviewed by jwaters (Committer). 
PR Review: https://git.openjdk.org/jdk/pull/18847#pullrequestreview-2010257654 From aboldtch at openjdk.org Fri Apr 19 05:26:26 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Fri, 19 Apr 2024 05:26:26 GMT Subject: RFR: 8330253: Skip verify_consistent_lock_order when deoptimizing from monitorenter bytecode. [v4] In-Reply-To: References: Message-ID: <2DH3TX6JwbtsiLSzKd6MKQmRDjx_qlK48AlQiTQhZWE=.f0095950-1341-45a9-9a3e-ab37ddda5202@github.com> > The verification added in [JDK-8329757](https://bugs.openjdk.org/browse/JDK-8329757) will not work then deoptimization occurs on a monitorenter bytecode. The locking may be in a transitional state. This patch will skip the verification when this occurs. > > Currently have only seen this reproduce with JVMTI when deoptimization occurs while a java thread is waiting on a contended monitor. However this could potentially be triggered from a VM entry slow path, so simply checking `current_pending_monitor` could be flaky as well. So instead simply avoid verification. > > Tested Tier 1-8 + Stress testing reproducers. Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: repro for JDK-8330253 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18782/files - new: https://git.openjdk.org/jdk/pull/18782/files/bf49a93f..10d70ea1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18782&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18782&range=02-03 Stats: 63 lines in 1 file changed: 63 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18782.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18782/head:pull/18782 PR: https://git.openjdk.org/jdk/pull/18782 From aboldtch at openjdk.org Fri Apr 19 05:26:26 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Fri, 19 Apr 2024 05:26:26 GMT Subject: RFR: 8330253: Skip verify_consistent_lock_order when deoptimizing from monitorenter bytecode. 
[v3] In-Reply-To: <9ZSpg74Yg5oF2NN-0YBFR-Ognpg-QkC-UhTyLteey5E=.ef07b041-f9f1-460a-8c3b-3ef8ac7e3636@github.com> References: <6iJeqGCmcJpeoNq0w2XZ-uJwzmTrMGLDKZd_1JLkrfc=.91b9c9ae-d03d-4179-9e88-0ab2e409904a@github.com> <9ZSpg74Yg5oF2NN-0YBFR-Ognpg-QkC-UhTyLteey5E=.ef07b041-f9f1-460a-8c3b-3ef8ac7e3636@github.com> Message-ID: <43AqwG4Qiso63kOkEMC-3WJkiFW51e7xoIxyqCwf4ys=.4a815b39-7942-4c77-a112-39b19e609eb6@github.com> On Thu, 18 Apr 2024 14:58:52 GMT, Daniel D. Daugherty wrote: > Thumbs up on the changes themselves. > > Unless I missed it, I haven't seen updated info on the pre-integration testing in general and specifically about whether Tier8 was executed. Just finished Tier8. Also stress tested the reproducer (both @pchilano and the reproducers from Tier8) > I was playing with a reproducer, maybe it would be good to add it: [pchilano at 77a85a0](https://github.com/pchilano/jdk/commit/77a85a0ea0dd650929799d8546322d31b69f92e6) Great thanks, I'll add it. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18782#issuecomment-2065783940 From rehn at openjdk.org Fri Apr 19 06:28:59 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 19 Apr 2024 06:28:59 GMT Subject: RFR: 8330156: RISC-V: Range check auipc + signed 12 imm instruction [v5] In-Reply-To: References: Message-ID: <1Mh57ZevqKeEezgliYQ40K6HoS-Tgn2dhrlar1shc_U=.ba6775ac-98e4-404d-b75c-b73ebd037b9d@github.com> On Thu, 18 Apr 2024 14:19:29 GMT, Robbin Ehn wrote: >> Hi please consider! >> >> Today we check if the distance is a signed 32. >> As the second instruction have sign bit + 11 bits the, max of such pair is shorter. >> >> Sanity tested > > Robbin Ehn has updated the pull request incrementally with two additional commits since the last revision: > > - Added comment > - Rename I'll go ahead and integrate if there was nothing else. 
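
As a worked illustration of the range issue described in 8330156, the following is a sketch and not the code from the PR. It assumes RV64 semantics: `auipc` adds a sign-extended 32-bit value whose low 12 bits are zero, and the second instruction of the pair adds a signed 12-bit immediate in [-2048, 2047]. The names `kAuipcPairMin`, `kAuipcPairMax`, `fits_signed_32` and `reachable_by_auipc_pair` are hypothetical. Under those assumptions the reachable offsets are [-2^31 - 2^11, 2^31 - 2^11 - 1], which is the signed 32-bit range shifted down by 2^11, so a plain signed-32 check accepts some offsets near the top that the pair cannot encode.

```cpp
#include <cstdint>

// Smallest offset: most negative auipc part (-2^31) plus most negative imm12 (-2^11).
constexpr int64_t kAuipcPairMin = -(INT64_C(1) << 31) - (INT64_C(1) << 11);
// Largest offset: largest auipc part (2^31 - 2^12) plus largest imm12 (2^11 - 1).
constexpr int64_t kAuipcPairMax = (INT64_C(1) << 31) - (INT64_C(1) << 11) - 1;

// The check the description calls "signed 32".
constexpr bool fits_signed_32(int64_t offset) {
  return offset >= INT32_MIN && offset <= INT32_MAX;
}

// The tighter check that matches what the instruction pair can encode.
constexpr bool reachable_by_auipc_pair(int64_t offset) {
  return offset >= kAuipcPairMin && offset <= kAuipcPairMax;
}

// An offset at the very top of the signed 32-bit range passes the old check
// but cannot be formed by the pair.
static_assert(fits_signed_32(INT32_MAX) && !reachable_by_auipc_pair(INT32_MAX),
              "signed-32 check is too permissive at the upper end");
```

The two ranges have the same width; only the upper 2^11 values of the signed 32-bit range are affected by the tighter check.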
------------- PR Comment: https://git.openjdk.org/jdk/pull/18755#issuecomment-2065844573 From stefank at openjdk.org Fri Apr 19 06:45:04 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 19 Apr 2024 06:45:04 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v10] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: On Thu, 18 Apr 2024 14:27:37 GMT, Afshin Zafari wrote: >> `MEMFLAGS flag` is used to hold/show the type of the memory regions in NMT. Each call of NMT API requires a search through the list of memory regions. >> The Hotspot code reserves/commits/uncommits memory regions and later calls explicitly NMT API with a specific memory type (e.g., `mtGC`, `mtJavaHeap`) for that region. Therefore, there are two search in the list of regions per reserve/commit/uncommit operations, one for the operation and another for setting the type of the region. >> When the memory type is passed in during reserve/commit/uncommit operations, NMT can use it and avoid the extra search for setting the memory type. >> >> Tests: tiers1-5 passed on linux-x64, macosx-aarch64 and windows-x64 for debug and non-debug builds. > > Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: > > more improvements in style/alignments/adjustments. Changes requested by stefank (Reviewer). src/hotspot/os/windows/os_windows.cpp line 3401: > 3399: // Reserve memory at an arbitrary address, only if that area is > 3400: // available (and not reserved for something else). > 3401: char* os::pd_attempt_reserve_memory_at(char* addr, size_t bytes, bool exec, MEMFLAGS nmt_flag) { Most of the time the MEMFLAGS parameters are called flag but in some places they are called nmt_flag. Could they always be called flag? This might require some minor renames in some files, but I think that would be OK. 
src/hotspot/share/cds/filemap.cpp line 1726: > 1724: char *base = os::map_memory(_fd, _full_path, r->file_offset(), //file info > 1725: addr, size, //memory info > 1726: false /* !read_only */, r->allow_exec(), mtClassShared); // flags Your new comments are not aligned properly. However, I still don't think you should do this structural change. It makes the code inconsistent with the code below. If you do want to do it, I think you should change the rest of the code as well. But I don't think you should do that in this RFE, so I'd prefer if you just change this to: Suggestion: char *base = os::map_memory(_fd, _full_path, r->file_offset(), addr, size, false /* !read_only */, r->allow_exec(), mtClassShared); src/hotspot/share/gc/shared/cardTable.cpp line 87: > 85: ReservedSpace heap_rs(_byte_map_size, rs_align, _page_size, mtGC); > 86: > 87: os::trace_page_sizes("Card Table", num_bytes, num_bytes, The indention is now messed up here. src/hotspot/share/gc/shared/cardTable.cpp line 176: > 174: old_committed.word_size() - new_committed.word_size()); > 175: bool res = os::uncommit_memory((char*)delta.start(), > 176: delta.byte_size(), !ExecMem, mtGCCardSet); Suggestion: delta.byte_size(), !ExecMem, mtGCCardSet); src/hotspot/share/memory/virtualspace.hpp line 180: > 178: bool _executable; > 179: > 180: MEMFLAGS _nmt_flag; Don't place the variable against the comment below. Suggestion: MEMFLAGS _nmt_flag; src/hotspot/share/memory/virtualspace.hpp line 199: > 197: size_t _upper_alignment; > 198: > 199: Stray blankline src/hotspot/share/nmt/virtualMemoryTracker.hpp line 306: > 304: VirtualMemoryRegion(base, size), _stack(NativeCallStack::empty_stack()), _flag(mtNone) { } > 305: > 306: ReservedMemoryRegion(address base, size_t size, MEMFLAGS flag) : (Commenting here because GitHub doesn't allow me to add comments to unchanged lines). 
I'd like to see a follow-up RFE that makes ReservedMemoryRegion(address base, size_t size) : VirtualMemoryRegion(base, size), _stack(NativeCallStack::empty_stack()), _flag(mtNone) { } a private constructor, so that it is only used for our find operations and never accidentally used for other code. src/hotspot/share/runtime/os.cpp line 2185: > 2183: char* os::map_memory(int fd, const char* file_name, size_t file_offset, // file info > 2184: char *addr, size_t bytes, // memory info > 2185: bool read_only, bool allow_exec, MEMFLAGS flags) { // flags I find it odd to make this documentation and separation only for this function. Could you remove that for this PR? Suggestion: char* os::map_memory(int fd, const char* file_name, size_t file_offset, char *addr, size_t bytes, bool read_only, bool allow_exec, MEMFLAGS flags) { src/hotspot/share/runtime/os.hpp line 232: > 230: static char* pd_map_memory(int fd, const char* file_name, size_t file_offset, > 231: char *addr, size_t bytes, > 232: bool read_only = false, bool allow_exec = false); I'd like to see this reverted.
src/hotspot/share/runtime/os.hpp line 518: > 516: static char* map_memory(int fd, const char* file_name, size_t file_offset, // file info > 517: char *addr, size_t bytes, // memory info > 518: bool read_only, bool allow_exec, MEMFLAGS flag); // flags Suggestion: static char* map_memory(int fd, const char* file_name, size_t file_offset, char *addr, size_t bytes, bool read_only, bool allow_exec, MEMFLAGS flag); ------------- PR Review: https://git.openjdk.org/jdk/pull/18745#pullrequestreview-2010616842 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1571879226 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1571880120 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1571885564 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1571886492 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1571892287 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1571892588 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1571896273 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1571898696 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1571899968 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1571901493 From stuefe at openjdk.org Fri Apr 19 06:53:56 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 19 Apr 2024 06:53:56 GMT Subject: RFR: 8314846: Do not store Klass::_secondary_super_cache in CDS archive In-Reply-To: References: Message-ID: <9dNGixgm_3Wd7oMp_nnIhaxPYcFh9TBZZwCWcvMlQfA=.84222e0f-9f6a-4197-ba37-af95ac3124a3@github.com> On Thu, 18 Apr 2024 22:19:31 GMT, Ioi Lam wrote: > This bug was found during Leyden development. > > CDS's `ArchiveBuilder` expects the class metadata to stop mutating while we're inside the CDS dumping safepoint. However, `Klass::_secondary_super_cache` can be updated as a side effect of `Klass::is_subtype_of()`. 
> > Currently, we don't call `Klass::is_subtype_of()`inside the CDS safepoint. However, it's likely that future optimizations will make such calls (as being done in the Leyden prototype). When that happens, the CDS dump will fail with a hard-to-debug failure (some class is found inside `_secondary_super_cache` that `ArchiveBuilder` doesn't know about. > > There's no benefit in storing `Klass::_secondary_super_cache` in the CDS archive. So the safest thing to do is to stop scanning it during CDS dumping, and clear it to `nullptr` when the `Klass` is stored in the CDS archive. Makes sense. ------------- Marked as reviewed by stuefe (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18848#pullrequestreview-2010653912 From mli at openjdk.org Fri Apr 19 07:37:56 2024 From: mli at openjdk.org (Hamlin Li) Date: Fri, 19 Apr 2024 07:37:56 GMT Subject: RFR: 8330266: RISC-V: Restore frm to RoundingMode::rne after JNI In-Reply-To: <-asV_LnZo_fwbBDtiNCAgBg7Oky_CC82pCvsQnUfNCY=.319d35aa-c357-47f4-ab50-a79bdbe3b352@github.com> References: <7xjdc7l8n2qGfTUsAQVn54ApxGzqwnZGhsaXi1-6Ce8=.6bb2efe2-ee25-4e60-921e-0523154b4761@github.com> <-asV_LnZo_fwbBDtiNCAgBg7Oky_CC82pCvsQnUfNCY=.319d35aa-c357-47f4-ab50-a79bdbe3b352@github.com> Message-ID: On Fri, 19 Apr 2024 00:39:50 GMT, Fei Yang wrote: >> Hi, >> Can you help to review this patch? >> It's exactly the same as https://github.com/openjdk/jdk/pull/18785 which is withdrawn, the reason could be that I deleted the branch `restore-frm-after-jni` after I `/integrate` the pr, but in fact the deletion happened before it's indeed integrated by github, so it caused the withdraw. >> Sorry for the inconvenience. >> >> Thanks > > Marked as reviewed by fyang (Reviewer). Thanks @RealFYang for your quick review! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18839#issuecomment-2065958167 From rrich at openjdk.org Fri Apr 19 07:46:58 2024 From: rrich at openjdk.org (Richard Reingruber) Date: Fri, 19 Apr 2024 07:46:58 GMT Subject: RFR: 8330171: Lazy W^X switch implementation In-Reply-To: <9eymaXovxUNFdkAkzojFQP5trwl_yyY0jE2GzcMEjR4=.02ee2ef9-c476-4c7c-9e4a-e021425c38bc@github.com> References: <9eymaXovxUNFdkAkzojFQP5trwl_yyY0jE2GzcMEjR4=.02ee2ef9-c476-4c7c-9e4a-e021425c38bc@github.com> Message-ID: On Fri, 12 Apr 2024 14:40:05 GMT, Sergey Nazarkin wrote: > An alternative for preemptively switching the W^X thread mode on macOS with an AArch64 CPU. This implementation triggers the switch in response to the SIGBUS signal if the *si_addr* belongs to the CodeCache area. With this approach, it is now feasible to eliminate all WX guards and avoid potentially costly operations. However, no significant improvement or degradation in performance has been observed. Additionally, considering the issue with AsyncGetCallTrace, the patched JVM has been successfully operated with [asgct_bottom](https://github.com/parttimenerd/asgct_bottom) and [async-profiler](https://github.com/async-profiler/async-profiler). > > Additional testing: > - [x] MacOS AArch64 server fastdebug *gtest* > - [ ] MacOS AArch64 server fastdebug *jtreg:hotspot:tier4* > - [ ] Benchmarking > > @apangin and @parttimenerd could you please check the patch on your scenarios? What about granting `WXWrite` only if the current thread is in `_thread_in_vm`? That would be more restrictive and roughly equivalent to how it currently works. Likely there are some places then that should be granted `WXWrite` eagerly because they need `WXWrite` without `_thread_in_vm`. E.g. the JIT compiler threads should have `WXWrite` and never `WXExec` (I assume) which should be checked in the signal handler.
------------- PR Comment: https://git.openjdk.org/jdk/pull/18762#issuecomment-2065983074 From azafari at openjdk.org Fri Apr 19 08:18:00 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Fri, 19 Apr 2024 08:18:00 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v10] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: On Fri, 19 Apr 2024 06:40:35 GMT, Stefan Karlsson wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> more improvements in style/alignments/adjustments. > > src/hotspot/share/runtime/os.hpp line 232: > >> 230: static char* pd_map_memory(int fd, const char* file_name, size_t file_offset, >> 231: char *addr, size_t bytes, >> 232: bool read_only = false, bool allow_exec = false); > > I'd like to see this reverted. Done. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1572006493 From azafari at openjdk.org Fri Apr 19 08:33:16 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Fri, 19 Apr 2024 08:33:16 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v11] In-Reply-To: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: > `MEMFLAGS flag` is used to hold/show the type of the memory regions in NMT. Each call of NMT API requires a search through the list of memory regions. > The Hotspot code reserves/commits/uncommits memory regions and later calls explicitly NMT API with a specific memory type (e.g., `mtGC`, `mtJavaHeap`) for that region. 
Therefore, there are two search in the list of regions per reserve/commit/uncommit operations, one for the operation and another for setting the type of the region. > When the memory type is passed in during reserve/commit/uncommit operations, NMT can use it and avoid the extra search for setting the memory type. > > Tests: tiers1-5 passed on linux-x64, macosx-aarch64 and windows-x64 for debug and non-debug builds. Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: more on alignments. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18745/files - new: https://git.openjdk.org/jdk/pull/18745/files/897b4b30..33f2cf69 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18745&range=10 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18745&range=09-10 Stats: 19 lines in 6 files changed: 3 ins; 1 del; 15 mod Patch: https://git.openjdk.org/jdk/pull/18745.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18745/head:pull/18745 PR: https://git.openjdk.org/jdk/pull/18745 From azafari at openjdk.org Fri Apr 19 08:33:17 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Fri, 19 Apr 2024 08:33:17 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v10] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: On Fri, 19 Apr 2024 06:23:27 GMT, Stefan Karlsson wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> more improvements in style/alignments/adjustments. > > src/hotspot/os/windows/os_windows.cpp line 3401: > >> 3399: // Reserve memory at an arbitrary address, only if that area is >> 3400: // available (and not reserved for something else). 
>> 3401: char* os::pd_attempt_reserve_memory_at(char* addr, size_t bytes, bool exec, MEMFLAGS nmt_flag) { > > Most of the time the MEMFLAGS parameters are called flag but in some places they are called nmt_flag. Could they always be called flag? This might require some minor renames in some files, but I think that would be OK. `reserve_large_pages_individually` has a local `flags` variable and `allocate_pages_individually` has a `flags` param in their definitions. So, the `nmt_flag` is used to avoid confusing different flags while reading the code. Other cases where there would be no ambiguity, the `nmt_flag` is renamed to `flag`. > src/hotspot/share/cds/filemap.cpp line 1726: > >> 1724: char *base = os::map_memory(_fd, _full_path, r->file_offset(), //file info >> 1725: addr, size, //memory info >> 1726: false /* !read_only */, r->allow_exec(), mtClassShared); // flags > > Your new comments are not aligned properly. > > However, I still don't think you should do this structural change. It makes the code inconsistent with the code below. If you do want to do it, I think you should change the rest of the code as well. But I don't think you should do that in this RFE, so I'd prefer if you just change this to: > Suggestion: > > char *base = os::map_memory(_fd, _full_path, r->file_offset(), > addr, size, false /* !read_only */, > r->allow_exec(), mtClassShared); Done. > src/hotspot/share/gc/shared/cardTable.cpp line 87: > >> 85: ReservedSpace heap_rs(_byte_map_size, rs_align, _page_size, mtGC); >> 86: >> 87: os::trace_page_sizes("Card Table", num_bytes, num_bytes, > > The indention is now messed up here. Fixed. > src/hotspot/share/gc/shared/cardTable.cpp line 176: > >> 174: old_committed.word_size() - new_committed.word_size()); >> 175: bool res = os::uncommit_memory((char*)delta.start(), >> 176: delta.byte_size(), !ExecMem, mtGCCardSet); > > Suggestion: > > delta.byte_size(), > !ExecMem, > mtGCCardSet); Done. 
> src/hotspot/share/memory/virtualspace.hpp line 180: > >> 178: bool _executable; >> 179: >> 180: MEMFLAGS _nmt_flag; > > Don't place the variable against the comment below. > Suggestion: > > MEMFLAGS _nmt_flag; Done. > src/hotspot/share/memory/virtualspace.hpp line 199: > >> 197: size_t _upper_alignment; >> 198: >> 199: > > Stray blankline Fixed. > src/hotspot/share/runtime/os.cpp line 2185: > >> 2183: char* os::map_memory(int fd, const char* file_name, size_t file_offset, // file info >> 2184: char *addr, size_t bytes, // memory info >> 2185: bool read_only, bool allow_exec, MEMFLAGS flags) { // flags > > I find it odd to make this documentation and separation only for this function. Could you remove that for this PR? > Suggestion: > > char* os::map_memory(int fd, const char* file_name, size_t file_offset, > char *addr, size_t bytes, bool read_only, > bool allow_exec, MEMFLAGS flags) { Done. > src/hotspot/share/runtime/os.hpp line 518: > >> 516: static char* map_memory(int fd, const char* file_name, size_t file_offset, // file info >> 517: char *addr, size_t bytes, // memory info >> 518: bool read_only, bool allow_exec, MEMFLAGS flag); // flags > > Suggestion: > > static char* map_memory(int fd, const char* file_name, size_t file_offset, > char *addr, size_t bytes, bool read_only, > bool allow_exec, MEMFLAGS flag); Done.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1572020983 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1572021204 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1572021411 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1572021706 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1572021948 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1572022104 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1572024667 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1572024863 From aph at openjdk.org Fri Apr 19 08:41:57 2024 From: aph at openjdk.org (Andrew Haley) Date: Fri, 19 Apr 2024 08:41:57 GMT Subject: RFR: 8314846: Do not store Klass::_secondary_super_cache in CDS archive In-Reply-To: References: Message-ID: <1UuPjroDQZJnTOcSIJP6ZsCnyaOLFqWERwABUW5Gof0=.998de769-5072-47a2-809c-16d2ff13dc8e@github.com> On Thu, 18 Apr 2024 22:19:31 GMT, Ioi Lam wrote: > This bug was found during Leyden development. > > CDS's `ArchiveBuilder` expects the class metadata to stop mutating while we're inside the CDS dumping safepoint. However, `Klass::_secondary_super_cache` can be updated as a side effect of `Klass::is_subtype_of()`. > > Currently, we don't call `Klass::is_subtype_of()`inside the CDS safepoint. However, it's likely that future optimizations will make such calls (as being done in the Leyden prototype). When that happens, the CDS dump will fail with a hard-to-debug failure (some class is found inside `_secondary_super_cache` that `ArchiveBuilder` doesn't know about. > > There's no benefit in storing `Klass::_secondary_super_cache` in the CDS archive. So the safest thing to do is to stop scanning it during CDS dumping, and clear it to `nullptr` when the `Klass` is stored in the CDS archive. Looks good. 
Hopefully `_secondary_super_cache` will soon be gone, but it'll be a while before all ports are done. ------------- Marked as reviewed by aph (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18848#pullrequestreview-2010867541 From azafari at openjdk.org Fri Apr 19 08:47:00 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Fri, 19 Apr 2024 08:47:00 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v10] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: On Fri, 19 Apr 2024 06:37:41 GMT, Stefan Karlsson wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> more improvements in style/alignments/adjustments. > > src/hotspot/share/nmt/virtualMemoryTracker.hpp line 306: > >> 304: VirtualMemoryRegion(base, size), _stack(NativeCallStack::empty_stack()), _flag(mtNone) { } >> 305: >> 306: ReservedMemoryRegion(address base, size_t size, MEMFLAGS flag) : > > (Commenting here because GitHub doesn't allow me to add comments to unchanged lines). I'd like to see a follow-up RFE that makes > > ReservedMemoryRegion(address base, size_t size) : > VirtualMemoryRegion(base, size), _stack(NativeCallStack::empty_stack()), _flag(mtNone) { } > > a private constructor, so that it is only used for our find operations and never accidentally used for other code. https://bugs.openjdk.org/browse/JDK-8330627 is created for this. 
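
The follow-up requested here (a lookup-only constructor that other code cannot call) can be illustrated with a toy model. This is a sketch of the general idiom only, not the NMT classes and not what JDK-8330627 will necessarily do: `MemTag`, `Region` and `RegionTable` are hypothetical stand-ins, and regions are keyed by base address alone, which is simpler than the real tracker.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>

// Hypothetical stand-in for a memory-type tag.
enum class MemTag { None, GC, JavaHeap };

class Region {
 public:
  // Public constructor: callers must always state the memory type.
  Region(uintptr_t base, size_t size, MemTag tag)
      : _base(base), _size(size), _tag(tag) {}

  uintptr_t base() const { return _base; }
  size_t size() const { return _size; }
  MemTag tag() const { return _tag; }

  // Toy ordering by base address only.
  bool operator<(const Region& other) const { return _base < other._base; }

 private:
  friend class RegionTable;
  // Untyped "search key" constructor: reachable only from RegionTable,
  // so no other code can create a region without a memory type.
  Region(uintptr_t base, size_t size)
      : _base(base), _size(size), _tag(MemTag::None) {}

  uintptr_t _base;
  size_t _size;
  MemTag _tag;
};

class RegionTable {
 public:
  void add(uintptr_t base, size_t size, MemTag tag) {
    _regions.insert(Region(base, size, tag));
  }

  // Builds a temporary key with the private constructor to do the lookup.
  const Region* find(uintptr_t base) const {
    auto it = _regions.find(Region(base, 0));
    return it == _regions.end() ? nullptr : &*it;
  }

 private:
  std::set<Region> _regions;
};
```

With this shape, an attempt to write `Region(base, size)` outside `RegionTable` fails to compile, which is the property the review comment asks for.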
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1572041557 From stefank at openjdk.org Fri Apr 19 08:51:00 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 19 Apr 2024 08:51:00 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v10] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: On Fri, 19 Apr 2024 08:27:09 GMT, Afshin Zafari wrote: >> src/hotspot/os/windows/os_windows.cpp line 3401: >> >>> 3399: // Reserve memory at an arbitrary address, only if that area is >>> 3400: // available (and not reserved for something else). >>> 3401: char* os::pd_attempt_reserve_memory_at(char* addr, size_t bytes, bool exec, MEMFLAGS nmt_flag) { >> >> Most of the time the MEMFLAGS parameters are called flag but in some places they are called nmt_flag. Could they always be called flag? This might require some minor renames in some files, but I think that would be OK. > > `reserve_large_pages_individually` has a local `flags` variable and > `allocate_pages_individually` has a `flags` param in their definitions. So, the `nmt_flag` is used to avoid confusing different flags while reading the code. > Other cases where there would be no ambiguity, the `nmt_flag` is renamed to `flag`. Could you instead rename the local 'flags' variable instead? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1572047252 From sgehwolf at openjdk.org Fri Apr 19 09:00:00 2024 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Fri, 19 Apr 2024 09:00:00 GMT Subject: RFR: 8261242: [Linux] OSContainer::is_containerized() returns true when run outside a container [v2] In-Reply-To: References: Message-ID: <7GjtMGXbf3rZWHSCnweq2vA_50PWZYL5aRslymkysP0=.fb73e955-0fe3-4a62-a428-ee63d9f819dd@github.com> On Thu, 11 Apr 2024 12:08:02 GMT, Severin Gehwolf wrote: >> Please review this enhancement to the container detection code which allows it to figure out whether the JVM is actually running inside a container (`podman`, `docker`, `crio`), or with some other means that enforces memory/cpu limits by means of the cgroup filesystem. If neither of those conditions holds, the JVM runs in non-containerized mode, addressing the issue described in the JBS tracker. For example, on my Linux system `is_containerized() == false` is being indicated with the following trace log line: >> >> >> [0.001s][debug][os,container] OSContainer::init: is_containerized() = false because no cpu or memory limit is present >> >> >> This state is being exposed by the Java `Metrics` API class using the new (still JDK internal) `isContainerized()` method. Example: >> >> >> java -XshowSettings:system --version >> Operating System Metrics: >> Provider: cgroupv1 >> System not containerized. >> openjdk 23-internal 2024-09-17 >> OpenJDK Runtime Environment (fastdebug build 23-internal-adhoc.sgehwolf.jdk-jdk) >> OpenJDK 64-Bit Server VM (fastdebug build 23-internal-adhoc.sgehwolf.jdk-jdk, mixed mode, sharing) >> >> >> The basic property this is being built on is the observation that the cgroup controllers typically get mounted read only into containers. Note that the current container tests assert that `OSContainer::is_containerized() == true` in various tests.
Therefore, using the heuristic of "is any memory or cpu limit present" isn't sufficient. I had considered that in an earlier iteration, but many container tests failed. >> >> Overall, I think, with this patch we improve the current situation of claiming a containerized system being present when it's actually just a regular Linux system. >> >> Testing: >> >> - [x] GHA (risc-v failure seems infra related) >> - [x] Container tests on Linux x86_64 of cgroups v1 and cgroups v2 (including gtests) >> - [x] Some manual testing using cri-o >> >> Thoughts? > > Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains ten additional commits since the last revision: > > - Merge branch 'master' into jdk-8261242-is-containerized-fix > - jcheck fixes > - Fix tests > - Implement Metrics.isContainerized() > - Some clean-up > - Drop cgroups testing on plain Linux > - Implement fall-back logic for non-ro controller mounts > - Make find_ro static and local to compilation unit > - 8261242: [Linux] OSContainer::is_containerized() returns true Thanks for your input Larry! ------------- PR Comment: https://git.openjdk.org/jdk/pull/18201#issuecomment-2066136268 From rehn at openjdk.org Fri Apr 19 09:00:57 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 19 Apr 2024 09:00:57 GMT Subject: RFR: 8330266: RISC-V: Restore frm to RoundingMode::rne after JNI In-Reply-To: <7xjdc7l8n2qGfTUsAQVn54ApxGzqwnZGhsaXi1-6Ce8=.6bb2efe2-ee25-4e60-921e-0523154b4761@github.com> References: <7xjdc7l8n2qGfTUsAQVn54ApxGzqwnZGhsaXi1-6Ce8=.6bb2efe2-ee25-4e60-921e-0523154b4761@github.com> Message-ID: On Thu, 18 Apr 2024 13:26:32 GMT, Hamlin Li wrote: > Hi, > Can you help to review this patch? 
> It's exactly the same as https://github.com/openjdk/jdk/pull/18785 which is withdrawn, the reason could be that I deleted the branch `restore-frm-after-jni` after I `/integrate` the pr, but in fact the deletion happened before it's indeed integrated by github, so it caused the withdraw. > Sorry for the inconvenience. > > Thanks Thanks! But the flag RestoreMXCSROnJNICalls is actually "x86 specific". We should add a new flag that alias this, such as "RestoreFPModeOnJNICalls". Or whatever is good generic name. ------------- Marked as reviewed by rehn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18839#pullrequestreview-2010907026 From rehn at openjdk.org Fri Apr 19 09:03:04 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 19 Apr 2024 09:03:04 GMT Subject: RFR: JDK-8320892: AArch64: Restore FPU control state after JNI [v3] In-Reply-To: References: <-Jv5Xvre3lonydwQ5uzYN3QB8V0VIuORIhM1RtIdW5g=.06167df9-7268-4945-8e18-a04d19ee97e1@github.com> Message-ID: On Tue, 5 Dec 2023 09:44:24 GMT, Andrew Haley wrote: >>> @theRealAph the `RestoreMXCSROnJNICalls` flag is a product flag not diagnostic. >> >> Ah, thanks, >> >>> Aliased flags are setup in arguments.cpp by editing this: >> >> OK. How about we split this into two, this first part without a CSR, and the second part, which creates the generic alias, with one? That way we can mitigate a live problem in this release. > >> > @theRealAph the `RestoreMXCSROnJNICalls` flag is a product flag not diagnostic. >> >> Ah, thanks, >> >> > Aliased flags are setup in arguments.cpp by editing this: >> >> OK. How about we split this into two, this first part without a CSR, and the second part, which creates the generic alias, with one? That way we can mitigate a live problem in this release. > > Please? One day left. @theRealAph IMHO we should add a new flag aliasing the old. Then deprecate MXCSR flag. 
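A minimal sketch of the aliasing idea, assuming a simple name-to-name table. The struct, the table, and `resolve_flag_name` are illustrative names and are not copied from arguments.cpp; `RestoreFPModeOnJNICalls` is only the name floated in this thread, and which spelling ends up as the alias is a decision for the follow-up change.

```c++
#include <cassert>
#include <cstring>

// Illustrative alias entry; the struct and function names are assumptions,
// not copied from arguments.cpp.
struct FlagAlias {
  const char* alias_name;  // spelling accepted on the command line
  const char* real_name;   // flag that actually holds the value
};

// RestoreFPModeOnJNICalls is only the name suggested in this thread.
static const FlagAlias flag_aliases[] = {
  { "RestoreFPModeOnJNICalls", "RestoreMXCSROnJNICalls" },
  { nullptr, nullptr }  // terminator
};

// Map an alias to the flag it stands for; other names pass through unchanged.
const char* resolve_flag_name(const char* name) {
  for (const FlagAlias* a = flag_aliases; a->alias_name != nullptr; ++a) {
    if (std::strcmp(a->alias_name, name) == 0) {
      return a->real_name;
    }
  }
  return name;
}
```

With a table like this, both spellings would reach the same underlying flag, so the old name could keep working while it goes through deprecation.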
------------- PR Comment: https://git.openjdk.org/jdk/pull/16851#issuecomment-2066140616 From mli at openjdk.org Fri Apr 19 09:08:57 2024 From: mli at openjdk.org (Hamlin Li) Date: Fri, 19 Apr 2024 09:08:57 GMT Subject: RFR: 8330266: RISC-V: Restore frm to RoundingMode::rne after JNI In-Reply-To: <7xjdc7l8n2qGfTUsAQVn54ApxGzqwnZGhsaXi1-6Ce8=.6bb2efe2-ee25-4e60-921e-0523154b4761@github.com> References: <7xjdc7l8n2qGfTUsAQVn54ApxGzqwnZGhsaXi1-6Ce8=.6bb2efe2-ee25-4e60-921e-0523154b4761@github.com> Message-ID: On Thu, 18 Apr 2024 13:26:32 GMT, Hamlin Li wrote: > Hi, > Can you help to review this patch? > It's exactly the same as https://github.com/openjdk/jdk/pull/18785 which is withdrawn, the reason could be that I deleted the branch `restore-frm-after-jni` after I `/integrate` the pr, but in fact the deletion happened before it's indeed integrated by github, so it caused the withdraw. > Sorry for the inconvenience. > > Thanks In fact, `RestoreMXCSROnJNICalls` also used on aarch64. I agree it's not a good name, I think it's better to have a uniform name across platforms, so maybe we should modify the name together in another pr after this pr? ------------- PR Comment: https://git.openjdk.org/jdk/pull/18839#issuecomment-2066146833 From rehn at openjdk.org Fri Apr 19 09:08:59 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 19 Apr 2024 09:08:59 GMT Subject: RFR: 8330266: RISC-V: Restore frm to RoundingMode::rne after JNI In-Reply-To: References: <7xjdc7l8n2qGfTUsAQVn54ApxGzqwnZGhsaXi1-6Ce8=.6bb2efe2-ee25-4e60-921e-0523154b4761@github.com> Message-ID: On Fri, 19 Apr 2024 09:04:00 GMT, Hamlin Li wrote: > In fact, `RestoreMXCSROnJNICalls` also used on aarch64. I agree it's not a good name, I think it's better to have a uniform name across platforms, so maybe we should modify the name together in another pr after this pr? Yes, I know. Andrew pointed that out in aarch64 PR. Yes, we should add a new flag aliasing the old flag. 
(see arguments.cpp) Then start the process of deprecating the old flag. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18839#issuecomment-2066150874 From azafari at openjdk.org Fri Apr 19 09:09:29 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Fri, 19 Apr 2024 09:09:29 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v12] In-Reply-To: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: > `MEMFLAGS flag` is used to hold/show the type of the memory regions in NMT. Each call of NMT API requires a search through the list of memory regions. > The Hotspot code reserves/commits/uncommits memory regions and later calls explicitly NMT API with a specific memory type (e.g., `mtGC`, `mtJavaHeap`) for that region. Therefore, there are two search in the list of regions per reserve/commit/uncommit operations, one for the operation and another for setting the type of the region. > When the memory type is passed in during reserve/commit/uncommit operations, NMT can use it and avoid the extra search for setting the memory type. > > Tests: tiers1-5 passed on linux-x64, macosx-aarch64 and windows-x64 for debug and non-debug builds. Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: local/param `flags` renamed to `alloc_type` to let have `MEMFLAGS flag` param.
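To illustrate the "two searches versus one" point in the description above, here is a toy model. `ToyTracker`, `Region`, and `MemType` are hypothetical stand-ins and not the NMT types; the only thing the sketch tries to show is that tagging a region in a separate step costs a second walk of the region list.

```c++
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-ins; none of these names are the real NMT types.
enum class MemType { None, GC, JavaHeap };

struct Region {
  std::size_t base;
  std::size_t size;
  MemType type;
};

struct ToyTracker {
  std::vector<Region> regions;  // kept sorted by base address
  int searches = 0;             // number of list walks performed

  // Walk the list to the first region whose base is >= the given base.
  std::size_t find(std::size_t base) {
    ++searches;
    std::size_t i = 0;
    while (i < regions.size() && regions[i].base < base) {
      ++i;
    }
    return i;
  }

  // Reserve and tag in one walk when the type is known up front.
  void reserve(std::size_t base, std::size_t size, MemType type = MemType::None) {
    std::size_t i = find(base);
    regions.insert(regions.begin() + static_cast<std::ptrdiff_t>(i),
                   Region{base, size, type});
  }

  // Separate tagging step: needs a second walk to locate the region.
  void set_type(std::size_t base, MemType type) {
    std::size_t i = find(base);
    if (i < regions.size() && regions[i].base == base) {
      regions[i].type = type;
    }
  }
};
```

Calling `reserve(base, size)` followed by `set_type(base, type)` walks the list twice, while `reserve(base, size, type)` walks it once.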
------------- Changes: - all: https://git.openjdk.org/jdk/pull/18745/files - new: https://git.openjdk.org/jdk/pull/18745/files/33f2cf69..2989e3a3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18745&range=11 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18745&range=10-11 Stats: 12 lines in 1 file changed: 0 ins; 0 del; 12 mod Patch: https://git.openjdk.org/jdk/pull/18745.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18745/head:pull/18745 PR: https://git.openjdk.org/jdk/pull/18745 From azafari at openjdk.org Fri Apr 19 09:09:29 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Fri, 19 Apr 2024 09:09:29 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v10] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: On Fri, 19 Apr 2024 08:48:36 GMT, Stefan Karlsson wrote: >> `reserve_large_pages_individually` has a local `flags` variable and >> `allocate_pages_individually` has a `flags` param in their definitions. So, the `nmt_flag` is used to avoid confusing different flags while reading the code. >> Other cases where there would be no ambiguity, the `nmt_flag` is renamed to `flag`. > > Could you instead rename the local 'flags' variable instead? Done. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1572069037 From mli at openjdk.org Fri Apr 19 09:31:57 2024 From: mli at openjdk.org (Hamlin Li) Date: Fri, 19 Apr 2024 09:31:57 GMT Subject: RFR: 8330266: RISC-V: Restore frm to RoundingMode::rne after JNI In-Reply-To: <7xjdc7l8n2qGfTUsAQVn54ApxGzqwnZGhsaXi1-6Ce8=.6bb2efe2-ee25-4e60-921e-0523154b4761@github.com> References: <7xjdc7l8n2qGfTUsAQVn54ApxGzqwnZGhsaXi1-6Ce8=.6bb2efe2-ee25-4e60-921e-0523154b4761@github.com> Message-ID: On Thu, 18 Apr 2024 13:26:32 GMT, Hamlin Li wrote: > Hi, > Can you help to review this patch? 
> It's exactly the same as https://github.com/openjdk/jdk/pull/18785 which is withdrawn, the reason could be that I deleted the branch `restore-frm-after-jni` after I `/integrate` the pr, but in fact the deletion happened before it's indeed integrated by github, so it caused the withdraw. > Sorry for the inconvenience. > > Thanks OK, I created https://bugs.openjdk.org/browse/JDK-8330634 to track RestoreMXCSROnJNICalls renaming ------------- PR Comment: https://git.openjdk.org/jdk/pull/18839#issuecomment-2066189757 From stefank at openjdk.org Fri Apr 19 09:36:02 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 19 Apr 2024 09:36:02 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v12] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: On Fri, 19 Apr 2024 09:09:29 GMT, Afshin Zafari wrote: >> `MEMFLAGS flag` is used to hold/show the type of the memory regions in NMT. Each call of NMT API requires a search through the list of memory regions. >> The Hotspot code reserves/commits/uncommits memory regions and later calls explicitly NMT API with a specific memory type (e.g., `mtGC`, `mtJavaHeap`) for that region. Therefore, there are two search in the list of regions per reserve/commit/uncommit operations, one for the operation and another for setting the type of the region. >> When the memory type is passed in during reserve/commit/uncommit operations, NMT can use it and avoid the extra search for setting the memory type. >> >> Tests: tiers1-5 passed on linux-x64, macosx-aarch64 and windows-x64 for debug and non-debug builds. > > Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: > > local/param `flags` renamed to `alloc_type` to let have `MEMFLAGS flag` param. There's one tiny nit left, from my POV, otherwise this looks good to me. 
src/hotspot/share/memory/virtualspace.cpp line 709: > 707: assert(max_commit_granularity > 0, "Granularity must be non-zero."); > 708: > 709: This blankline should be reverted. ------------- Marked as reviewed by stefank (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18745#pullrequestreview-2010969083 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1572099614 From mli at openjdk.org Fri Apr 19 09:37:56 2024 From: mli at openjdk.org (Hamlin Li) Date: Fri, 19 Apr 2024 09:37:56 GMT Subject: RFR: 8330266: RISC-V: Restore frm to RoundingMode::rne after JNI In-Reply-To: References: <7xjdc7l8n2qGfTUsAQVn54ApxGzqwnZGhsaXi1-6Ce8=.6bb2efe2-ee25-4e60-921e-0523154b4761@github.com> Message-ID: On Fri, 19 Apr 2024 09:29:06 GMT, Hamlin Li wrote: > OK, I created https://bugs.openjdk.org/browse/JDK-8330634 to track RestoreMXCSROnJNICalls renaming Oh, I found https://bugs.openjdk.org/browse/JDK-8321535 is already created, and I think Andrew will work on it. Maybe I should close https://bugs.openjdk.org/browse/JDK-8330634 as duplicate. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18839#issuecomment-2066199388 From azafari at openjdk.org Fri Apr 19 09:49:33 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Fri, 19 Apr 2024 09:49:33 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v13] In-Reply-To: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: > `MEMFLAGS flag` is used to hold/show the type of the memory regions in NMT. Each call of NMT API requires a search through the list of memory regions. > The Hotspot code reserves/commits/uncommits memory regions and later calls explicitly NMT API with a specific memory type (e.g., `mtGC`, `mtJavaHeap`) for that region. 
Therefore, there are two search in the list of regions per reserve/commit/uncommit operations, one for the operation and another for setting the type of the region. > When the memory type is passed in during reserve/commit/uncommit operations, NMT can use it and avoid the extra search for setting the memory type. > > Tests: tiers1-5 passed on linux-x64, macosx-aarch64 and windows-x64 for debug and non-debug builds. Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: removed extra blank line. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18745/files - new: https://git.openjdk.org/jdk/pull/18745/files/2989e3a3..fa350261 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18745&range=12 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18745&range=11-12 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18745.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18745/head:pull/18745 PR: https://git.openjdk.org/jdk/pull/18745 From azafari at openjdk.org Fri Apr 19 09:49:34 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Fri, 19 Apr 2024 09:49:34 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v12] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: <5yX9I1JoQY9gmxbIvTDPxuxQSu37KHG0LzlL7cq-3iQ=.38c06bf3-699b-466c-b934-aefedb37b17b@github.com> On Fri, 19 Apr 2024 09:31:12 GMT, Stefan Karlsson wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> local/param `flags` renamed to `alloc_type` to let have `MEMFLAGS flag` param. > > src/hotspot/share/memory/virtualspace.cpp line 709: > >> 707: assert(max_commit_granularity > 0, "Granularity must be non-zero."); >> 708: >> 709: > > This blankline should be reverted. Done. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1572118687 From rehn at openjdk.org Fri Apr 19 10:10:09 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 19 Apr 2024 10:10:09 GMT Subject: Integrated: 8330156: RISC-V: Range check auipc + signed 12 imm instruction In-Reply-To: References: Message-ID: On Fri, 12 Apr 2024 10:41:39 GMT, Robbin Ehn wrote: > Hi please consider! > > Today we check if the distance fits in a signed 32-bit value. > As the second instruction has a sign bit + 11 bits, the maximum distance such a pair can reach is shorter. > > Sanity tested This pull request has now been integrated. Changeset: 8990864a Author: Robbin Ehn URL: https://git.openjdk.org/jdk/commit/8990864a53fa04f44ecf8bff65a6dc9cdd67cb1c Stats: 13 lines in 2 files changed: 8 ins; 0 del; 5 mod 8330156: RISC-V: Range check auipc + signed 12 imm instruction Reviewed-by: fyang, mli, tonyp ------------- PR: https://git.openjdk.org/jdk/pull/18755 From mli at openjdk.org Fri Apr 19 10:12:05 2024 From: mli at openjdk.org (Hamlin Li) Date: Fri, 19 Apr 2024 10:12:05 GMT Subject: Integrated: 8330266: RISC-V: Restore frm to RoundingMode::rne after JNI In-Reply-To: <7xjdc7l8n2qGfTUsAQVn54ApxGzqwnZGhsaXi1-6Ce8=.6bb2efe2-ee25-4e60-921e-0523154b4761@github.com> References: <7xjdc7l8n2qGfTUsAQVn54ApxGzqwnZGhsaXi1-6Ce8=.6bb2efe2-ee25-4e60-921e-0523154b4761@github.com> Message-ID: On Thu, 18 Apr 2024 13:26:32 GMT, Hamlin Li wrote: > Hi, > Can you help to review this patch? > It's exactly the same as https://github.com/openjdk/jdk/pull/18785 which is withdrawn, the reason could be that I deleted the branch `restore-frm-after-jni` after I `/integrate` the pr, but in fact the deletion happened before it's indeed integrated by github, so it caused the withdraw. > Sorry for the inconvenience. > > Thanks This pull request has now been integrated.
Changeset: 85261bce Author: Hamlin Li URL: https://git.openjdk.org/jdk/commit/85261bcebc1903d9f05523bfb9c1b25d7f1fd8b6 Stats: 25 lines in 5 files changed: 25 ins; 0 del; 0 mod 8330266: RISC-V: Restore frm to RoundingMode::rne after JNI Reviewed-by: fyang, rehn ------------- PR: https://git.openjdk.org/jdk/pull/18839 From jsjolen at openjdk.org Fri Apr 19 11:01:19 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Fri, 19 Apr 2024 11:01:19 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v39] In-Reply-To: References: Message-ID: > Hi, > > This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix. > > ## `MemoryFileTracker` > > The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: > > ```c++ > static MemoryFile* make_device(const char* descriptive_name); > static void free_device(MemoryFile* device); > > static void allocate_memory(MemoryFile* device, size_t offset, size_t size, > MEMFLAGS flag, const NativeCallStack& stack); > static void free_memory(MemoryFile* device, size_t offset, size_t size); > ``` > > It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: > > ```c++ > void ZNMT::reserve(zaddress_unsafe start, size_t size) { > MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); > } > void ZNMT::commit(zoffset offset, size_t size) { > MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC); > } > void ZNMT::uncommit(zoffset offset, size_t size) { > MemTracker::free_memory_in(ZNMT::_device,
(size_t)offset, size); > } > > void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { > // NMT doesn't track mappings at the moment. > } > void ZNMT::unmap(zaddress_unsafe addr, size_t size) { > // NMT doesn't track mappings at the moment. > } > ``` > > As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. > > This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: > > 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this...
Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: Just give the stack some random size_ts ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18289/files - new: https://git.openjdk.org/jdk/pull/18289/files/8e3fd751..b66ba2ca Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=38 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=37-38 Stats: 4 lines in 1 file changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/18289.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289 PR: https://git.openjdk.org/jdk/pull/18289 From jsjolen at openjdk.org Fri Apr 19 11:52:01 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Fri, 19 Apr 2024 11:52:01 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v34] In-Reply-To: References: <4bHEEa5QHK6rN2pH6ftWYE---OlmvQNQ_FjJsfweYjI=.ad832040-2d3c-4847-ba1f-33b9d8cf8c9f@github.com> Message-ID: On Wed, 17 Apr 2024 08:20:49 GMT, Thomas Stuefe wrote: >> The acceptability of recursion fully depends on the tree being balanced as this leads to a call stack depth of O(log n). split and merge have recursive self-calls in non-tail positions, so I doubt that this will be optimized out. >> >> A treap with a good RNG cannot be worse than something like `4*log_2(n)`. >> >> I see two ways forward: >> >> 1. There are iterative ways of creating a Treap, we could use those. It's a bit more work. I'll have to do research. >> 2. Reify the callstack as a heap-allocated linked list of activation frames and run the code on that. This modifies the linear code to pushing activation frames onto a stack and a bit. >> >> I'll get back here with an example of what that looks like. > >> The acceptability of recursion fully depends on the tree being balanced as this leads to a call stack depth of O(log n).
split and merge have recursive self-calls in non-tail positions, so I doubt that this will be optimized out. > > Yeah that's what I meant. Meant to say "cannot". > >> >> A treap with a good RNG cannot be worse than something like `4*log_2(n)`. >> >> I see two ways forward: >> >> 1. There are iterative ways of creating a Treap, we could use those. It's a bit more work. I'll have to do research. >> >> 2. Reify the callstack as a heap-allocated linked list of activation frames and run the code on that. This modifies the linear code to pushing activation frames onto a stack and a bit. >> >> >> I'll get back here with an example of what that looks like. > > There is an argument to be made for simplicity too. If you think degeneration is super improbable, it may be okay. But we should probably assert at some depth rather than rely on the potentially missing stack guard. I looked around a bit and found a formula for computing the probability of the depth of a treap reaching some particular number `H`. This is assuming a truly random number generation for the priorities. Due to slide 32 in [0]. Both `merge` and `split` have a recursive call depth bounded by the depth of the treap. Let's assume that a call stack depth of 200 is acceptable. Each call takes 16 bytes or so of stack space (just counting inputs here). let's say I have woefully underestimated it and it's actually 256 bytes per call. That is `(256 * 200) / 1024 = 50.0KiB` of stack space required. That is `~10%` of a `512KiB` stack, which is the default on non-AMD64 platforms. 
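As a rough illustration of the recursion-depth argument above, here is a toy treap whose `split` and `merge` count their own nesting and assert against a budget of 200. All names (`ToyTreap`, `DepthGuard`, `stack_budget_kib`) are hypothetical and not taken from the patch; the fixed seed and the node pool are arbitrary choices made to keep the sketch self-contained.

```c++
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <deque>
#include <random>

// Toy treap; names and layout are illustrative, not those of PR 18289.
struct Node {
  int key;
  std::uint64_t prio;
  Node* left = nullptr;
  Node* right = nullptr;
};

// Stack space in KiB for `depth` nested calls of `frame_bytes` each.
constexpr int stack_budget_kib(int frame_bytes, int depth) {
  return frame_bytes * depth / 1024;
}

struct ToyTreap {
  static constexpr int kMaxDepth = 200;  // depth budget discussed above

  Node* root = nullptr;
  std::deque<Node> pool;    // owns the nodes; deque keeps addresses stable
  std::mt19937_64 rng{42};  // fixed seed, an arbitrary choice
  int depth = 0;            // current recursion depth
  int max_depth = 0;        // deepest recursion observed

  // Counts nested calls and asserts that the budget is respected.
  struct DepthGuard {
    ToyTreap& t;
    explicit DepthGuard(ToyTreap& tr) : t(tr) {
      ++t.depth;
      t.max_depth = std::max(t.max_depth, t.depth);
      assert(t.depth <= kMaxDepth && "treap recursion deeper than budget");
    }
    ~DepthGuard() { --t.depth; }
  };

  // Split n into keys < key (lo) and keys >= key (hi).
  void split(Node* n, int key, Node*& lo, Node*& hi) {
    DepthGuard g(*this);
    if (n == nullptr) { lo = nullptr; hi = nullptr; return; }
    if (n->key < key) { lo = n; split(n->right, key, n->right, hi); }
    else              { hi = n; split(n->left, key, lo, n->left); }
  }

  // Merge a and b, where every key in a is smaller than every key in b.
  Node* merge(Node* a, Node* b) {
    DepthGuard g(*this);
    if (a == nullptr) return b;
    if (b == nullptr) return a;
    if (a->prio > b->prio) { a->right = merge(a->right, b); return a; }
    b->left = merge(a, b->left);
    return b;
  }

  void insert(int key) {
    Node* lo = nullptr;
    Node* hi = nullptr;
    split(root, key, lo, hi);
    pool.push_back(Node{key, rng()});
    root = merge(merge(lo, &pool.back()), hi);
  }
};
```

Each recursive call follows one root-to-leaf path, so `max_depth` stays close to the height of the tree; inserting keys in sorted order, which would degenerate a plain binary search tree, is a simple way to exercise it.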
Here's a transcript of an interactive Python session I used to compute the probability of a call stack depth of 200 when we have a billion nodes (`N = 1_000_000_000`):

>>> import math
>>> H = 200
>>> N = 1_000_000_000
>>> math.log(N)
20.72326583694641
>>> H / 21
9.523809523809524
>>> H / 21 / 2
4.761904761904762
>>> c = H / 21 / 2
>>> 2 * c * math.log(N)
197.36443654234677 # Close enough, depth of 197
>>> N * ( (N/math.e) ** (-1 * c * math.log(c/math.e)))
1.3542460158348683e-14

That is a probability of about `1.35 * 10**(-14)`, which is extremely improbable. Unless I messed something up in my calculations or code, I think that we're good. Still, I'll add a `DEBUG_ONLY` call stack depth counter and assert on it. [0] https://cseweb.ucsd.edu//~kube/cls/100/Lectures/lec8.treap/lec8.pdf
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1572252187 From stefank at openjdk.org Fri Apr 19 11:57:02 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 19 Apr 2024 11:57:02 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v13] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: On Fri, 19 Apr 2024 09:49:33 GMT, Afshin Zafari wrote: >> `MEMFLAGS flag` is used to hold/show the type of the memory regions in NMT. Each call of NMT API requires a search through the list of memory regions. >> The Hotspot code reserves/commits/uncommits memory regions and later calls explicitly NMT API with a specific memory type (e.g., `mtGC`, `mtJavaHeap`) for that region. Therefore, there are two search in the list of regions per reserve/commit/uncommit operations, one for the operation and another for setting the type of the region. >> When the memory type is passed in during reserve/commit/uncommit operations, NMT can use it and avoid the extra search for setting the memory type.
>> >> Tests: tiers1-5 passed on linux-x64, macosx-aarch64 and windows-x64 for debug and non-debug builds. > > Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: > > removed extra blank line. Marked as reviewed by stefank (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18745#pullrequestreview-2011231210 From coleenp at openjdk.org Fri Apr 19 12:09:07 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 19 Apr 2024 12:09:07 GMT Subject: RFR: 8330578: The VM creates instance of abstract class VirtualMachineError [v2] In-Reply-To: References: Message-ID: > It's a bug that the VM creates an instance of the abstract class VirtualMachineError. In the cases where we throw VME, we should throw OOM or StackOverflowError instead. > > Tested with tier1-4. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Remove newline ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18847/files - new: https://git.openjdk.org/jdk/pull/18847/files/8bf59f98..92e4f015 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18847&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18847&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18847.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18847/head:pull/18847 PR: https://git.openjdk.org/jdk/pull/18847 From coleenp at openjdk.org Fri Apr 19 12:09:08 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 19 Apr 2024 12:09:08 GMT Subject: RFR: 8330578: The VM creates instance of abstract class VirtualMachineError [v2] In-Reply-To: <8xnrMXls7WBNDuj3VnnzXVaCMXks3uYcDiXWnaIB4Rc=.51c7e158-c3c2-4dfc-8c93-26e171605590@github.com> References: <8xnrMXls7WBNDuj3VnnzXVaCMXks3uYcDiXWnaIB4Rc=.51c7e158-c3c2-4dfc-8c93-26e171605590@github.com> Message-ID: On Fri, 19 Apr 2024 00:06:57 GMT, Ioi Lam wrote: >> Coleen 
Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove newline > > src/hotspot/share/classfile/verifier.cpp line 156: > >> 154: } >> 155: >> 156: > > Inadvertent new line? yes, fixed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18847#discussion_r1572267953 From coleenp at openjdk.org Fri Apr 19 12:09:09 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 19 Apr 2024 12:09:09 GMT Subject: RFR: 8330578: The VM creates instance of abstract class VirtualMachineError [v2] In-Reply-To: References: Message-ID: <_jj49cLIcC2jUBoo3enfQKqY6nOlC1_B7w4UBvbWOKU=.c7bf011b-81d2-4ee0-8214-c6f01022f1e9@github.com> On Fri, 19 Apr 2024 00:33:06 GMT, Dean Long wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove newline > > src/hotspot/share/memory/universe.cpp line 1095: > >> 1093: tty->print_cr("Unable to link/verify VirtualMachineError class"); >> 1094: return false; // initialization failed >> 1095: } > > Do we still need to link VirtualMachineError above? Good question. I don't know if we have to link and initialize it. We need the Klass because there are subclass checks. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18847#discussion_r1572268843 From jsjolen at openjdk.org Fri Apr 19 12:13:33 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Fri, 19 Apr 2024 12:13:33 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v40] In-Reply-To: References: Message-ID: <768GXMbG7wT-OvK7LHhK_2KkH1zTIptdi7mhu4Y3EIw=.ee9164ae-26d8-40da-a991-3cc2070a25a8@github.com> > Hi, > > This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. 
This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix. > > ## `MemoryFileTracker` > > The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: > > ```c++ > static MemoryFile* make_device(const char* descriptive_name); > static void free_device(MemoryFile* device); > > static void allocate_memory(MemoryFile* device, size_t offset, size_t size, > MEMFLAGS flag, const NativeCallStack& stack); > static void free_memory(MemoryFile* device, size_t offset, size_t size); > > > It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: > > ```c++ > void ZNMT::reserve(zaddress_unsafe start, size_t size) { > MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); > } > void ZNMT::commit(zoffset offset, size_t size) { > MemTracker::allocate_memory_in(ZNMT::_device, static_cast(offset), size, mtJavaHeap, CALLER_PC); > } > void ZNMT::uncommit(zoffset offset, size_t size) { > MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); > } > > void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { > // NMT doesn't track mappings at the moment. > } > void ZNMT::unmap(zaddress_unsafe addr, size_t size) { > // NMT doesn't track mappings at the moment. > } > > > As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. 
>
> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are:
>
> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle; we'd like to move away from it. Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this...

Johan Sjölen has updated the pull request incrementally with three additional commits since the last revision:

 - First go at a recur_count + test
 - Style again
 - CHeapAllocator -> TreapCHeapAllocator

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/18289/files
  - new: https://git.openjdk.org/jdk/pull/18289/files/b66ba2ca..6e1b4915

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=39
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=38-39

  Stats: 43 lines in 2 files changed: 28 ins; 0 del; 15 mod
  Patch: https://git.openjdk.org/jdk/pull/18289.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289

PR: https://git.openjdk.org/jdk/pull/18289

From coleenp at openjdk.org Fri Apr 19 12:18:14 2024
From: coleenp at openjdk.org (Coleen Phillimore)
Date: Fri, 19 Apr 2024 12:18:14 GMT
Subject: RFR: 8330578: The VM creates instance of abstract class VirtualMachineError [v3]
In-Reply-To:
References:
Message-ID:

> It's a bug that the VM creates an instance of the abstract class VirtualMachineError. In the cases where we throw VME, we should throw OOM or StackOverflowError instead.
>
> Tested with tier1-4.

Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision:

  We don't need to link and initialize VirtualMachineError class because the lines just below it that link and initialize and create an instance of StackOverflowError will do that, since SOE is a subclass of VME.
------------- Changes: - all: https://git.openjdk.org/jdk/pull/18847/files - new: https://git.openjdk.org/jdk/pull/18847/files/92e4f015..9eaf7956 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18847&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18847&range=01-02 Stats: 9 lines in 1 file changed: 0 ins; 8 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18847.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18847/head:pull/18847 PR: https://git.openjdk.org/jdk/pull/18847 From coleenp at openjdk.org Fri Apr 19 12:18:14 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 19 Apr 2024 12:18:14 GMT Subject: RFR: 8330578: The VM creates instance of abstract class VirtualMachineError [v3] In-Reply-To: <_jj49cLIcC2jUBoo3enfQKqY6nOlC1_B7w4UBvbWOKU=.c7bf011b-81d2-4ee0-8214-c6f01022f1e9@github.com> References: <_jj49cLIcC2jUBoo3enfQKqY6nOlC1_B7w4UBvbWOKU=.c7bf011b-81d2-4ee0-8214-c6f01022f1e9@github.com> Message-ID: On Fri, 19 Apr 2024 12:05:43 GMT, Coleen Phillimore wrote: >> src/hotspot/share/memory/universe.cpp line 1095: >> >>> 1093: tty->print_cr("Unable to link/verify VirtualMachineError class"); >>> 1094: return false; // initialization failed >>> 1095: } >> >> Do we still need to link VirtualMachineError above? > > Good question. I don't know if we have to link and initialize it. We need the Klass because there are subclass checks. I put this in the commit comment. We don't need to explicitly do anything with VME in universe because the code just below this will link and initialize and create an instance of StackOverflowError and that'll link VME as a super class. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18847#discussion_r1572279043 From rkennke at openjdk.org Fri Apr 19 12:31:11 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Fri, 19 Apr 2024 12:31:11 GMT Subject: RFR: 8330585: Refactor/rename forwardee handling Message-ID: In several places in GCs we use is_marked() where we really mean is_forwarded(), and do weird things like decode forwardee directly from a markWord instead of using a proper helper, etc. This change cleans it up. It introduces a bunch of APIs to facilitate that: - oopDesc::forwardee(markWord): This doesn't have to be in oopDesc right now, but I'd like to put it there in preparation of https://bugs.openjdk.org/browse/JDK-8305898, which requires it to be in oopDesc. Also, it's nice as a non-racy companion of oopDesc::forwardee(). - oopDesc::is_forwarded(markWord): It doesn't have to be in oopDesc, either, but I think it's good to have it at the same level of API abstraction as oopDesc::forwardee(markWord). 
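To make the distinction between "marked" and "forwarded" concrete, here is a minimal, self-contained sketch of a by-value mark word with `is_forwarded()` and `forwardee()` helpers. `ToyMarkWord`, its two-bit tag layout and `resolve()` are hypothetical names and assumptions chosen for illustration only; they are not the HotSpot `markWord` or `oopDesc` definitions, and the real bit layout may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified mark word: the two low bits are a tag and the
// tag value 0b11 is assumed to mean "forwarded". Not the HotSpot layout.
class ToyMarkWord {
 public:
  explicit ToyMarkWord(uintptr_t value) : _value(value) {}

  // Build a mark word that points at the new copy of an object.
  static ToyMarkWord encode_forwardee(void* new_location) {
    uintptr_t bits = reinterpret_cast<uintptr_t>(new_location);
    assert((bits & tag_mask) == 0 && "forwardee must be at least 4-byte aligned");
    return ToyMarkWord(bits | forwarded_tag);
  }

  // True if the tag bits say the object has been copied elsewhere.
  bool is_forwarded() const { return (_value & tag_mask) == forwarded_tag; }

  // Strip the tag bits to recover the address of the new copy.
  void* forwardee() const {
    assert(is_forwarded() && "only forwarded mark words carry a forwardee");
    return reinterpret_cast<void*>(_value & ~tag_mask);
  }

 private:
  static constexpr uintptr_t tag_mask = 0x3;
  static constexpr uintptr_t forwarded_tag = 0x3;
  uintptr_t _value;
};

// Non-racy pattern: the caller loads the mark once and both questions are
// answered from that single snapshot instead of re-reading the header.
inline void* resolve(void* obj, ToyMarkWord mark) {
  return mark.is_forwarded() ? mark.forwardee() : obj;
}
```

The point of passing the mark word by value is that `is_forwarded` and `forwardee` then agree with each other even if another thread updates the object header in between.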
Testing:
 - [x] hotspot_gc
 - [x] tier1

-------------

Commit messages:
 - 8330585: Refactor/rename forwardee handling

Changes: https://git.openjdk.org/jdk/pull/18863/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18863&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8330585
  Stats: 24 lines in 7 files changed: 12 ins; 0 del; 12 mod
  Patch: https://git.openjdk.org/jdk/pull/18863.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18863/head:pull/18863

PR: https://git.openjdk.org/jdk/pull/18863

From dnsimon at openjdk.org Fri Apr 19 12:42:57 2024
From: dnsimon at openjdk.org (Doug Simon)
Date: Fri, 19 Apr 2024 12:42:57 GMT
Subject: RFR: 8330578: The VM creates instance of abstract class VirtualMachineError [v3]
In-Reply-To:
References:
Message-ID: <9YgiSsJeMWSEamHCz2NIKviVBytSX71wsvw23R415dQ=.d4739a4c-281e-44af-9904-d84aecd21498@github.com>

On Fri, 19 Apr 2024 12:18:14 GMT, Coleen Phillimore wrote:

>> It's a bug that the VM creates an instance of the abstract class VirtualMachineError. In the cases where we throw VME, we should throw OOM or StackOverflowError instead.
>>
>> Tested with tier1-4.
>
> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision:
>
>   We don't need to link and initialize VirtualMachineError class because the lines just below it that link and initialize and create an instance of StackOverflowError will do that, since SOE is a subclass of VME.

src/hotspot/share/classfile/verifier.cpp line 257:

> 255:       // or one of it's superclasses, we're in trouble and are going
> 256:       // to infinitely recurse when we try to initialize the exception.
> 257:       // So bail out here by throwing the preallocated VM error.

The comment looks wrong now as I think `THROW_MSG_` creates a new object.
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18847#discussion_r1572301454

From jsjolen at openjdk.org Fri Apr 19 12:56:31 2024
From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=)
Date: Fri, 19 Apr 2024 12:56:31 GMT
Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v41]
In-Reply-To:
References:
Message-ID:

> Hi,
>
> This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocate memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix.
>
> ## `MemoryFileTracker`
>
> The `MemoryFileTracker` adds the ability to add new virtual memory address spaces to NMT and to commit memory to them. The basic API is:
>
> ```c++
> static MemoryFile* make_device(const char* descriptive_name);
> static void free_device(MemoryFile* device);
>
> static void allocate_memory(MemoryFile* device, size_t offset, size_t size,
>                             MEMFLAGS flag, const NativeCallStack& stack);
> static void free_memory(MemoryFile* device, size_t offset, size_t size);
> ```
>
> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does:
>
> ```c++
> void ZNMT::reserve(zaddress_unsafe start, size_t size) {
>   MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap);
> }
> void ZNMT::commit(zoffset offset, size_t size) {
>   MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC);
> }
> void ZNMT::uncommit(zoffset offset, size_t size) {
>   MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size);
> }
>
> void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) {
>   // NMT doesn't track mappings at the moment.
> }
> void ZNMT::unmap(zaddress_unsafe addr, size_t size) {
>   // NMT doesn't track mappings at the moment.
> }
> ```
>
> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory; the device-allocated memory is reported separately. When performing summary reporting, any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory.
>
> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are:
>
> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle; we'd like to move away from it. Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this...

Johan Sjölen has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR.
The pull request contains one new commit since the last revision:

  First go at a recur_count + test

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/18289/files
  - new: https://git.openjdk.org/jdk/pull/18289/files/6e1b4915..4b51328c

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=40
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=39-40

  Stats: 19 lines in 2 files changed: 1 ins; 18 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/18289.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289

PR: https://git.openjdk.org/jdk/pull/18289

From jsjolen at openjdk.org Fri Apr 19 12:56:31 2024
From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=)
Date: Fri, 19 Apr 2024 12:56:31 GMT
Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v34]
In-Reply-To:
References: <4bHEEa5QHK6rN2pH6ftWYE---OlmvQNQ_FjJsfweYjI=.ad832040-2d3c-4847-ba1f-33b9d8cf8c9f@github.com>
Message-ID: <7axrnSHgHtbJTSsXYHVfnvXtPZjMJzXntqdFz-fRkV0=.b9aa28b7-cc92-4ad3-93c5-c5c666594f31@github.com>

On Fri, 19 Apr 2024 11:49:22 GMT, Johan Sjölen wrote:

>>> The acceptability of recursion fully depends on the tree being balanced as this leads to a call stack depth of O(log n). split and merge have recursive self-calls in non-tail positions, so I doubt that this will be optimized out.
>>
>> Yeah that's what I meant. Meant to say "cannot".
>>
>>>
>>> A treap with a good RNG cannot be worse than something like `4*log_2(n)`.
>>>
>>> I see two ways forward:
>>>
>>> 1. There are iterative ways of creating a Treap, we could use those. It's a bit more work. I'll have to do research.
>>>
>>> 2. Reify the callstack as a heap-allocated linked list of activation frames and run the code on that. This modifies the linear code to pushing activation frames onto a stack and a bit.
>>>
>>>
>>> I'll get back here with an example of what that looks like.
>>
>> There is an argument to be made for simplicity too.
If you think degeneration is super improbable, it may be okay. But we should probably assert at some depth rather than rely on the potentially missing stack guard.
>
> I looked around a bit and found a formula for computing the probability of the depth of a treap reaching some particular number `H`. This assumes truly random number generation for the priorities. The formula is from slide 32 in [0]. Both `merge` and `split` have a recursive call depth bounded by the depth of the treap.
>
> Let's assume that a call stack depth of 200 is acceptable. Each call takes 16 bytes or so of stack space (just counting inputs here). Let's say I have woefully underestimated it and it's actually 256 bytes per call. That is `(256 * 200) / 1024 = 50.0KiB` of stack space required. That is `~10%` of a `512KiB` stack, which is the default on non-AMD64 platforms.
>
> Here's a transcript of an interactive Python session I used to compute the probability of a call stack depth of 200 when we have a billion nodes (`N = 1_000_000_000`):
>
> >>> H = 200
> >>> N = 1_000_000_000
> >>> math.log(N)
> 20.72326583694641
> >>> H / 21
> 9.523809523809524
> >>> H / 21 / 2
> 4.761904761904762
> >>> c = H / 21 / 2
> >>> 2 * c * math.log(N)
> 197.36443654234677 # Close enough, depth of 197
> >>> N * ( (N/math.e) ** (-1 * c * math.log(c/math.e)))
> 1.3542460158348683e-14
> >>>
>
> A probability of `1.35 * 10**(-14)` is extremely improbable. Unless I messed something up in my calculations or code, I think that we're good. Still, I'll add a `DEBUG_ONLY` call stack depth counter and assert on it.
>
> [0] https://cseweb.ucsd.edu//~kube/cls/100/Lectures/lec8.treap/lec8.pdf

And as an empirical data point, the maximum stack depth of `merge` was 23 when upserting 1,000,000 elements.
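
For anyone who would rather check the arithmetic in something compilable, the quoted Python session can be re-run as a small C++ sketch. The function names below are made up for illustration, and the probability expression is copied from the transcript as-is rather than re-derived from the slides, so treat it as an assumption carried over from there.

```cpp
#include <cassert>
#include <cmath>

// Depth 2 * c * ln(n) whose probability of being exceeded is estimated below.
inline double treap_depth_for(double n, double c) {
  assert(n > 1.0 && c > 0.0);
  return 2.0 * c * std::log(n);
}

// Estimate as written in the transcript: n * (n / e)^(-c * ln(c / e)).
inline double treap_depth_probability(double n, double c) {
  assert(n > 1.0 && c > 0.0);
  const double e = std::exp(1.0);
  return n * std::pow(n / e, -c * std::log(c / e));
}

// Stack space in KiB for `depth` nested calls of `bytes_per_call` bytes each.
inline double stack_kib(int depth, int bytes_per_call) {
  return (static_cast<double>(depth) * bytes_per_call) / 1024.0;
}
```

Calling these with `n = 1e9` and `c = 200.0 / 21 / 2` should give a depth of roughly 197 and a probability of about `1.35e-14`, matching the transcript, and `stack_kib(200, 256)` gives the 50 KiB figure.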
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1572321871 From coleenp at openjdk.org Fri Apr 19 13:09:31 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 19 Apr 2024 13:09:31 GMT Subject: RFR: 8330578: The VM creates instance of abstract class VirtualMachineError [v4] In-Reply-To: References: Message-ID: > It's a bug that the VM creates an instance of the abstract class VirtualMachineError. In the cases where we throw VME, we should throw OOM or StackOverflowError instead. > > Tested with tier1-4. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Throw preallocated SOE object. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18847/files - new: https://git.openjdk.org/jdk/pull/18847/files/9eaf7956..cb61e8a5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18847&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18847&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18847.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18847/head:pull/18847 PR: https://git.openjdk.org/jdk/pull/18847 From coleenp at openjdk.org Fri Apr 19 13:09:31 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 19 Apr 2024 13:09:31 GMT Subject: RFR: 8330578: The VM creates instance of abstract class VirtualMachineError [v3] In-Reply-To: <9YgiSsJeMWSEamHCz2NIKviVBytSX71wsvw23R415dQ=.d4739a4c-281e-44af-9904-d84aecd21498@github.com> References: <9YgiSsJeMWSEamHCz2NIKviVBytSX71wsvw23R415dQ=.d4739a4c-281e-44af-9904-d84aecd21498@github.com> Message-ID: <9rj3rf1DuqWiReR4fJYR5SpbGsgAx43ehQZvrSyWMIo=.0e0d1ee5-8982-4c7a-aca2-7b70ad9180ef@github.com> On Fri, 19 Apr 2024 12:35:43 GMT, Doug Simon wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> We don't need to link and initialize VirtualMachineError class 
because the lines just below it that link and initialize and create an instance of StackOverflowError will do that, since SOE is a subclass of VME.
>
> src/hotspot/share/classfile/verifier.cpp line 257:
>
>> 255:       // or one of it's superclasses, we're in trouble and are going
>> 256:       // to infinitely recurse when we try to initialize the exception.
>> 257:       // So bail out here by throwing the preallocated VM error.
>
> The comment looks wrong now as I think `THROW_MSG_` creates a new object.

You're right - that's a good observation. This code is hard to reach (tried to write a test case yesterday). It might need a class file load hook for the java/lang/Error class, in which case, the whole VM is probably going to be useless. But I will maintain throwing a preallocated instance.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18847#discussion_r1572341041

From stefank at openjdk.org Fri Apr 19 13:27:57 2024
From: stefank at openjdk.org (Stefan Karlsson)
Date: Fri, 19 Apr 2024 13:27:57 GMT
Subject: RFR: 8330585: Refactor/rename forwardee handling
In-Reply-To:
References:
Message-ID: <_NuTTWRldR3kX0d4umnFQCfRwZncFhUDSK-AbY4q1P4=.a31c76bb-2658-4d50-b310-82edcdb2deb6@github.com>

On Fri, 19 Apr 2024 12:25:58 GMT, Roman Kennke wrote:

> In several places in GCs we use is_marked() where we really mean is_forwarded(), and do weird things like decode forwardee directly from a markWord instead of using a proper helper, etc.
>
> This change cleans it up. It introduces a bunch of APIs to facilitate that:
> - oopDesc::forwardee(markWord): This doesn't have to be in oopDesc right now, but I'd like to put it there in preparation of https://bugs.openjdk.org/browse/JDK-8305898, which requires it to be in oopDesc. Also, it's nice as a non-racy companion of oopDesc::forwardee().
> - oopDesc::is_forwarded(markWord): It doesn't have to be in oopDesc, either, but I think it's good to have it at the same level of API abstraction as oopDesc::forwardee(markWord).
> > Testing: > - [x] hotspot_gc > - [x] tier1 I agree that we should update the code to use `is_forwarded` and `forwardee`. It makes the code nicer to read. I would like to defer adding the oopDesc member functions in this patch, and bring that in when we integrate the lilliput/self-forwarding changes. ------------- Changes requested by stefank (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18863#pullrequestreview-2011425934 From jbhateja at openjdk.org Fri Apr 19 13:38:01 2024 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Fri, 19 Apr 2024 13:38:01 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v21] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: On Tue, 16 Apr 2024 00:04:15 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. >> >> Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). >> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Add enter() and leave(); remove Windows-specific register stuff Hi @asgibbons Please add a new test / extend an existing test for SIGBUS violation testing test/hotspot/jtreg/runtime/Unsafe/InternalErrorTest.java src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2611: > 2609: // Propagate byte to full Register > 2610: __ movzbl(rScratch1, byteVal); > 2611: __ mov64(wide_value, 0x0101010101010101); Long constant should be suffixed by ULL. 
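
As a side note on the `ULL` remark, here is a minimal sketch of the byte-propagation idea in plain C++. It assumes the stub goes on to multiply the zero-extended byte by the `0x0101...01` constant, which the quoted lines do not show; `kByteInEveryLane` and `broadcast_byte` are hypothetical names used only for this illustration.

```cpp
#include <cstdint>

// One bit set at the bottom of each of the eight byte lanes. The ULL suffix
// states the intended 64-bit unsigned type explicitly instead of relying on
// the compiler's choice of type for an unsuffixed literal.
constexpr uint64_t kByteInEveryLane = 0x0101010101010101ULL;

// Replicate a single byte into every byte of a 64-bit word.
constexpr uint64_t broadcast_byte(uint8_t byte_val) {
  return static_cast<uint64_t>(byte_val) * kByteInEveryLane;
}

static_assert(broadcast_byte(0xAB) == 0xABABABABABABABABULL,
              "every byte lane holds the value");
```

Multiplying by this constant cannot carry between lanes because each partial product is at most `0xFF` and lands in its own byte.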
test/micro/org/openjdk/bench/java/lang/foreign/MemorySegmentZeroUnsafe.java line 1: > 1: package org.openjdk.bench.java.lang.foreign; Copyright header missing. ------------- PR Review: https://git.openjdk.org/jdk/pull/18555#pullrequestreview-2011247585 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572370327 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572267154 From rkennke at openjdk.org Fri Apr 19 13:55:08 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Fri, 19 Apr 2024 13:55:08 GMT Subject: RFR: 8330585: Refactor/rename forwardee handling [v2] In-Reply-To: References: Message-ID: > In several places in GCs we use is_marked() where we really mean is_forwarded(), and do weird things like decode forwardee directly from a markWord instead of using a proper helper, etc. > > This change cleans it up. It introduces a bunch of APIs to facilitate that: > - oopDesc::forwardee(markWord): This doesn't have to be in oopDesc right now, but I'd like to put it there in preparation of https://bugs.openjdk.org/browse/JDK-8305898, which requires it to be in oopDesc. Also, it's nice as a non-racy companion of oopDesc::forwardee(). > - oopDesc::is_forwarded(markWord): It doesn't have to be in oopDesc, either, but I think it's good to have it at the same level of API abstraction as oopDesc::forwardee(markWord). 
> > Testing: > - [x] hotspot_gc > - [x] tier1 Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Don't add API in oopDesc ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18863/files - new: https://git.openjdk.org/jdk/pull/18863/files/22b1e990..7223845f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18863&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18863&range=00-01 Stats: 26 lines in 6 files changed: 4 ins; 12 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/18863.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18863/head:pull/18863 PR: https://git.openjdk.org/jdk/pull/18863 From amitkumar at openjdk.org Fri Apr 19 14:02:21 2024 From: amitkumar at openjdk.org (Amit Kumar) Date: Fri, 19 Apr 2024 14:02:21 GMT Subject: RFR: 8330008: [s390x] Test bit "in-memory" in case of DiagnoseSyncOnValueBasedClasses [v2] In-Reply-To: References: Message-ID: <0kp8UR95O3_KwxqbD7vYBIx0jh38fwb8erA6Mb12wp0=.ff52bc8f-25c5-4c93-9978-a31fdd8acb88@github.com> > It's trivial update to use `testbit` method to test the bit "in-memory" Amit Kumar has updated the pull request incrementally with one additional commit since the last revision: updates comment in sharedRuntime_s390.cpp file ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18709/files - new: https://git.openjdk.org/jdk/pull/18709/files/127ea97d..454b08a6 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18709&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18709&range=00-01 Stats: 4 lines in 1 file changed: 0 ins; 2 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/18709.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18709/head:pull/18709 PR: https://git.openjdk.org/jdk/pull/18709 From amitkumar at openjdk.org Fri Apr 19 14:02:21 2024 From: amitkumar at openjdk.org (Amit Kumar) Date: Fri, 19 Apr 2024 14:02:21 GMT Subject: RFR: 8330008: [s390x] Test bit "in-memory" in case of 
DiagnoseSyncOnValueBasedClasses In-Reply-To: References: Message-ID: On Wed, 10 Apr 2024 09:58:55 GMT, Amit Kumar wrote: > It's trivial update to use `testbit` method to test the bit "in-memory" @TheRealMDoerr would you please provide a second review here. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18709#issuecomment-2066649447 From stefank at openjdk.org Fri Apr 19 14:06:56 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 19 Apr 2024 14:06:56 GMT Subject: RFR: 8330585: Refactor/rename forwardee handling [v2] In-Reply-To: References: Message-ID: <553xdfP3susAEkIr9NvEoUPGbg2zoOLhn1cdcg2JCWE=.25fcd38e-ee24-4565-8ff3-7c4363f6e747@github.com> On Fri, 19 Apr 2024 13:55:08 GMT, Roman Kennke wrote: >> In several places in GCs we use is_marked() where we really mean is_forwarded(), and do weird things like decode forwardee directly from a markWord instead of using a proper helper, etc. >> >> This change cleans it up. It introduces a bunch of APIs to facilitate that: >> - oopDesc::forwardee(markWord): This doesn't have to be in oopDesc right now, but I'd like to put it there in preparation of https://bugs.openjdk.org/browse/JDK-8305898, which requires it to be in oopDesc. Also, it's nice as a non-racy companion of oopDesc::forwardee(). >> - oopDesc::is_forwarded(markWord): It doesn't have to be in oopDesc, either, but I think it's good to have it at the same level of API abstraction as oopDesc::forwardee(markWord). >> >> Testing: >> - [x] hotspot_gc >> - [x] tier1 > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Don't add API in oopDesc Looks good! ------------- Marked as reviewed by stefank (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18863#pullrequestreview-2011524775 From ayang at openjdk.org Fri Apr 19 14:06:57 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 19 Apr 2024 14:06:57 GMT Subject: RFR: 8330585: Refactor/rename forwardee handling [v2] In-Reply-To: References: Message-ID: On Fri, 19 Apr 2024 13:55:08 GMT, Roman Kennke wrote: >> In several places in GCs we use is_marked() where we really mean is_forwarded(), and do weird things like decode forwardee directly from a markWord instead of using a proper helper, etc. >> >> This change cleans it up. It introduces a bunch of APIs to facilitate that: >> - oopDesc::forwardee(markWord): This doesn't have to be in oopDesc right now, but I'd like to put it there in preparation of https://bugs.openjdk.org/browse/JDK-8305898, which requires it to be in oopDesc. Also, it's nice as a non-racy companion of oopDesc::forwardee(). >> - oopDesc::is_forwarded(markWord): It doesn't have to be in oopDesc, either, but I think it's good to have it at the same level of API abstraction as oopDesc::forwardee(markWord). >> >> Testing: >> - [x] hotspot_gc >> - [x] tier1 > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Don't add API in oopDesc Marked as reviewed by ayang (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/18863#pullrequestreview-2011532534 From mdoerr at openjdk.org Fri Apr 19 14:23:00 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 19 Apr 2024 14:23:00 GMT Subject: RFR: 8330008: [s390x] Test bit "in-memory" in case of DiagnoseSyncOnValueBasedClasses [v2] In-Reply-To: <0kp8UR95O3_KwxqbD7vYBIx0jh38fwb8erA6Mb12wp0=.ff52bc8f-25c5-4c93-9978-a31fdd8acb88@github.com> References: <0kp8UR95O3_KwxqbD7vYBIx0jh38fwb8erA6Mb12wp0=.ff52bc8f-25c5-4c93-9978-a31fdd8acb88@github.com> Message-ID: On Fri, 19 Apr 2024 14:02:21 GMT, Amit Kumar wrote: >> It's trivial update to use `testbit` method to test the bit "in-memory" > > Amit Kumar has updated the pull request incrementally with one additional commit since the last revision: > > updates comment in sharedRuntime_s390.cpp file Marked as reviewed by mdoerr (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18709#pullrequestreview-2011567955 From sgibbons at openjdk.org Fri Apr 19 14:58:05 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Fri, 19 Apr 2024 14:58:05 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v21] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: On Fri, 19 Apr 2024 13:25:33 GMT, Jatin Bhateja wrote: >> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: >> >> Add enter() and leave(); remove Windows-specific register stuff > > src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2611: > >> 2609: // Propagate byte to full Register >> 2610: __ movzbl(rScratch1, byteVal); >> 2611: __ mov64(wide_value, 0x0101010101010101); > > Long constant should be suffixed by ULL. Fixed. > test/micro/org/openjdk/bench/java/lang/foreign/MemorySegmentZeroUnsafe.java line 1: > >> 1: package org.openjdk.bench.java.lang.foreign; > > Copyright header missing. 
Added ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572502532 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572502204 From heidinga at openjdk.org Fri Apr 19 15:06:04 2024 From: heidinga at openjdk.org (Dan Heidinga) Date: Fri, 19 Apr 2024 15:06:04 GMT Subject: Integrated: 8320522: Remove code related to `RegisterFinalizersAtInit` In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 18:42:29 GMT, Dan Heidinga wrote: > Remove the code related to -XX:[+-]RegisterFinalizersAtInit in JDK23. > > `make test-tier1` passed with this change This pull request has now been integrated. Changeset: 3c1d1d93 Author: Dan Heidinga Committer: Coleen Phillimore URL: https://git.openjdk.org/jdk/commit/3c1d1d93d7b1de229753ed697f008bd5639ac957 Stats: 30 lines in 13 files changed: 0 ins; 17 del; 13 mod 8320522: Remove code related to `RegisterFinalizersAtInit` Reviewed-by: coleenp, ayang, kbarrett ------------- PR: https://git.openjdk.org/jdk/pull/18823 From mdoerr at openjdk.org Fri Apr 19 15:08:57 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 19 Apr 2024 15:08:57 GMT Subject: RFR: 8330008: [s390x] Test bit "in-memory" in case of DiagnoseSyncOnValueBasedClasses In-Reply-To: References: Message-ID: On Thu, 18 Apr 2024 06:08:42 GMT, Amit Kumar wrote: > I have done testing with `DiagnoseSyncOnValueBasedClasses=1`. 
>
> I got the following test failure, but the test also fails on the master branch when the same argument is used:
>
> ```
> STDOUT:
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  Internal Error (/home/amit/mr/jdk/src/hotspot/share/runtime/synchronizer.cpp:485), pid=3628970, tid=3628992
> #  fatal error: Synchronizing on object 0x00000000ffe583c8 of klass java.lang.Integer at SplitIfSharedFastLockBehindCastPP.test2(SplitIfSharedFastLockBehindCastPP.java:92)
> #
> # JRE version: OpenJDK Runtime Environment (23.0) (fastdebug build 23-internal-adhoc.amit.jdk)
> # Java VM: OpenJDK 64-Bit Server VM (fastdebug 23-internal-adhoc.amit.jdk, mixed mode, sharing, compressed oops, compressed class ptrs, g1 gc, linux-s390x)
> # Problematic frame:
> # V  [libjvm.so+0x12b4b8c]  ObjectSynchronizer::handle_sync_on_value_based_class(Handle, JavaThread*)+0x944
> #
> # Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E" (or dumping to /home/amit/mr/jdk/build/linux-s390x-server-fastdebug/test-support/jtreg_test_hotspot_jtreg_compiler_loopopts_SplitIfSharedFastLockBehindCastPP_java/scratch/0/core.3628970)
> #
> # An error report file with more information is saved as:
> # /home/amit/mr/jdk/build/linux-s390x-server-fastdebug/test-support/jtreg_test_hotspot_jtreg_compiler_loopopts_SplitIfSharedFastLockBehindCastPP_java/scratch/0/hs_err_pid3628970.log
> #
> # If you would like to submit a bug report, please visit:
> #   https://bugreport.java.com/bugreport/crash.jsp
> #
> ```

That should get investigated in a new JBS issue.
------------- PR Comment: https://git.openjdk.org/jdk/pull/18709#issuecomment-2066772595 From amitkumar at openjdk.org Fri Apr 19 15:22:58 2024 From: amitkumar at openjdk.org (Amit Kumar) Date: Fri, 19 Apr 2024 15:22:58 GMT Subject: RFR: 8330008: [s390x] Test bit "in-memory" in case of DiagnoseSyncOnValueBasedClasses [v2] In-Reply-To: <0kp8UR95O3_KwxqbD7vYBIx0jh38fwb8erA6Mb12wp0=.ff52bc8f-25c5-4c93-9978-a31fdd8acb88@github.com> References: <0kp8UR95O3_KwxqbD7vYBIx0jh38fwb8erA6Mb12wp0=.ff52bc8f-25c5-4c93-9978-a31fdd8acb88@github.com> Message-ID: On Fri, 19 Apr 2024 14:02:21 GMT, Amit Kumar wrote: >> It's trivial update to use `testbit` method to test the bit "in-memory" > > Amit Kumar has updated the pull request incrementally with one additional commit since the last revision: > > updates comment in sharedRuntime_s390.cpp file I have opened issue here: https://bugs.openjdk.org/browse/JDK-8330696 Should I integrate this one ? ------------- PR Comment: https://git.openjdk.org/jdk/pull/18709#issuecomment-2066797677 From coleenp at openjdk.org Fri Apr 19 15:30:27 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 19 Apr 2024 15:30:27 GMT Subject: RFR: 8330578: The VM creates instance of abstract class VirtualMachineError [v5] In-Reply-To: References: Message-ID: <02nN9BVn9G0PSYzT9hfYOcFpKbsOskhalVBC1t8xGFw=.31ad8d16-efcd-426c-a67c-97d2c261e4e3@github.com> > It's a bug that the VM creates an instance of the abstract class VirtualMachineError. In the cases where we throw VME, we should throw OOM or StackOverflowError instead. > > Tested with tier1-4. Coleen Phillimore has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: - Merge branch 'master' into vme - Throw preallocated SOE object. 
- We don't need to link and initialize VirtualMachineError class because the lines just below it that link and initialize and create an instance of StackOverflowError will do that, since SOE is a subclass of VME. - Remove newline - 8330578: The VM creates instance of abstract class VirtualMachineError ------------- Changes: https://git.openjdk.org/jdk/pull/18847/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18847&range=04 Stats: 29 lines in 9 files changed: 1 ins; 17 del; 11 mod Patch: https://git.openjdk.org/jdk/pull/18847.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18847/head:pull/18847 PR: https://git.openjdk.org/jdk/pull/18847 From lucy at openjdk.org Fri Apr 19 16:01:57 2024 From: lucy at openjdk.org (Lutz Schmidt) Date: Fri, 19 Apr 2024 16:01:57 GMT Subject: RFR: 8330008: [s390x] Test bit "in-memory" in case of DiagnoseSyncOnValueBasedClasses [v2] In-Reply-To: References: <0kp8UR95O3_KwxqbD7vYBIx0jh38fwb8erA6Mb12wp0=.ff52bc8f-25c5-4c93-9978-a31fdd8acb88@github.com> Message-ID: <7bO1cmmnd4Quy1Yv1bGo_WplFHzL0qEaFXGH_Ha_22w=.b3ce2cc7-e1c5-43b0-9d26-206215a1268e@github.com> On Fri, 19 Apr 2024 15:20:21 GMT, Amit Kumar wrote: > Should I integrate this one ? Go! The observed crash is obviously unrelated to your change. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18709#issuecomment-2066863405 From iveresov at openjdk.org Fri Apr 19 16:06:02 2024 From: iveresov at openjdk.org (Igor Veresov) Date: Fri, 19 Apr 2024 16:06:02 GMT Subject: RFR: 8329433: Reduce nmethod header size [v8] In-Reply-To: References: Message-ID: On Thu, 18 Apr 2024 00:41:03 GMT, Vladimir Kozlov wrote: >> This is part of changes which try to reduce size of `nmethod` and `codeblob` data vs code in CodeCache. >> These changes reduced size of `nmethod` header from 288 to 232 bytes.
From 304 to 248 in optimized VM: >> >> Statistics for 1282 bytecoded nmethods for C2: >> total in heap = 5560352 (100%) >> header = 389728 (7.009053%) >> >> vs >> >> Statistics for 1322 bytecoded nmethods for C2: >> total in heap = 8307120 (100%) >> header = 327856 (3.946687%) >> >> >> Several unneeded fields in `nmethod` and `CodeBlob` were removed. Some fields were changed from `int` to `int16_t` with added corresponding asserts to make sure their values are fit into 16 bits. >> >> I did additional cleanup after recent `CompiledMethod` removal. >> >> Tested tier1-7,stress,xcomp and performance testing. > > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Address comment Looks good. ------------- Marked as reviewed by iveresov (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18768#pullrequestreview-2011831284 From amitkumar at openjdk.org Fri Apr 19 16:10:01 2024 From: amitkumar at openjdk.org (Amit Kumar) Date: Fri, 19 Apr 2024 16:10:01 GMT Subject: Integrated: 8330008: [s390x] Test bit "in-memory" in case of DiagnoseSyncOnValueBasedClasses In-Reply-To: References: Message-ID: <0Y7huKabKd-g8El0pp1ftwXpA_3J85seQ79MVJyf15o=.1c41975f-3395-430e-9fe0-3f8ae868d756@github.com> On Wed, 10 Apr 2024 09:58:55 GMT, Amit Kumar wrote: > It's trivial update to use `testbit` method to test the bit "in-memory" This pull request has now been integrated. 
Changeset: 8da175d0 Author: Amit Kumar URL: https://git.openjdk.org/jdk/commit/8da175d094c02e7655188a60e6364104433429de Stats: 8 lines in 2 files changed: 0 ins; 4 del; 4 mod 8330008: [s390x] Test bit "in-memory" in case of DiagnoseSyncOnValueBasedClasses Reviewed-by: lucy, mdoerr ------------- PR: https://git.openjdk.org/jdk/pull/18709 From kvn at openjdk.org Fri Apr 19 16:12:07 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 19 Apr 2024 16:12:07 GMT Subject: RFR: 8329433: Reduce nmethod header size [v8] In-Reply-To: <1HEEQR0o6XO22qTFgrqriq89NEmYyxw3khht6PWlu8U=.245c3378-cdcd-4085-a307-aab5186e6d6d@github.com> References: <1HEEQR0o6XO22qTFgrqriq89NEmYyxw3khht6PWlu8U=.245c3378-cdcd-4085-a307-aab5186e6d6d@github.com> Message-ID: On Thu, 18 Apr 2024 01:14:34 GMT, Dean Long wrote: >> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Address comment > > Marked as reviewed by dlong (Reviewer). Thank you @dean-long, @coleenp, @JohnTortugo and @veresov for reviews. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18768#issuecomment-2066880731 From kvn at openjdk.org Fri Apr 19 16:15:04 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 19 Apr 2024 16:15:04 GMT Subject: Integrated: 8329433: Reduce nmethod header size In-Reply-To: References: Message-ID: On Fri, 12 Apr 2024 22:43:15 GMT, Vladimir Kozlov wrote: > This is part of changes which try to reduce size of `nmethod` and `codeblob` data vs code in CodeCache. > These changes reduced size of `nmethod` header from 288 to 232 bytes. From 304 to 248 in optimized VM: > > Statistics for 1282 bytecoded nmethods for C2: > total in heap = 5560352 (100%) > header = 389728 (7.009053%) > > vs > > Statistics for 1322 bytecoded nmethods for C2: > total in heap = 8307120 (100%) > header = 327856 (3.946687%) > > > Several unneeded fields in `nmethod` and `CodeBlob` were removed. 
Some fields were changed from `int` to `int16_t` with added corresponding asserts to make sure their values are fit into 16 bits. > > I did additional cleanup after recent `CompiledMethod` removal. > > Tested tier1-7,stress,xcomp and performance testing. This pull request has now been integrated. Changeset: b704e912 Author: Vladimir Kozlov URL: https://git.openjdk.org/jdk/commit/b704e91241b0f84d866f50a8f2c6af240087cb29 Stats: 523 lines in 15 files changed: 135 ins; 186 del; 202 mod 8329433: Reduce nmethod header size Reviewed-by: dlong, iveresov ------------- PR: https://git.openjdk.org/jdk/pull/18768 From sgibbons at openjdk.org Fri Apr 19 16:25:28 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Fri, 19 Apr 2024 16:25:28 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v22] In-Reply-To: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: > This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. > > Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. > > Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). 
> > [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Address review comments; update copyright years ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18555/files - new: https://git.openjdk.org/jdk/pull/18555/files/7a1d67e5..dccf6b6c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=21 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=20-21 Stats: 37 lines in 13 files changed: 23 ins; 0 del; 14 mod Patch: https://git.openjdk.org/jdk/pull/18555.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18555/head:pull/18555 PR: https://git.openjdk.org/jdk/pull/18555 From aph at openjdk.org Fri Apr 19 17:28:58 2024 From: aph at openjdk.org (Andrew Haley) Date: Fri, 19 Apr 2024 17:28:58 GMT Subject: RFR: 8322770: Implement C2 VectorizedHashCode on AArch64 In-Reply-To: <2VKOC-rT0vOyMcXUX2gs3sOrbZ5H79KBIo50sOOVmyI=.1936f78e-794c-4f54-af3c-b1b97e5fafa8@github.com> References: <2VKOC-rT0vOyMcXUX2gs3sOrbZ5H79KBIo50sOOVmyI=.1936f78e-794c-4f54-af3c-b1b97e5fafa8@github.com> Message-ID: <1-OXa3gSx4prs42UACREP9kbGb0Q_N2_2nvcg_cCCUw=.1834333c-e378-46e8-97d1-ec9a108a64af@github.com> On Tue, 26 Mar 2024 13:59:12 GMT, Mikhail Ablakatov wrote: > Hello, > > Please review the following PR for [JDK-8322770 Implement C2 VectorizedHashCode on AArch64](https://bugs.openjdk.org/browse/JDK-8322770). It follows previous work done in https://github.com/openjdk/jdk/pull/16629 and https://github.com/openjdk/jdk/pull/10847 for RISC-V and x86 respectively. > > The code to calculate a hash code consists of two parts: a vectorized loop of Neon instruction that process 4 or 8 elements per iteration depending on the data type and a fully unrolled scalar "loop" that processes up to 7 tail elements. 
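A minimal scalar sketch of the "blocked loop plus tail" structure described in the quoted PR text above. This is not the Neon code from the PR: the function names `hash_scalar` and `hash_blocked` are hypothetical, the block width is fixed at 4 (so the tail here is at most 3 elements, not 7), and the initial value 1 with multiplier 31 is the `java.util.Arrays.hashCode` convention.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Reference form: the plain recurrence h = 31*h + a[i].
// Unsigned arithmetic keeps the 32-bit wraparound well defined in C++.
inline uint32_t hash_scalar(const int32_t* a, size_t n, uint32_t h = 1) {
  assert(a != nullptr || n == 0);
  for (size_t i = 0; i < n; i++) {
    h = 31u * h + static_cast<uint32_t>(a[i]);
  }
  return h;
}

// Blocked form: four steps of the recurrence are folded into one update,
//   h' = 31^4*h + 31^3*a[i] + 31^2*a[i+1] + 31*a[i+2] + a[i+3],
// which is the shape a SIMD loop can evaluate in parallel lanes.
// Elements left over after the last full block are handled one at a time.
inline uint32_t hash_blocked(const int32_t* a, size_t n, uint32_t h = 1) {
  assert(a != nullptr || n == 0);
  const uint32_t p1 = 31u;
  const uint32_t p2 = p1 * p1;
  const uint32_t p3 = p2 * p1;
  const uint32_t p4 = p3 * p1;
  size_t i = 0;
  for (; i + 4 <= n; i += 4) {
    h = p4 * h
      + p3 * static_cast<uint32_t>(a[i])
      + p2 * static_cast<uint32_t>(a[i + 1])
      + p1 * static_cast<uint32_t>(a[i + 2])
      +      static_cast<uint32_t>(a[i + 3]);
  }
  for (; i < n; i++) {  // scalar tail
    h = 31u * h + static_cast<uint32_t>(a[i]);
  }
  return h;
}
```

Both functions should agree for every length, since the blocked update is just the recurrence expanded four times; the only thing that changes is how much independent work is exposed per iteration.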
>
> At the time of writing this I don't see potential benefits from providing SVE/SVE2 implementation, but it could be added as a follow-up or independently later if required.
>
> # Performance
>
> ## Neoverse N1
>
>
> --------------------------------------------------------------------------------------------
> Version                                               Baseline             This patch
> --------------------------------------------------------------------------------------------
> Benchmark                  (size)  Mode  Cnt      Score     Error      Score     Error  Units
> --------------------------------------------------------------------------------------------
> ArraysHashCode.bytes            1  avgt   15      1.249 ±   0.060      1.247 ±   0.062  ns/op
> ArraysHashCode.bytes           10  avgt   15      8.754 ±   0.028      4.387 ±   0.015  ns/op
> ArraysHashCode.bytes          100  avgt   15     98.596 ±   0.051     26.655 ±   0.097  ns/op
> ArraysHashCode.bytes        10000  avgt   15  10150.578 ±   1.352   2649.962 ± 216.744  ns/op
> ArraysHashCode.chars            1  avgt   15      1.286 ±   0.062      1.246 ±   0.054  ns/op
> ArraysHashCode.chars           10  avgt   15      8.731 ±   0.002      5.344 ±   0.003  ns/op
> ArraysHashCode.chars          100  avgt   15     98.632 ±   0.048     23.023 ±   0.142  ns/op
> ArraysHashCode.chars        10000  avgt   15  10150.658 ±   3.374   2410.504 ±   8.872  ns/op
> ArraysHashCode.ints             1  avgt   15      1.189 ±   0.005      1.187 ±   0.001  ns/op
> ArraysHashCode.ints            10  avgt   15      8.730 ±   0.002      5.676 ±   0.001  ns/op
> ArraysHashCode.ints           100  avgt   15     98.559 ±   0.016     24.378 ±   0.006  ns/op
> ArraysHashCode.ints         10000  avgt   15  10148.752 ±   1.336   2419.015 ±   0.492  ns/op
> ArraysHashCode.multibytes       1  avgt   15      1.037 ±   0.001      1.037 ±   0.001  ns/op
> ArraysHashCode.multibytes      10  avgt   15      5.4...

> > > I can re-check and post the performance numbers here per a request.
>
> > Please do. Please also post the code.
>
> @theRealAph , you may find the performance numbers and the code in [mikabl-arm at f844b11](https://github.com/mikabl-arm/jdk/commit/f844b116f1a01653f127238d3a258cd2da4e1aca)

OK, thanks. I think I see the problem.
Unfortunately I've come to the end of my working day, but I'll try to get back to you as soon as possible next week. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18487#issuecomment-2066994696 From jvernee at openjdk.org Fri Apr 19 17:50:01 2024 From: jvernee at openjdk.org (Jorn Vernee) Date: Fri, 19 Apr 2024 17:50:01 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v22] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: <1e63Ivvo2ZkyuP3U-RHrnZaUxv1PiKa2UnR5b2H9vpc=.290efaf8-6067-4e92-b7ae-932f6284b4cb@github.com> On Fri, 19 Apr 2024 16:25:28 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. >> >> Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). >> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Address review comments; update copyright years I'm not really qualified as a compiler code reviewer, but I've left some comments to try and help this along. src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2523: > 2521: // Number of (8*X)-byte chunks into rScratch1 > 2522: __ movq(tmp, size); > 2523: __ shrq(tmp, 3); `shr` [sets the zero flag][1], so I think you can just move the jump to after the shift and avoid a separate comparison? 
```suggestion
      // Number of (8*X)-byte chunks into rScratch1
      __ movq(tmp, size);
      __ shrq(tmp, 3);
      __ jccb(Assembler::zero, L_Tail);
```

[1]: https://www.felixcloutier.com/x86/sal:sar:shl:shr#flags-affected

-------------

PR Review: https://git.openjdk.org/jdk/pull/18555#pullrequestreview-2011751831
PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572712233

From jvernee at openjdk.org Fri Apr 19 17:50:06 2024
From: jvernee at openjdk.org (Jorn Vernee)
Date: Fri, 19 Apr 2024 17:50:06 GMT
Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v21]
In-Reply-To:
References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com>
Message-ID:

On Tue, 16 Apr 2024 00:04:15 GMT, Scott Gibbons wrote:

>> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change.
>>
>> Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes.
>>
>> Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`).
>>
>> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt)
>
> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:
>
>   Add enter() and leave(); remove Windows-specific register stuff

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 4013:

> 4011:   // Initialize table for unsafe copy memeory check.
> 4012:   if (UnsafeMemoryAccess::_table == nullptr) {
> 4013:     UnsafeMemoryAccess::create_table(26);

How did you arrive at a table size of 26?

src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2603:

> 2601:   const Register wide_value = rax;
> 2602:   const Register rScratch1 = r10;
> 2603:

Maybe put an `assert_different_registers` here for the above registers, just to be sure.
(I see you are avoiding the existing `rscratch1` already, because of a conflict with `c_rarg2`) src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2674: > 2672: // Parameter order is (ptr, byteVal, size) > 2673: __ xchgq(c_rarg1, c_rarg2); > 2674: __ pop(rbp); // Clear effect of enter() Why not just use `leave()` here? src/hotspot/share/opto/library_call.cpp line 4959: > 4957: if (stopped()) return true; > 4958: > 4959: if (StubRoutines::unsafe_setmemory() == nullptr) return false; I don't see why this check is needed here, since we already check whether the stub is there in `is_intrinsic_supported`. Note that `inline_unsafe_copyMemory` also doesn't have this check. I think it would be good to keep consistency between the two. src/hotspot/share/opto/runtime.cpp line 780: > 778: const TypeFunc* OptoRuntime::make_setmemory_Type() { > 779: // create input type (domain) > 780: int num_args = 4; This variable seems redundant. src/hotspot/share/opto/runtime.cpp line 786: > 784: fields[argp++] = TypePtr::NOTNULL; // dest > 785: fields[argp++] = TypeLong::LONG; // size > 786: fields[argp++] = Type::HALF; // size Since the size is a `size_t`, I don't think this is correct on 32-bit platforms. I think here we want `TypeX_X`, and then add the extra `HALF` only on 64-bit platforms. 
Similar to what we do in `make_arraycopy_Type`: https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/runtime.cpp#L799-L802 (Note that you will also have to adjust `argcnt` for this) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572570842 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572578437 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572593795 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572556648 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572564382 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572562058 From kvn at openjdk.org Fri Apr 19 18:17:01 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 19 Apr 2024 18:17:01 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v21] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: <6BzGvaMr42tgUlEeHinsh7jGrvjBMIuNFijfMWhOSI0=.c65b5638-e247-4b09-9b63-1bf377668947@github.com> On Fri, 19 Apr 2024 15:43:17 GMT, Jorn Vernee wrote: >> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: >> >> Add enter() and leave(); remove Windows-specific register stuff > > src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 4013: > >> 4011: // Initialize table for unsafe copy memeory check. >> 4012: if (UnsafeMemoryAccess::_table == nullptr) { >> 4013: UnsafeMemoryAccess::create_table(26); > > How did you arrive at a table size of 26? 
This needs comment ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572744077 From kvn at openjdk.org Fri Apr 19 18:28:01 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 19 Apr 2024 18:28:01 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v22] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: On Fri, 19 Apr 2024 16:25:28 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. >> >> Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). >> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Address review comments; update copyright years General comment/suggestion before I dive into review. Can we do renaming `UnsafeCopyMemory*` -> `UnsafeMemory*` in follow up RFE. This change hides the real change. src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 8336: > 8334: // Initialize table for copy memory (arraycopy) check. > 8335: if (UnsafeMemoryAccess::_table == nullptr) { > 8336: UnsafeMemoryAccess::create_table(18); Needs comment explaining 18 number src/hotspot/share/utilities/copy.hpp line 303: > 301: inline static void shared_disjoint_words_atomic(const HeapWord* from, > 302: HeapWord* to, size_t count) { > 303: I don't think this justify to change the file. ------------- Changes requested by kvn (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18555#pullrequestreview-2012077574 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572750449 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572746349 From gziemski at openjdk.org Fri Apr 19 18:53:08 2024 From: gziemski at openjdk.org (Gerard Ziemski) Date: Fri, 19 Apr 2024 18:53:08 GMT Subject: RFR: 8324577: [REDO] - [IMPROVE] OPEN_MAX is no longer the max limit on macOS >= 10.6 for RLIMIT_NOFILE Message-ID: <1rubknQG6ntQ32o_dCF64U97R3jfyiyNZOms5-_k14g=.fc79bdea-da14-40cb-a35f-1290ec7e11d7@github.com> This is a 3rd attempt of the same fix: 1st one had to be pulled out because of a bug in zsh 2nd one had a workaround for the bug in zsh, but then uncovered an issue in JWDP (JDK-8324668), which was subsequently fixed. Tested with MACH5 tier1-9 with no unique or new failures on macOS ------------- Commit messages: - Merge remote-tracking branch 'upstream/master' into JDK-8324577 - use higher RLIMIT_NOFILE limit on macosx Changes: https://git.openjdk.org/jdk/pull/18821/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18821&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8324577 Stats: 18 lines in 1 file changed: 9 ins; 0 del; 9 mod Patch: https://git.openjdk.org/jdk/pull/18821.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18821/head:pull/18821 PR: https://git.openjdk.org/jdk/pull/18821 From sgibbons at openjdk.org Fri Apr 19 20:13:03 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Fri, 19 Apr 2024 20:13:03 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v23] In-Reply-To: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: > This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. 
See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. > > Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. > > Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). > > [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Review comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18555/files - new: https://git.openjdk.org/jdk/pull/18555/files/dccf6b6c..dd0094ea Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=22 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=21-22 Stats: 175 lines in 21 files changed: 4 ins; 5 del; 166 mod Patch: https://git.openjdk.org/jdk/pull/18555.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18555/head:pull/18555 PR: https://git.openjdk.org/jdk/pull/18555 From sgibbons at openjdk.org Fri Apr 19 20:13:05 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Fri, 19 Apr 2024 20:13:05 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v22] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: On Fri, 19 Apr 2024 18:25:17 GMT, Vladimir Kozlov wrote: >> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: >> >> Address review comments; update copyright years > > General comment/suggestion before I dive into review. > Can we do renaming `UnsafeCopyMemory*` -> `UnsafeMemory*` in follow up RFE. This change hides the real change. @vnkozlov I un-did the name change and will submit a separate request for re-naming. Thanks. 
> src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 8336: > >> 8334: // Initialize table for copy memory (arraycopy) check. >> 8335: if (UnsafeMemoryAccess::_table == nullptr) { >> 8336: UnsafeMemoryAccess::create_table(18); > > Needs comment explaining 18 number Hmmm... There was no comment explaining the 8 number :-). I added 10 to the table size because I knew I was going to add 7 places where a mark was required. I left 3 for safety. The algorithm has since changed, so I changed this to: `UnsafeCopyMemory::create_table(8 + 4); // 8 for copyMemory; 4 for setMemory` I did a similar change to all other table creation numbers. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18555#issuecomment-2067197605 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572840222 From sgibbons at openjdk.org Fri Apr 19 20:13:05 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Fri, 19 Apr 2024 20:13:05 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v21] In-Reply-To: <6BzGvaMr42tgUlEeHinsh7jGrvjBMIuNFijfMWhOSI0=.c65b5638-e247-4b09-9b63-1bf377668947@github.com> References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> <6BzGvaMr42tgUlEeHinsh7jGrvjBMIuNFijfMWhOSI0=.c65b5638-e247-4b09-9b63-1bf377668947@github.com> Message-ID: On Fri, 19 Apr 2024 18:14:05 GMT, Vladimir Kozlov wrote: >> src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 4013: >> >>> 4011: // Initialize table for unsafe copy memeory check. >>> 4012: if (UnsafeMemoryAccess::_table == nullptr) { >>> 4013: UnsafeMemoryAccess::create_table(26); >> >> How did you arrive at a table size of 26? > > This needs comment I added 10 to the table size because I knew I was going to add 7 places where a mark was required for setMemory. I left 3 for safety. The algorithm changed so only 4 are needed. 
The algorithm has since changed, so I changed this to: `UnsafeCopyMemory::create_table(16 + 4); // 16 for copyMemory; 4 for setMemory` I did a similar change to all other table creation numbers. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572841521 From sgibbons at openjdk.org Fri Apr 19 20:13:06 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Fri, 19 Apr 2024 20:13:06 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v22] In-Reply-To: <1e63Ivvo2ZkyuP3U-RHrnZaUxv1PiKa2UnR5b2H9vpc=.290efaf8-6067-4e92-b7ae-932f6284b4cb@github.com> References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> <1e63Ivvo2ZkyuP3U-RHrnZaUxv1PiKa2UnR5b2H9vpc=.290efaf8-6067-4e92-b7ae-932f6284b4cb@github.com> Message-ID: On Fri, 19 Apr 2024 17:42:36 GMT, Jorn Vernee wrote: >> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: >> >> Address review comments; update copyright years > > src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2523: > >> 2521: // Number of (8*X)-byte chunks into rScratch1 >> 2522: __ movq(tmp, size); >> 2523: __ shrq(tmp, 3); > > `shr` [sets the zero flag][1], so I think you can just move the jump to after the shift and avoid a separate comparison > > ```suggestion > // Number of (8*X)-byte chunks into rScratch1 > __ movq(tmp, size); > __ shrq(tmp, 3); > __ jccb(Assembler::zero, L_Tail); > > > [1]: https://www.felixcloutier.com/x86/sal:sar:shl:shr#flags-affected Good catch. Done. 
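For readers following the `shrq`/`jccb` exchange above, here is a rough C++ analogue of the control flow being discussed. It is only a sketch under stated assumptions: `set_memory_sketch` is a hypothetical name, and the real stub works on registers and has alignment-specific paths that are not modelled here. The point is that the chunk count `size >> 3` is itself the branch condition. On x86, a shift with a non-zero count sets ZF from its result, so the generated code can branch to the tail straight after `shrq` without a separate compare.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical scalar analogue of "fill in 8-byte chunks, then finish the tail".
void set_memory_sketch(void* dest, size_t size, uint8_t value) {
  assert(dest != nullptr || size == 0);
  uint8_t* p = static_cast<uint8_t*>(dest);

  // Replicate the byte into all 8 lanes of a 64-bit word.
  const uint64_t wide = 0x0101010101010101ULL * value;

  // Corresponds to: movq tmp, size; shrq tmp, 3.
  // If the result is zero there are no full chunks and we fall to the tail;
  // in the stub that test is the ZF left behind by the shift itself.
  size_t chunks = size >> 3;
  while (chunks != 0) {
    std::memcpy(p, &wide, sizeof(wide));  // one 8-byte store, alignment-safe
    p += sizeof(wide);
    chunks--;
  }

  // Remaining 0..7 bytes.
  size_t tail = size & 7;
  while (tail != 0) {
    *p++ = value;
    tail--;
  }
}
```

A call such as `set_memory_sketch(buf, 19, 0xAB)` would take two 8-byte iterations and then three single-byte stores.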
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572794443 From sgibbons at openjdk.org Fri Apr 19 20:13:07 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Fri, 19 Apr 2024 20:13:07 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v21] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: On Fri, 19 Apr 2024 15:50:05 GMT, Jorn Vernee wrote: >> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: >> >> Add enter() and leave(); remove Windows-specific register stuff > > src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2603: > >> 2601: const Register wide_value = rax; >> 2602: const Register rScratch1 = r10; >> 2603: > > Maybe put an `assert_different_registers` here for the above registers, just to be sure. (I see you are avoiding the existing `rscratch1` already, because of a conflict with `c_rarg2`) Done. > src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2674: > >> 2672: // Parameter order is (ptr, byteVal, size) >> 2673: __ xchgq(c_rarg1, c_rarg2); >> 2674: __ pop(rbp); // Clear effect of enter() > > Why not just use `leave()` here? No special reason. I've changed it since it seems to provide more clarity. > src/hotspot/share/opto/library_call.cpp line 4959: > >> 4957: if (stopped()) return true; >> 4958: >> 4959: if (StubRoutines::unsafe_setmemory() == nullptr) return false; > > I don't see why this check is needed here, since we already check whether the stub is there in `is_intrinsic_supported`. > > Note that `inline_unsafe_copyMemory` also doesn't have this check. I think it would be good to keep consistency between the two. Removed. > src/hotspot/share/opto/runtime.cpp line 780: > >> 778: const TypeFunc* OptoRuntime::make_setmemory_Type() { >> 779: // create input type (domain) >> 780: int num_args = 4; > > This variable seems redundant. It is. 
It is there due to copy/paste from the other 10 places that also have the same redundant variable declaration. I've removed it from here, but I think I'll be asked to submit a separate PR if I remove it from the other locations. Note that it's also redundant in `make_arraycopy_Type()`. > src/hotspot/share/opto/runtime.cpp line 786: > >> 784: fields[argp++] = TypePtr::NOTNULL; // dest >> 785: fields[argp++] = TypeLong::LONG; // size >> 786: fields[argp++] = Type::HALF; // size > > Since the size is a `size_t`, I don't think this is correct on 32-bit platforms. I think here we want `TypeX_X`, and then add the extra `HALF` only on 64-bit platforms. Similar to what we do in `make_arraycopy_Type`: https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/runtime.cpp#L799-L802 > > (Note that you will also have to adjust `argcnt` for this) I don't understand this well enough to be confident in the change. Can you please verify that I've changed it properly? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572797332 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572800059 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572804660 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572815040 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572823468 From sgibbons at openjdk.org Fri Apr 19 20:13:08 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Fri, 19 Apr 2024 20:13:08 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v21] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: <2jYlnjJmp3oI89OC8iPF3UIFDeabAOAD51VhipzQDE8=.7d126e34-0c72-4e3e-8de4-957cd7f8dc8b@github.com> On Fri, 19 Apr 2024 18:16:33 GMT, Vladimir Kozlov wrote: >> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: >> >> Add enter() 
and leave(); remove Windows-specific register stuff > > src/hotspot/share/utilities/copy.hpp line 303: > >> 301: inline static void shared_disjoint_words_atomic(const HeapWord* from, >> 302: HeapWord* to, size_t count) { >> 303: switch (count) { > > I don't think this justify to change the file. Reverted. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572824249 From jvernee at openjdk.org Fri Apr 19 20:21:32 2024 From: jvernee at openjdk.org (Jorn Vernee) Date: Fri, 19 Apr 2024 20:21:32 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v21] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: On Fri, 19 Apr 2024 19:18:13 GMT, Scott Gibbons wrote: >> src/hotspot/share/opto/runtime.cpp line 786: >> >>> 784: fields[argp++] = TypePtr::NOTNULL; // dest >>> 785: fields[argp++] = TypeLong::LONG; // size >>> 786: fields[argp++] = Type::HALF; // size >> >> Since the size is a `size_t`, I don't think this is correct on 32-bit platforms. I think here we want `TypeX_X`, and then add the extra `HALF` only on 64-bit platforms. Similar to what we do in `make_arraycopy_Type`: https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/runtime.cpp#L799-L802 >> >> (Note that you will also have to adjust `argcnt` for this) > > I don't understand this well enough to be confident in the change. Can you please verify that I've changed it properly? Your latest version looks good to me. 
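The `TypeX_X` / `HALF` point in this sub-thread can be illustrated with a small stand-alone model. Everything below is hypothetical scaffolding rather than HotSpot code: `Slot` and `setmemory_domain_sketch` are invented names, and the type of the final byte-value argument is assumed to be int-sized because the quoted snippet does not show it. The assumption being modelled is the one stated in the review comment: a `size_t` argument is a long plus a trailing half slot on 64-bit platforms, and a single int-sized slot on 32-bit platforms, so the slot count differs between the two.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the kinds of entries in a call signature.
enum class Slot { Ptr, Int, Long, Half };

// Builds the argument slots for a (dest, size, value) call, where `size`
// is pointer-width. `lp64` plays the role of the LP64 build-time switch.
std::vector<Slot> setmemory_domain_sketch(bool lp64) {
  std::vector<Slot> fields;
  fields.push_back(Slot::Ptr);                      // dest
  fields.push_back(lp64 ? Slot::Long : Slot::Int);  // size (pointer-width)
  if (lp64) {
    fields.push_back(Slot::Half);                   // second half of the long
  }
  fields.push_back(Slot::Int);                      // value (assumed int-sized)
  assert(fields.size() == (lp64 ? 4u : 3u));
  return fields;
}
```

Under this model the argument count is 4 on 64-bit and 3 on 32-bit, which is why a hard-coded count has to be adjusted together with the field list.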
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572909435 From kvn at openjdk.org Fri Apr 19 21:11:00 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 19 Apr 2024 21:11:00 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v23] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: On Fri, 19 Apr 2024 20:13:03 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. >> >> Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). >> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Review comments This looks good. I only have question about long vs short jumps in stub's code. src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2550: > 2548: > 2549: // If zero, then we're done > 2550: __ jccb(Assembler::zero, L_exit); Code in `generate_unsafe_setmemory()` uses long jumps to `L_exit` but here you use short. Why? src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2638: > 2636: L_exit, _masm); > 2637: } > 2638: __ jmp(L_exit); Here is long jump to `L_exit` after `do_setmemory_atomic_loop()` call. Should this be also short jump? src/hotspot/share/opto/runtime.cpp line 785: > 783: fields[argp++] = TypePtr::NOTNULL; // dest > 784: fields[argp++] = TypeX_X; // size > 785: LP64_ONLY(fields[argp++] = Type::HALF); // size Nit: align `/` src/hotspot/share/utilities/copy.hpp line 2: > 1: /* > 2: * Copyright (c) 2003, 2024, Oracle and/or its affiliates. 
All rights reserved.

You forgot to undo the year change in this file.

-------------

PR Review: https://git.openjdk.org/jdk/pull/18555#pullrequestreview-2012400269
PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572947954
PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572948693
PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572955327
PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572960023

From kvn at openjdk.org Fri Apr 19 21:11:00 2024
From: kvn at openjdk.org (Vladimir Kozlov)
Date: Fri, 19 Apr 2024 21:11:00 GMT
Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v23]
In-Reply-To:
References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com>
Message-ID: <5qkCM1RfvInEvp3ipImOqWXV7Cdg97BUCApATuR2KnI=.30f00efc-d8cd-4abe-9107-bdfa84df9165@github.com>

On Fri, 19 Apr 2024 20:54:32 GMT, Vladimir Kozlov wrote:

>> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Review comments
>
> src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2638:
>
>> 2636: L_exit, _masm);
>> 2637: }
>> 2638: __ jmp(L_exit);
>
> Here is long jump to `L_exit` after `do_setmemory_atomic_loop()` call. Should this be also short jump?

Do we have additional code in the debug VM which increases the distance and requires a long jump? I don't see it. Usually it is something which calls `__ STOP()`.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572951726 From sgibbons at openjdk.org Fri Apr 19 22:08:52 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Fri, 19 Apr 2024 22:08:52 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v24] In-Reply-To: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: > This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. > > Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. > > Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). > > [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Long to short jmp; other cleanup ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18555/files - new: https://git.openjdk.org/jdk/pull/18555/files/dd0094ea..19616244 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=23 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=22-23 Stats: 4 lines in 3 files changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/18555.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18555/head:pull/18555 PR: https://git.openjdk.org/jdk/pull/18555 From sgibbons at openjdk.org Fri Apr 19 22:08:53 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Fri, 19 Apr 2024 22:08:53 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v23] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> 
Message-ID: On Fri, 19 Apr 2024 20:53:31 GMT, Vladimir Kozlov wrote: >> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: >> >> Review comments > > src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2550: > >> 2548: >> 2549: // If zero, then we're done >> 2550: __ jccb(Assembler::zero, L_exit); > > Code in `generate_unsafe_setmemory()` uses long jumps to `L_exit` but here you use short. Why? Ah - the original code (3 iterations ago) was about 10 bytes too long for a short jump. It's short enough now. Changed. > src/hotspot/share/opto/runtime.cpp line 785: > >> 783: fields[argp++] = TypePtr::NOTNULL; // dest >> 784: fields[argp++] = TypeX_X; // size >> 785: LP64_ONLY(fields[argp++] = Type::HALF); // size > > Nit: align `/` Done > src/hotspot/share/utilities/copy.hpp line 2: > >> 1: /* >> 2: * Copyright (c) 2003, 2024, Oracle and/or its affiliates. All rights reserved. > > You forgot to undo year change in this file. Yup. Done. 
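
Editorial note on the short versus long jump exchange above: on x86, the short forms of `jmp` and `jcc` (the ones `jmpb` and `jccb` request) encode the target as a signed 8-bit displacement measured from the end of the two-byte jump instruction, so a target that is "about 10 bytes too long" simply falls outside the -128..+127 range. The sketch below is a standalone illustration of that range check, not HotSpot code. The helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Size in bytes of a short jump: one opcode byte plus one displacement byte.
constexpr int64_t kShortJumpSize = 2;

// True if the value can be encoded as a signed 8-bit displacement.
constexpr bool fits_in_rel8(int64_t disp) {
  return disp >= -128 && disp <= 127;
}

// True if a short jump located at jump_pos can reach target_pos.
// The displacement is relative to the end of the jump instruction.
constexpr bool short_jump_reaches(int64_t jump_pos, int64_t target_pos) {
  return fits_in_rel8(target_pos - (jump_pos + kShortJumpSize));
}

// Displacement byte for a short jump; the caller must check reachability first.
inline int8_t short_jump_displacement(int64_t jump_pos, int64_t target_pos) {
  assert(short_jump_reaches(jump_pos, target_pos) && "target out of rel8 range");
  return static_cast<int8_t>(target_pos - (jump_pos + kShortJumpSize));
}

// A target 10 bytes past the limit cannot be reached with the short form.
static_assert(short_jump_reaches(0, 129), "127 bytes forward is the limit");
static_assert(!short_jump_reaches(0, 139), "10 bytes past the limit is too far");
```

This is why shrinking the stub body by a handful of bytes can be enough to switch a jump from the long to the short encoding.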
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1573006432
PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1573014982
PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1573015145

From sgibbons at openjdk.org Fri Apr 19 22:08:53 2024
From: sgibbons at openjdk.org (Scott Gibbons)
Date: Fri, 19 Apr 2024 22:08:53 GMT
Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v23]
In-Reply-To: <5qkCM1RfvInEvp3ipImOqWXV7Cdg97BUCApATuR2KnI=.30f00efc-d8cd-4abe-9107-bdfa84df9165@github.com>
References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> <5qkCM1RfvInEvp3ipImOqWXV7Cdg97BUCApATuR2KnI=.30f00efc-d8cd-4abe-9107-bdfa84df9165@github.com>
Message-ID:

On Fri, 19 Apr 2024 20:58:43 GMT, Vladimir Kozlov wrote:

>> src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2638:
>>
>>> 2636: L_exit, _masm);
>>> 2637: }
>>> 2638: __ jmp(L_exit);
>>
>> Here is long jump to `L_exit` after `do_setmemory_atomic_loop()` call. Should this be also short jump?
>
> Do we have additional code in the debug VM which increases the distance and requires a long jump? I don't see it. Usually it is something which calls `__ STOP()`.

The old code required a long jump due to the size of `do_setmemory_atomic_loop`, but it has since been refactored. The `jmp(Label)` code will generate a short jump provided the label has been defined and is in range. Otherwise a long jump is generated.
Changed to `jmpb` ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1573012933 From kvn at openjdk.org Fri Apr 19 22:14:31 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 19 Apr 2024 22:14:31 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v24] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: <7KJIFS8Y1SqIbr847g66L6inpqMEyKXA6mIlrmrsG6o=.071b82ef-a248-41f4-a36c-e7e5ae28dacb@github.com> On Fri, 19 Apr 2024 22:08:52 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. >> >> Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). >> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Long to short jmp; other cleanup Good. I will submit our testing. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18555#issuecomment-2067343940 From duke at openjdk.org Sat Apr 20 02:13:51 2024 From: duke at openjdk.org (Lei Zaakjyu) Date: Sat, 20 Apr 2024 02:13:51 GMT Subject: RFR: 8330694: Rename 'HeapRegion' to 'G1HeapRegion' Message-ID: <3IdWn9VGEERd8v9RcH2E_LzjVo0L8nMfi5jGWmhgVuM=.6b5b3be4-bfbd-4376-9580-48d78d75665c@github.com> follow up 8267941 ------------- Commit messages: - rename Changes: https://git.openjdk.org/jdk/pull/18871/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18871&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8330694 Stats: 978 lines in 124 files changed: 0 ins; 0 del; 978 mod Patch: https://git.openjdk.org/jdk/pull/18871.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18871/head:pull/18871 PR: https://git.openjdk.org/jdk/pull/18871 From cjplummer at openjdk.org Sat Apr 20 03:15:37 2024 From: cjplummer at openjdk.org (Chris Plummer) Date: Sat, 20 Apr 2024 03:15:37 GMT Subject: RFR: 8330694: Rename 'HeapRegion' to 'G1HeapRegion' In-Reply-To: <3IdWn9VGEERd8v9RcH2E_LzjVo0L8nMfi5jGWmhgVuM=.6b5b3be4-bfbd-4376-9580-48d78d75665c@github.com> References: <3IdWn9VGEERd8v9RcH2E_LzjVo0L8nMfi5jGWmhgVuM=.6b5b3be4-bfbd-4376-9580-48d78d75665c@github.com> Message-ID: On Sat, 20 Apr 2024 02:04:20 GMT, Lei Zaakjyu wrote: > follow up 8267941 Functionally the SA changes look good, but I pointed out a few other places that you might also want to consider doing renames for. src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/utilities/PointerFinder.java line 123: > 121: if (heap instanceof G1CollectedHeap) { > 122: G1CollectedHeap g1 = (G1CollectedHeap)heap; > 123: loc.hr = g1.heapRegionForAddress(a); "g1HeapRegionForAddress"? src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/utilities/PointerFinder.java line 126: > 124: // We don't assert that loc.hr is not null like we do for the SerialHeap. 
This is > 125: // because heap.isIn(a) can return true if the address is anywhere in G1's mapped > 126: // memory, even if that area of memory is not in use by a G1 G1HeapRegion. So there Suggestion: // memory, even if that area of memory is not in use by a G1HeapRegion. So there src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/utilities/PointerLocation.java line 131: > 129: } > 130: > 131: public G1HeapRegion getHeapRegion() { Do we want to rename to getG1HeapRegion? test/hotspot/jtreg/serviceability/sa/TestG1HeapRegion.java line 62: > 60: agent.attach(Integer.parseInt(pid)); > 61: G1CollectedHeap heap = (G1CollectedHeap)VM.getVM().getUniverse().heap(); > 62: G1HeapRegion hr = heap.hrm().heapRegionIterator().next(); "g1HeapRegionIterator"? ------------- PR Review: https://git.openjdk.org/jdk/pull/18871#pullrequestreview-2013005488 PR Review Comment: https://git.openjdk.org/jdk/pull/18871#discussion_r1573125957 PR Review Comment: https://git.openjdk.org/jdk/pull/18871#discussion_r1573124198 PR Review Comment: https://git.openjdk.org/jdk/pull/18871#discussion_r1573124436 PR Review Comment: https://git.openjdk.org/jdk/pull/18871#discussion_r1573125643 From iklam at openjdk.org Sat Apr 20 03:54:34 2024 From: iklam at openjdk.org (Ioi Lam) Date: Sat, 20 Apr 2024 03:54:34 GMT Subject: RFR: 8330540: Rename the enum type CompileCommand to CompileCommandEnum In-Reply-To: References: Message-ID: On Thu, 18 Apr 2024 21:49:34 GMT, Dean Long wrote: >> `CompileCommand` is used both as a enum type ([compilerOracle.hpp](https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/compiler/compilerOracle.hpp#L104)), and a global variable ([compiler_globals.hpp](https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/compiler/compiler_globals.hpp#L304)). 
>>
>> This makes the enum type very awkward to use -- we are forced to use `enum CompileCommand` in the source code whenever a type is needed:
>>
>> This simple C++ file illustrates the problem:
>>
>>     enum class CompileCommand { a, b, c };
>>     void foo(CompileCommand x) {}
>>     char* CompileCommand; // can no longer use "CompileCommand" as a type
>>     void good(enum CompileCommand x) {}
>>     void bad(CompileCommand x) {}
>>
>>     $ g++ -c ~/enum.cpp
>>     /home/iklam/enum.cpp:5:6: error: variable or field ‘bad’ declared void
>>         5 | void bad(CompileCommand x) {}
>>
>>
>> The fix is to rename the enum type to `CompileCommandEnum`.
>>
>> This also makes it possible to forward-declare `CompileCommandEnum` (see vmEnum.hpp) without including compilerOracle.hpp. This improves HotSpot build time by reducing the number of .o files that include compilerOracle.hpp from 456 to 16.
>
> The char* CompileCommand for the flag seems to be used much less than the enum (I could only find it used once). So if we could rename that variable, that would be a simpler solution. But it looks like that would require some advanced macro tricks in DECLARE_FLAGS to allow customization on a per-flag basis.

Thanks @dean-long and @vnkozlov for the review

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18829#issuecomment-2067537671

From iklam at openjdk.org Sat Apr 20 03:54:35 2024
From: iklam at openjdk.org (Ioi Lam)
Date: Sat, 20 Apr 2024 03:54:35 GMT
Subject: Integrated: 8330540: Rename the enum type CompileCommand to CompileCommandEnum
In-Reply-To:
References:
Message-ID:

On Wed, 17 Apr 2024 21:26:06 GMT, Ioi Lam wrote:

> `CompileCommand` is used both as an enum type ([compilerOracle.hpp](https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/compiler/compilerOracle.hpp#L104)), and a global variable ([compiler_globals.hpp](https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/compiler/compiler_globals.hpp#L304)).
>
> This makes the enum type very awkward to use -- we are forced to use `enum CompileCommand` in the source code whenever a type is needed:
>
> This simple C++ file illustrates the problem:
>
>     enum class CompileCommand { a, b, c };
>     void foo(CompileCommand x) {}
>     char* CompileCommand; // can no longer use "CompileCommand" as a type
>     void good(enum CompileCommand x) {}
>     void bad(CompileCommand x) {}
>
>     $ g++ -c ~/enum.cpp
>     /home/iklam/enum.cpp:5:6: error: variable or field ‘bad’ declared void
>         5 | void bad(CompileCommand x) {}
>
>
> The fix is to rename the enum type to `CompileCommandEnum`.
>
> This also makes it possible to forward-declare `CompileCommandEnum` (see vmEnum.hpp) without including compilerOracle.hpp. This improves HotSpot build time by reducing the number of .o files that include compilerOracle.hpp from 456 to 16.

This pull request has now been integrated.

Changeset: 6d569961
Author: Ioi Lam
URL: https://git.openjdk.org/jdk/commit/6d5699617ff0985104a8bb5f2c9eb8887cb0961e
Stats: 138 lines in 19 files changed: 11 ins; 4 del; 123 mod

8330540: Rename the enum type CompileCommand to CompileCommandEnum

Reviewed-by: kvn, dlong

-------------

PR: https://git.openjdk.org/jdk/pull/18829

From kbarrett at openjdk.org Sat Apr 20 04:21:28 2024
From: kbarrett at openjdk.org (Kim Barrett)
Date: Sat, 20 Apr 2024 04:21:28 GMT
Subject: RFR: 8330694: Rename 'HeapRegion' to 'G1HeapRegion'
In-Reply-To: <3IdWn9VGEERd8v9RcH2E_LzjVo0L8nMfi5jGWmhgVuM=.6b5b3be4-bfbd-4376-9580-48d78d75665c@github.com>
References: <3IdWn9VGEERd8v9RcH2E_LzjVo0L8nMfi5jGWmhgVuM=.6b5b3be4-bfbd-4376-9580-48d78d75665c@github.com>
Message-ID:

On Sat, 20 Apr 2024 02:04:20 GMT, Lei Zaakjyu wrote:

> follow up 8267941

So much scrolling :)

Looks good. Just a few very minor nits for which I don't need to re-review.
src/hotspot/share/cds/archiveHeapWriter.cpp line 90: > 88: > 89: guarantee(UseG1GC, "implementation limitation"); > 90: guarantee(MIN_GC_REGION_ALIGNMENT <= /*G1*/G1HeapRegion::min_region_size_in_words() * HeapWordSize, "must be"); The "/*G1*/" comment should be removed. src/hotspot/share/gc/g1/g1ConcurrentMark.hpp line 761: > 759: > 760: // Region this task is scanning, null if we're not scanning any > 761: G1HeapRegion* _curr_region; Adjust indentation of member name to (re)match those nearby. src/hotspot/share/gc/g1/g1ConcurrentMarkBitMap.hpp line 39: > 37: class G1CMTask; > 38: class G1ConcurrentMark; > 39: class G1HeapRegion; With this forward declaration rename being the only change, I wonder if the declaration is even needed. Try deleting it, but keep if removing produces non-trivial effects elsewhere. src/hotspot/share/gc/g1/g1FullGCCompactionPoint.hpp line 40: > 38: G1FullCollector* _collector; > 39: G1HeapRegion* _current_region; > 40: HeapWord* _compaction_top; Tidy indentation of `_compaction_top`. My preference would be to remove extra whitespace before it, rather than adding to (re)line up with new position of `_current_region`. src/hotspot/share/gc/g1/g1OopClosures.hpp line 33: > 31: #include "oops/markWord.hpp" > 32: > 33: class G1HeapRegion; With no other changes in this file, maybe this forward declaration isn't needed at all? But keep for now if removal leads to non-trivial additional changes. ------------- Marked as reviewed by kbarrett (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18871#pullrequestreview-2013013903 PR Review Comment: https://git.openjdk.org/jdk/pull/18871#discussion_r1573137157 PR Review Comment: https://git.openjdk.org/jdk/pull/18871#discussion_r1573138750 PR Review Comment: https://git.openjdk.org/jdk/pull/18871#discussion_r1573139162 PR Review Comment: https://git.openjdk.org/jdk/pull/18871#discussion_r1573140388 PR Review Comment: https://git.openjdk.org/jdk/pull/18871#discussion_r1573141541 From kvn at openjdk.org Sat Apr 20 04:31:34 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Sat, 20 Apr 2024 04:31:34 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v24] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: <0cg24YXFi4foGH_uKTY6JmABMhzjMH6gmH78iE0CC4w=.a52937a2-d728-4616-b158-a2a338cbb6f4@github.com> On Fri, 19 Apr 2024 22:08:52 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. >> >> Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). 
>>
>> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt)
>
> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:
>
>   Long to short jmp; other cleanup

`runtime/Unsafe/InternalErrorTest.java` test fails with SIGBUS when run with `-Xcomp` (and other flags in the test's @run command):

# SIGBUS (0xa) at pc=0x0000000119514760, pid=63021, tid=28163
#
# JRE version: Java(TM) SE Runtime Environment (23.0) (fastdebug build 23-internal-2024-04-19-2326152.vladimir.kozlov.jdkgit2)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 23-internal-2024-04-19-2326152.vladimir.kozlov.jdkgit2, compiled mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, bsd-amd64)
# Problematic frame:
# v ~StubRoutines::jbyte_fill 0x0000000119514760

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18555#issuecomment-2067547078

From duke at openjdk.org Sat Apr 20 06:21:41 2024
From: duke at openjdk.org (Lei Zaakjyu)
Date: Sat, 20 Apr 2024 06:21:41 GMT
Subject: RFR: 8330694: Rename 'HeapRegion' to 'G1HeapRegion' [v2]
In-Reply-To: <3IdWn9VGEERd8v9RcH2E_LzjVo0L8nMfi5jGWmhgVuM=.6b5b3be4-bfbd-4376-9580-48d78d75665c@github.com>
References: <3IdWn9VGEERd8v9RcH2E_LzjVo0L8nMfi5jGWmhgVuM=.6b5b3be4-bfbd-4376-9580-48d78d75665c@github.com>
Message-ID:

> follow up 8267941

Lei Zaakjyu has updated the pull request incrementally with one additional commit since the last revision:

  tidy up

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/18871/files
  - new: https://git.openjdk.org/jdk/pull/18871/files/4fe5badc..cae3efd0

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=18871&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18871&range=00-01

  Stats: 5 lines in 5 files changed: 0 ins; 2 del; 3 mod
  Patch: https://git.openjdk.org/jdk/pull/18871.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18871/head:pull/18871

PR: https://git.openjdk.org/jdk/pull/18871

From duke at openjdk.org Sat
Apr 20 08:44:45 2024 From: duke at openjdk.org (Lei Zaakjyu) Date: Sat, 20 Apr 2024 08:44:45 GMT Subject: RFR: 8330694: Rename 'HeapRegion' to 'G1HeapRegion' [v3] In-Reply-To: <3IdWn9VGEERd8v9RcH2E_LzjVo0L8nMfi5jGWmhgVuM=.6b5b3be4-bfbd-4376-9580-48d78d75665c@github.com> References: <3IdWn9VGEERd8v9RcH2E_LzjVo0L8nMfi5jGWmhgVuM=.6b5b3be4-bfbd-4376-9580-48d78d75665c@github.com> Message-ID: > follow up 8267941 Lei Zaakjyu has updated the pull request incrementally with one additional commit since the last revision: also tidy up ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18871/files - new: https://git.openjdk.org/jdk/pull/18871/files/cae3efd0..f02334fd Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18871&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18871&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18871.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18871/head:pull/18871 PR: https://git.openjdk.org/jdk/pull/18871 From jbhateja at openjdk.org Sat Apr 20 14:25:33 2024 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Sat, 20 Apr 2024 14:25:33 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v24] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: On Fri, 19 Apr 2024 22:08:52 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. >> >> Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). 
>>
>> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt)
>
> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:
>
>   Long to short jmp; other cleanup

src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2530:

> 2528: switch (type) {
> 2529: case USM_SHORT:
> 2530: __ movw(Address(dest, (2 * i)), wide_value);

MOVW emits an extra operand-size override prefix byte compared to 32- and 64-bit stores. Is there any specific reason for keeping the same unroll factor for all the stores?

src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2539:

> 2537: break;
> 2538: }
> 2539: }

I understand we want to be as accurate as possible in filling the tail in the event of a SIGBUS, but we are creating a wide value for 8 packed bytes anyway if the destination segment is quadword aligned, and aligned quadword stores are implicitly atomic on x86 targets. What are your thoughts on using a vector-instruction-based loop?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1573297441
PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1573299069

From dcubed at openjdk.org Sat Apr 20 14:43:33 2024
From: dcubed at openjdk.org (Daniel D. Daugherty)
Date: Sat, 20 Apr 2024 14:43:33 GMT
Subject: RFR: 8324577: [REDO] - [IMPROVE] OPEN_MAX is no longer the max limit on macOS >= 10.6 for RLIMIT_NOFILE
In-Reply-To: <1rubknQG6ntQ32o_dCF64U97R3jfyiyNZOms5-_k14g=.fc79bdea-da14-40cb-a35f-1290ec7e11d7@github.com>
References: <1rubknQG6ntQ32o_dCF64U97R3jfyiyNZOms5-_k14g=.fc79bdea-da14-40cb-a35f-1290ec7e11d7@github.com>
Message-ID:

On Wed, 17 Apr 2024 16:49:25 GMT, Gerard Ziemski wrote:

> This is a 3rd attempt of the same fix:
>
> 1st one had to be pulled out because of a bug in zsh
> 2nd one had a workaround for the bug in zsh, but then uncovered an issue in JDWP (JDK-8324668), which was subsequently fixed.
>
> Tested with MACH5 tier1-9 with no unique or new failures on macOS

I compared this patch to the previous patch (8300088) and it is the same in the core part of the fix. All but one of the editorial changes from 8300088 have been dropped, which is good for a backport. It would be good if you revived all of the editorial fixes from 8300088 and integrated them into the main line using a separate RFE.

Thanks for documenting your testing.

src/hotspot/os/bsd/os_bsd.cpp line 2136:

> 2134:
> 2135: if (MaxFDLimit) {
> 2136: // Set the number of file descriptors to max. print out error

You dropped the other editorial fixes, but only kept part of this one. In the previous version of the patch (8300088), you also fixed: s/print/Print/

I would drop this editorial fix also.

-------------

Marked as reviewed by dcubed (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/18821#pullrequestreview-2013099071
PR Review Comment: https://git.openjdk.org/jdk/pull/18821#discussion_r1573305188

From sgibbons at openjdk.org Sat Apr 20 19:09:43 2024
From: sgibbons at openjdk.org (Scott Gibbons)
Date: Sat, 20 Apr 2024 19:09:43 GMT
Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v25]
In-Reply-To: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com>
References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com>
Message-ID: <3Z1vY5KHl-D5I1VoQYb6w0B1QToR0cVOnOov_vfrAe0=.d7e4944e-b781-477a-862b-dc067fab9d13@github.com>

> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change.
>
> Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes.
>
> Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`).
> > [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Fix UnsafeCopyMemoryMark scope issue ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18555/files - new: https://git.openjdk.org/jdk/pull/18555/files/19616244..c1290169 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=24 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=23-24 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18555.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18555/head:pull/18555 PR: https://git.openjdk.org/jdk/pull/18555 From sgibbons at openjdk.org Sat Apr 20 19:09:44 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Sat, 20 Apr 2024 19:09:44 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v24] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: On Fri, 19 Apr 2024 22:08:52 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. >> >> Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). >> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Long to short jmp; other cleanup The SIGBUS was due to improper scoping of the UnsafeCopyMemoryMark. 
The change is:

` {`
` // Add set memory mark to protect against unsafe accesses faulting`
`- UnsafeCopyMemoryMark(this, ((t == T_BYTE) && !aligned), true);`
`+ UnsafeCopyMemoryMark usmm(this, ((t == T_BYTE) && !aligned), true);`
` __ generate_fill(t, aligned, to, value, r11, rax, xmm0);`
` }`

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18555#issuecomment-2067758164

From sgibbons at openjdk.org Sat Apr 20 19:09:44 2024
From: sgibbons at openjdk.org (Scott Gibbons)
Date: Sat, 20 Apr 2024 19:09:44 GMT
Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v24]
In-Reply-To:
References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com>
Message-ID:

On Sat, 20 Apr 2024 14:14:59 GMT, Jatin Bhateja wrote:

>> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Long to short jmp; other cleanup
>
> src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2530:
>
>> 2528: switch (type) {
>> 2529: case USM_SHORT:
>> 2530: __ movw(Address(dest, (2 * i)), wide_value);
>
> MOVW emits an extra operand-size override prefix byte compared to 32- and 64-bit stores. Is there any specific reason for keeping the same unroll factor for all the stores?

My understanding is the spec requires the appropriate-sized write based on alignment and size. This is why there are no 128-bit or 256-bit store loops.

> src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2539:
>
>> 2537: break;
>> 2538: }
>> 2539: }
>
> I understand we want to be as accurate as possible in filling the tail in the event of a SIGBUS, but we are creating a wide value for 8 packed bytes anyway if the destination segment is quadword aligned, and aligned quadword stores are implicitly atomic on x86 targets. What are your thoughts on using a vector-instruction-based loop?

I believe the spec is specific on the size of the store required given alignment and size.
I want to honor that spec even though wider stores could be done in many cases. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1573373720 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1573374108 From sgibbons at openjdk.org Sat Apr 20 19:14:32 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Sat, 20 Apr 2024 19:14:32 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v24] In-Reply-To: <0cg24YXFi4foGH_uKTY6JmABMhzjMH6gmH78iE0CC4w=.a52937a2-d728-4616-b158-a2a338cbb6f4@github.com> References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> <0cg24YXFi4foGH_uKTY6JmABMhzjMH6gmH78iE0CC4w=.a52937a2-d728-4616-b158-a2a338cbb6f4@github.com> Message-ID: On Sat, 20 Apr 2024 04:28:43 GMT, Vladimir Kozlov wrote: >> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: >> >> Long to short jmp; other cleanup > > `runtime/Unsafe/InternalErrorTest.java` test SIGBUS when run with `-Xcomp` (and other flags in test's @run command): > > # SIGBUS (0xa) at pc=0x0000000119514760, pid=63021, tid=28163 > # > # JRE version: Java(TM) SE Runtime Environment (23.0) (fastdebug build 23-internal-2024-04-19-2326152.vladimir.kozlov.jdkgit2) > # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 23-internal-2024-04-19-2326152.vladimir.kozlov.jdkgit2, compiled mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, bsd-amd64) > # Problematic frame: > # v ~StubRoutines::jbyte_fill 0x0000000119514760 @vnkozlov Thanks for the feedback. Can you please start the testing again? I'd appreciate it. 
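
Editorial note on the `UnsafeCopyMemoryMark` scoping fix described earlier in this thread: the underlying C++ rule is that an unnamed RAII object is a temporary and is destroyed at the end of the statement that creates it, while a named object lives until the closing brace of its scope. The sketch below reproduces that difference with a hypothetical `ScopedMark` class standing in for the real mark. It is not the HotSpot class and only tracks whether the guard is active while the "protected" work runs.

```cpp
#include <cassert>

// Hypothetical stand-in for an RAII mark: sets a flag on construction
// and clears it on destruction.
class ScopedMark {
 public:
  ScopedMark(bool& active, bool enable) : _active(active), _enable(enable) {
    if (_enable) _active = true;
  }
  ~ScopedMark() {
    if (_enable) _active = false;
  }
  ScopedMark(const ScopedMark&) = delete;
  ScopedMark& operator=(const ScopedMark&) = delete;

 private:
  bool& _active;
  bool _enable;
};

// Buggy pattern: the temporary is destroyed at the end of its own statement,
// so the following work runs without the mark.
inline bool mark_seen_with_temporary() {
  bool active = false;
  bool seen = false;
  {
    ScopedMark(active, true);  // unnamed temporary, gone after this line
    seen = active;             // the "protected" work observes no mark
  }
  return seen;
}

// Fixed pattern: the named object stays alive until the closing brace.
inline bool mark_seen_with_named_object() {
  bool active = false;
  bool seen = false;
  {
    ScopedMark mark(active, true);  // named, covers the whole block
    seen = active;                  // the "protected" work observes the mark
  }
  assert(!active && "mark must be released at scope exit");
  return seen;
}
```

In this sketch `mark_seen_with_temporary()` returns false and `mark_seen_with_named_object()` returns true, which mirrors why giving the mark a name (`usmm` in the patch) restores the intended coverage of the fill code.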
------------- PR Comment: https://git.openjdk.org/jdk/pull/18555#issuecomment-2067759300 From kvn at openjdk.org Sat Apr 20 20:48:31 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Sat, 20 Apr 2024 20:48:31 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v25] In-Reply-To: <3Z1vY5KHl-D5I1VoQYb6w0B1QToR0cVOnOov_vfrAe0=.d7e4944e-b781-477a-862b-dc067fab9d13@github.com> References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> <3Z1vY5KHl-D5I1VoQYb6w0B1QToR0cVOnOov_vfrAe0=.d7e4944e-b781-477a-862b-dc067fab9d13@github.com> Message-ID: On Sat, 20 Apr 2024 19:09:43 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. >> >> Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). >> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Fix UnsafeCopyMemoryMark scope issue Before I do testing, please sync with mainline. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18555#issuecomment-2067777569 From sgibbons at openjdk.org Sat Apr 20 22:31:48 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Sat, 20 Apr 2024 22:31:48 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v26] In-Reply-To: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: > This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. 
See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. > > Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. > > Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). > > [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) Scott Gibbons has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 37 commits: - Merge branch 'openjdk:master' into setMemory - Fix UnsafeCopyMemoryMark scope issue - Long to short jmp; other cleanup - Review comments - Address review comments; update copyright years - Add enter() and leave(); remove Windows-specific register stuff - Fix memory mark after sync to upstream - Merge branch 'openjdk:master' into setMemory - Set memory test (#23) * Even more review comments * Re-write of atomic copy loops * Change name of UnsafeCopyMemory{,Mark} to UnsafeMemory{Access,Mark} * Only add a memory mark for byte unaligned fill * Remove MUSL_LIBC ifdef * Remove MUSL_LIBC ifdef - Set memory test (#22) * Even more review comments * Re-write of atomic copy loops * Change name of UnsafeCopyMemory{,Mark} to UnsafeMemory{Access,Mark} * Only add a memory mark for byte unaligned fill - ... 
and 27 more: https://git.openjdk.org/jdk/compare/6d569961...1122b500 ------------- Changes: https://git.openjdk.org/jdk/pull/18555/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=25 Stats: 507 lines in 36 files changed: 420 ins; 5 del; 82 mod Patch: https://git.openjdk.org/jdk/pull/18555.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18555/head:pull/18555 PR: https://git.openjdk.org/jdk/pull/18555 From sgibbons at openjdk.org Sat Apr 20 22:31:48 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Sat, 20 Apr 2024 22:31:48 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v25] In-Reply-To: <3Z1vY5KHl-D5I1VoQYb6w0B1QToR0cVOnOov_vfrAe0=.d7e4944e-b781-477a-862b-dc067fab9d13@github.com> References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> <3Z1vY5KHl-D5I1VoQYb6w0B1QToR0cVOnOov_vfrAe0=.d7e4944e-b781-477a-862b-dc067fab9d13@github.com> Message-ID: On Sat, 20 Apr 2024 19:09:43 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. >> >> Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). >> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Fix UnsafeCopyMemoryMark scope issue Merge done. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18555#issuecomment-2067803696 From stuefe at openjdk.org Sun Apr 21 06:56:15 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Sun, 21 Apr 2024 06:56:15 GMT Subject: RFR: 8330677: Add Per-Compilation memory usage to JFR Message-ID: <30ZOBcQtn7hbUWoZ_p034H38c6meZmFQoymtSP0L7oM=.e0d297a8-3a53-4d9b-b26e-2eb4d93549e0@github.com> We have the (opt-in, disabled by default) compiler memory statistics introduced with [JDK-8317683](https://bugs.openjdk.org/browse/JDK-8317683). Since temporary memory usage by compilers can significantly affect process footprint, it would make sense to expose at least the total peak usage per compilation via JFR. --- This patch adds "arena usage" to CompilationEvent. We can now see in JMC how costly a compilation has been. (The cost can get very high, as we have seen just recently again with [JDK-8327247](https://bugs.openjdk.org/browse/JDK-8327247).) ![jmc-memstat](https://github.com/openjdk/jdk/assets/6041414/8cac366a-2a8f-45ca-be40-d419712f81a7) ------------- Commit messages: - JDK-8330677-Add-Per-Compilation-memory-usage-to-JFR Changes: https://git.openjdk.org/jdk/pull/18864/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18864&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8330677 Stats: 22 lines in 7 files changed: 16 ins; 0 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/18864.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18864/head:pull/18864 PR: https://git.openjdk.org/jdk/pull/18864 From jsjolen at openjdk.org Sun Apr 21 10:21:57 2024 From: jsjolen at openjdk.org (Johan Sjölen) Date: Sun, 21 Apr 2024 10:21:57 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v42] In-Reply-To: References: Message-ID: > Hi, > > This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space.
This means that if you allocate memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix. > > ## `MemoryFileTracker` > > The `MemoryFileTracker` adds the ability to add new virtual memory address spaces to NMT and to commit memory to these. The basic API is: > > ```c++ > static MemoryFile* make_device(const char* descriptive_name); > static void free_device(MemoryFile* device); > > static void allocate_memory(MemoryFile* device, size_t offset, size_t size, > MEMFLAGS flag, const NativeCallStack& stack); > static void free_memory(MemoryFile* device, size_t offset, size_t size); > ``` > > It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: > > ```c++ > void ZNMT::reserve(zaddress_unsafe start, size_t size) { > MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); > } > void ZNMT::commit(zoffset offset, size_t size) { > MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC); > } > void ZNMT::uncommit(zoffset offset, size_t size) { > MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); > } > > void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { > // NMT doesn't track mappings at the moment. > } > void ZNMT::unmap(zaddress_unsafe addr, size_t size) { > // NMT doesn't track mappings at the moment. > } > ``` > > As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory; the device-allocated memory is reported separately. When performing summary reporting, any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory.
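As a rough illustration of the accounting described in the quoted description (not the patch's actual code), the following self-contained C++ sketch models a memory file as its own offset space and keeps a per-tag committed total, which is what a summary report would read. The names `ToyMemoryFile`, `ToyTracker`, and `Tag` are hypothetical stand-ins for `MemoryFile`, `MemoryFileTracker`, and `MEMFLAGS`. The sketch also assumes, for simplicity, that a range is always freed by the same offset it was committed at.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for HotSpot's MEMFLAGS.
enum class Tag { JavaHeap, Other };

// Hypothetical toy model of a memory file: it has its own offset space,
// independent of the process's virtual address space.
struct ToyMemoryFile {
  struct Range {
    std::size_t size;
    Tag tag;
  };
  std::string name;
  std::map<std::size_t, Range> committed;  // start offset -> committed range
};

// Hypothetical tracker: records commits per file and sums them per tag.
class ToyTracker {
 public:
  // Record [offset, offset + size) as committed in `file` under `tag`.
  void allocate_memory(ToyMemoryFile& file, std::size_t offset,
                       std::size_t size, Tag tag) {
    free_memory(file, offset);  // drop any range already starting here
    file.committed[offset] = ToyMemoryFile::Range{size, tag};
    committed_[tag] += size;
  }

  // Forget the range that starts at `offset`, if one was recorded.
  void free_memory(ToyMemoryFile& file, std::size_t offset) {
    auto it = file.committed.find(offset);
    if (it == file.committed.end()) return;
    committed_[it->second.tag] -= it->second.size;
    file.committed.erase(it);
  }

  // Summary view: total bytes committed under `tag` across all files.
  std::size_t committed(Tag tag) const {
    auto it = committed_.find(tag);
    return it == committed_.end() ? 0 : it->second;
  }

 private:
  std::map<Tag, std::size_t> committed_;
};
```

Note that nothing in the sketch links a file offset to a virtual address, which mirrors the statement above that mappings between reserved regions and device-allocated memory are not recorded.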
> > This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: > > 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle; we'd like to move away from it. Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this... Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: Use functor instead of function pointer ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18289/files - new: https://git.openjdk.org/jdk/pull/18289/files/4b51328c..0c0bfbfa Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=41 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=40-41 Stats: 25 lines in 2 files changed: 3 ins; 0 del; 22 mod Patch: https://git.openjdk.org/jdk/pull/18289.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289 PR: https://git.openjdk.org/jdk/pull/18289 From jsjolen at openjdk.org Sun Apr 21 10:21:57 2024 From: jsjolen at openjdk.org (Johan Sjölen) Date: Sun, 21 Apr 2024 10:21:57 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v42] In-Reply-To: References: Message-ID: On Sun, 21 Apr 2024 10:18:15 GMT, Johan Sjölen wrote: >> Hi, >> >> This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocate memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix.
>> >> ## `MemoryFileTracker` >> >> The `MemoryFileTracker` adds the ability to add new virtual memory address spaces to NMT and to commit memory to these. The basic API is: >> >> ```c++ >> static MemoryFile* make_device(const char* descriptive_name); >> static void free_device(MemoryFile* device); >> >> static void allocate_memory(MemoryFile* device, size_t offset, size_t size, >> MEMFLAGS flag, const NativeCallStack& stack); >> static void free_memory(MemoryFile* device, size_t offset, size_t size); >> ``` >> >> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: >> >> ```c++ >> void ZNMT::reserve(zaddress_unsafe start, size_t size) { >> MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); >> } >> void ZNMT::commit(zoffset offset, size_t size) { >> MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC); >> } >> void ZNMT::uncommit(zoffset offset, size_t size) { >> MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); >> } >> >> void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { >> // NMT doesn't track mappings at the moment. >> } >> void ZNMT::unmap(zaddress_unsafe addr, size_t size) { >> // NMT doesn't track mappings at the moment. >> } >> ``` >> >> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory; the device-allocated memory is reported separately. When performing summary reporting, any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. >> >> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: >> >> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle; we'd like to move away from it.
Our Treap-based approach in this patch gives a performance bo... > > Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: > > Use functor instead of function pointer Hi @tstuefe, I cleaned up the Treap significantly; it looks way better now! Thanks for the ideas. ------------- PR Review: https://git.openjdk.org/jdk/pull/18289#pullrequestreview-2013369259 From jsjolen at openjdk.org Sun Apr 21 10:21:57 2024 From: jsjolen at openjdk.org (Johan Sjölen) Date: Sun, 21 Apr 2024 10:21:57 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v5] In-Reply-To: References: <-XAziSwGMo20pUAnbdRW1JUk_0ZB-80RVfAHr0iuewE=.bff8f2f7-01e2-46eb-bd4b-1b16fccc6aa1@github.com> <3al4DjsRcIX_qJZNbTGqBDIAOj4bU5l8xpYPHQE8cNM=.7cc0bdfe-c9c8-46ce-ad42-397c61b5a603@github.com> <0G_oRg-MB6aRKXpHJ4ca8lIQ72ZhsA2WBujtJ8BQaD0=.bbbc53c2-cb49-4051-998e-e9e48e4ea516@github.com> Message-ID: On Wed, 17 Apr 2024 05:19:03 GMT, Thomas Stuefe wrote: >> One annoying part about such a design is that you can't use pointers to values as keys, they must now be wrapped within their own type (like `StackIndex` does). Yes, there's a bit of a smell of YAGNI here, but at least `std::set` agrees with me on `Compare` being a template argument. > > Could the comparator be a functor then? Something with a static compare function? Sure, I've got a commit changing it out now. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1573700023 From jsjolen at openjdk.org Sun Apr 21 10:28:58 2024 From: jsjolen at openjdk.org (Johan Sjölen) Date: Sun, 21 Apr 2024 10:28:58 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v43] In-Reply-To: References: Message-ID: > Hi, > > This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space.
This means that if you allocate memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix. > > ## `MemoryFileTracker` > > The `MemoryFileTracker` adds the ability to add new virtual memory address spaces to NMT and to commit memory to these. The basic API is: > > ```c++ > static MemoryFile* make_device(const char* descriptive_name); > static void free_device(MemoryFile* device); > > static void allocate_memory(MemoryFile* device, size_t offset, size_t size, > MEMFLAGS flag, const NativeCallStack& stack); > static void free_memory(MemoryFile* device, size_t offset, size_t size); > ``` > > It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: > > ```c++ > void ZNMT::reserve(zaddress_unsafe start, size_t size) { > MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); > } > void ZNMT::commit(zoffset offset, size_t size) { > MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC); > } > void ZNMT::uncommit(zoffset offset, size_t size) { > MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); > } > > void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { > // NMT doesn't track mappings at the moment. > } > void ZNMT::unmap(zaddress_unsafe addr, size_t size) { > // NMT doesn't track mappings at the moment. > } > ``` > > As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory; the device-allocated memory is reported separately. When performing summary reporting, any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory.
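The review exchange earlier in this thread, about turning the comparator into a functor with a static compare function (and the follow-up commit mentioning `COMPARATOR::cmp`), can be sketched as below. This is only an assumption about the general shape, not the code in the PR: the three-way `cmp` signature, `ReverseAddressComparator`, and `ToySortedSet` are hypothetical, and a sorted vector stands in for the Treap to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical comparator functor: a type exposing a static three-way cmp().
struct AddressComparator {
  static int cmp(std::size_t a, std::size_t b) {
    if (a < b) return -1;
    if (a > b) return 1;
    return 0;
  }
};

// A second comparator, to show the ordering can be swapped at compile time
// without touching the container.
struct ReverseAddressComparator {
  static int cmp(std::size_t a, std::size_t b) {
    return AddressComparator::cmp(b, a);
  }
};

// Toy ordered set backed by a sorted vector (a stand-in for a Treap).
// The comparator is a template argument, in the spirit of std::set's Compare.
template <typename K, typename COMPARATOR>
class ToySortedSet {
 public:
  // Insert `key` in sorted position; returns false if it is already present.
  bool insert(const K& key) {
    std::size_t lo = 0;
    std::size_t hi = keys_.size();
    while (lo < hi) {  // binary search driven only by COMPARATOR::cmp
      std::size_t mid = lo + (hi - lo) / 2;
      int c = COMPARATOR::cmp(keys_[mid], key);
      if (c == 0) return false;
      if (c < 0) {
        lo = mid + 1;
      } else {
        hi = mid;
      }
    }
    keys_.insert(keys_.begin() + static_cast<std::ptrdiff_t>(lo), key);
    return true;
  }

  const std::vector<K>& keys() const { return keys_; }

 private:
  std::vector<K> keys_;
};
```

Because the comparator is a type rather than a function pointer, the call to `COMPARATOR::cmp` is resolved at compile time and the container needs no extra state to hold it.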
> > This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: > > 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle; we'd like to move away from it. Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this... Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: Missed an instance of COMPARATOR::cmp ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18289/files - new: https://git.openjdk.org/jdk/pull/18289/files/0c0bfbfa..29375550 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=42 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=41-42 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18289.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289 PR: https://git.openjdk.org/jdk/pull/18289 From jsjolen at openjdk.org Sun Apr 21 10:35:47 2024 From: jsjolen at openjdk.org (Johan Sjölen) Date: Sun, 21 Apr 2024 10:35:47 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v44] In-Reply-To: References: Message-ID: > Hi, > > This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocate memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix.
> > ## `MemoryFileTracker` > > The `MemoryFileTracker` adds the ability to add new virtual memory address spaces to NMT and to commit memory to these. The basic API is: > > ```c++ > static MemoryFile* make_device(const char* descriptive_name); > static void free_device(MemoryFile* device); > > static void allocate_memory(MemoryFile* device, size_t offset, size_t size, > MEMFLAGS flag, const NativeCallStack& stack); > static void free_memory(MemoryFile* device, size_t offset, size_t size); > ``` > > It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: > > ```c++ > void ZNMT::reserve(zaddress_unsafe start, size_t size) { > MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); > } > void ZNMT::commit(zoffset offset, size_t size) { > MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC); > } > void ZNMT::uncommit(zoffset offset, size_t size) { > MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); > } > > void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { > // NMT doesn't track mappings at the moment. > } > void ZNMT::unmap(zaddress_unsafe addr, size_t size) { > // NMT doesn't track mappings at the moment. > } > ``` > > As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory; the device-allocated memory is reported separately. When performing summary reporting, any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. > > This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: > > 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle; we'd like to move away from it.
Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this... Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: Rename to AddressComparator ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18289/files - new: https://git.openjdk.org/jdk/pull/18289/files/29375550..0175c0e6 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=43 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=42-43 Stats: 7 lines in 2 files changed: 0 ins; 0 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/18289.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289 PR: https://git.openjdk.org/jdk/pull/18289 From jsjolen at openjdk.org Sun Apr 21 11:27:35 2024 From: jsjolen at openjdk.org (Johan Sjölen) Date: Sun, 21 Apr 2024 11:27:35 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v44] In-Reply-To: References: Message-ID: On Sun, 21 Apr 2024 10:35:47 GMT, Johan Sjölen wrote: >> Hi, >> >> This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocate memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix.
>> >> ## `MemoryFileTracker` >> >> The `MemoryFileTracker` adds the ability to add new virtual memory address spaces to NMT and to commit memory to these. The basic API is: >> >> ```c++ >> static MemoryFile* make_device(const char* descriptive_name); >> static void free_device(MemoryFile* device); >> >> static void allocate_memory(MemoryFile* device, size_t offset, size_t size, >> MEMFLAGS flag, const NativeCallStack& stack); >> static void free_memory(MemoryFile* device, size_t offset, size_t size); >> ``` >> >> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: >> >> ```c++ >> void ZNMT::reserve(zaddress_unsafe start, size_t size) { >> MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); >> } >> void ZNMT::commit(zoffset offset, size_t size) { >> MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC); >> } >> void ZNMT::uncommit(zoffset offset, size_t size) { >> MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); >> } >> >> void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { >> // NMT doesn't track mappings at the moment. >> } >> void ZNMT::unmap(zaddress_unsafe addr, size_t size) { >> // NMT doesn't track mappings at the moment. >> } >> ``` >> >> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory; the device-allocated memory is reported separately. When performing summary reporting, any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. >> >> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: >> >> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle; we'd like to move away from it.
Our Treap-based approach in this patch gives a performance bo... > > Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: > > Rename to AddressComparator Windows debug builds are failing with the following linker errors: === Output from failing command(s) repeated here === * For target hotspot_variant-server_libjvm_gtest_objs_BUILD_GTEST_LIBJVM_link: Creating library d:\a\jdk\jdk\build\windows-aarch64\hotspot\variant-server\libjvm\gtest\objs\jvm.lib and object d:\a\jdk\jdk\build\windows-aarch64\hotspot\variant-server\libjvm\gtest\objs\jvm.exp jvm.exp : error LNK2001: unresolved external symbol "const GrowableArrayCHeap::`vftable'" (??_7?$GrowableArrayCHeap at UAddressState@?1??register_mapping at VMATree@@QEAA?AUSummaryDiff at 3@_K0W4StateType at 3@AEAUMetadata at 3@@Z@$0M@@@6B@) jvm.exp : error LNK2001: unresolved external symbol "const GrowableArrayView::`vftable'" (??_7?$GrowableArrayView at UAddressState@?1??register_mapping at VMATree@@QEAA?AUSummaryDiff at 3@_K0W4StateType at 3@AEAUMetadata at 3@@Z@@@6B@) jvm.exp : error LNK2001: unresolved external symbol "const GrowableArrayWithAllocator >::`vftable'" (??_7?$GrowableArrayWithAllocator at UAddressState@?1??register_mapping at VMATree@@QEAA?AUSummaryDiff at 3@_K0W4StateType at 3@AEAUMetadata at 3@@Z at V?$GrowableArrayCHeap at UAddressState@?1??register_mapping at VMATree@@QEAA?AUSummaryDiff at 3@_K0W4StateType at 3@AEAUMetadata at 3@@Z@$0M@@@@@6B@) d:\a\jdk\jdk\build\windows-aarch64\hotspot\variant-server\libjvm\gtest\jvm.dll : fatal error LNK1120: 3 unresolved externals * For target hotspot_variant-server_libjvm_objs_BUILD_LIBJVM_link: Creating library d:\a\jdk\jdk\build\windows-aarch64\hotspot\variant-server\libjvm\objs\jvm.lib and object d:\a\jdk\jdk\build\windows-aarch64\hotspot\variant-server\libjvm\objs\jvm.exp jvm.exp : error LNK2001: unresolved external symbol "const GrowableArrayCHeap::`vftable'" (??_7?$GrowableArrayCHeap at
UAddressState@?1??register_mapping at VMATree@@QEAA?AUSummaryDiff at 3@_K0W4StateType at 3@AEAUMetadata at 3@@Z@$0M@@@6B@) jvm.exp : error LNK2001: unresolved external symbol "const GrowableArrayView::`vftable'" (??_7?$GrowableArrayView at UAddressState@?1??register_mapping at VMATree@@QEAA?AUSummaryDiff at 3@_K0W4StateType at 3@AEAUMetadata at 3@@Z@@@6B@) jvm.exp : error LNK2001: unresolved external symbol "const GrowableArrayWithAllocator >::`vftable'" (??_7?$GrowableArrayWithAllocator at UAddressState@?1??register_mapping at VMATree@@QEAA?AUSummaryDiff at 3@_K0W4StateType at 3@AEAUMetadata at 3@@Z at V?$GrowableArrayCHeap at UAddressState@?1??register_mapping at VMATree@@QEAA?AUSummaryDiff at 3@_K0W4StateType at 3@AEAUMetadata at 3@@Z@$0M@@@@@6B@) d:\a\jdk\jdk\build\windows-aarch64\support\modules_libs\java.base\server\jvm.dll : fatal error LNK1120: 3 unresolved externals ------------- PR Comment: https://git.openjdk.org/jdk/pull/18289#issuecomment-2068009356 From kvn at openjdk.org Sun Apr 21 16:45:39 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Sun, 21 Apr 2024 16:45:39 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v26] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: On Sat, 20 Apr 2024 22:31:48 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. >> >> Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). >> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 37 commits: > > - Merge branch 'openjdk:master' into setMemory > - Fix UnsafeCopyMemoryMark scope issue > - Long to short jmp; other cleanup > - Review comments > - Address review comments; update copyright years > - Add enter() and leave(); remove Windows-specific register stuff > - Fix memory mark after sync to upstream > - Merge branch 'openjdk:master' into setMemory > - Set memory test (#23) > > * Even more review comments > > * Re-write of atomic copy loops > > * Change name of UnsafeCopyMemory{,Mark} to UnsafeMemory{Access,Mark} > > * Only add a memory mark for byte unaligned fill > > * Remove MUSL_LIBC ifdef > > * Remove MUSL_LIBC ifdef > - Set memory test (#22) > > * Even more review comments > > * Re-write of atomic copy loops > > * Change name of UnsafeCopyMemory{,Mark} to UnsafeMemory{Access,Mark} > > * Only add a memory mark for byte unaligned fill > - ... and 27 more: https://git.openjdk.org/jdk/compare/6d569961...1122b500 My testing passed. Good. ------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18555#pullrequestreview-2013478795 From sgibbons at openjdk.org Sun Apr 21 21:01:38 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Sun, 21 Apr 2024 21:01:38 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v26] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: On Sat, 20 Apr 2024 22:31:48 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. >> >> Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). 
>> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 37 commits: > > - Merge branch 'openjdk:master' into setMemory > - Fix UnsafeCopyMemoryMark scope issue > - Long to short jmp; other cleanup > - Review comments > - Address review comments; update copyright years > - Add enter() and leave(); remove Windows-specific register stuff > - Fix memory mark after sync to upstream > - Merge branch 'openjdk:master' into setMemory > - Set memory test (#23) > > * Even more review comments > > * Re-write of atomic copy loops > > * Change name of UnsafeCopyMemory{,Mark} to UnsafeMemory{Access,Mark} > > * Only add a memory mark for byte unaligned fill > > * Remove MUSL_LIBC ifdef > > * Remove MUSL_LIBC ifdef > - Set memory test (#22) > > * Even more review comments > > * Re-write of atomic copy loops > > * Change name of UnsafeCopyMemory{,Mark} to UnsafeMemory{Access,Mark} > > * Only add a memory mark for byte unaligned fill > - ... and 27 more: https://git.openjdk.org/jdk/compare/6d569961...1122b500 Thank you all for the reviews. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18555#issuecomment-2068196116 From jbhateja at openjdk.org Sun Apr 21 23:27:44 2024 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Sun, 21 Apr 2024 23:27:44 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v26] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: On Sat, 20 Apr 2024 22:31:48 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. >> >> Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). 
I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). >> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 37 commits: > > - Merge branch 'openjdk:master' into setMemory > - Fix UnsafeCopyMemoryMark scope issue > - Long to short jmp; other cleanup > - Review comments > - Address review comments; update copyright years > - Add enter() and leave(); remove Windows-specific register stuff > - Fix memory mark after sync to upstream > - Merge branch 'openjdk:master' into setMemory > - Set memory test (#23) > > * Even more review comments > > * Re-write of atomic copy loops > > * Change name of UnsafeCopyMemory{,Mark} to UnsafeMemory{Access,Mark} > > * Only add a memory mark for byte unaligned fill > > * Remove MUSL_LIBC ifdef > > * Remove MUSL_LIBC ifdef > - Set memory test (#22) > > * Even more review comments > > * Re-write of atomic copy loops > > * Change name of UnsafeCopyMemory{,Mark} to UnsafeMemory{Access,Mark} > > * Only add a memory mark for byte unaligned fill > - ... and 27 more: https://git.openjdk.org/jdk/compare/6d569961...1122b500 Marked as reviewed by jbhateja (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18555#pullrequestreview-2013564907 From sgibbons at openjdk.org Sun Apr 21 23:27:45 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Sun, 21 Apr 2024 23:27:45 GMT Subject: Integrated: 8329331: Intrinsify Unsafe::setMemory In-Reply-To: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: On Fri, 29 Mar 2024 22:32:06 GMT, Scott Gibbons wrote: > This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. 
See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. > > Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. > > Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). > > [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) This pull request has now been integrated. Changeset: bd67ac69 Author: Scott Gibbons Committer: Jatin Bhateja URL: https://git.openjdk.org/jdk/commit/bd67ac69a234cd1096e534c7d4a45d88715884b4 Stats: 507 lines in 36 files changed: 420 ins; 5 del; 82 mod 8329331: Intrinsify Unsafe::setMemory Reviewed-by: sviswanathan, jbhateja, kvn ------------- PR: https://git.openjdk.org/jdk/pull/18555 From fjiang at openjdk.org Mon Apr 22 02:20:51 2024 From: fjiang at openjdk.org (Feilong Jiang) Date: Mon, 22 Apr 2024 02:20:51 GMT Subject: RFR: 8330735: RISC-V: No need to move sp to tmp register in set_last_Java_frame Message-ID: Hi, please review this refactoring to remove the unnecessary move from sp to temp register. There is no restriction for riscv when using `sp` as an operand in instructions. So we do not have to move the sp register to a temp register before we store `last_java_sp`. 
Testing: - [x] Tier1-3 (linux-riscv64, release) ------------- Commit messages: - unnecessary mv from sp to tmp register Changes: https://git.openjdk.org/jdk/pull/18875/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18875&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8330735 Stats: 12 lines in 3 files changed: 0 ins; 4 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/18875.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18875/head:pull/18875 PR: https://git.openjdk.org/jdk/pull/18875 From fyang at openjdk.org Mon Apr 22 02:51:33 2024 From: fyang at openjdk.org (Fei Yang) Date: Mon, 22 Apr 2024 02:51:33 GMT Subject: RFR: 8330735: RISC-V: No need to move sp to tmp register in set_last_Java_frame In-Reply-To: References: Message-ID: <8fd33mO9lVD6h6KrzRzNeiZqz8-v8a6Fr-4LshUu2l0=.a7ebd433-e79e-4aef-ac6a-7b2718a0dbf2@github.com> On Sat, 20 Apr 2024 12:42:20 GMT, Feilong Jiang wrote: > Hi, please review this refactoring to remove the unnecessary move from sp to temp register. > > There is no restriction for riscv when using `sp` as an operand in instructions. So we do not have to move the sp register to a temp register before we store `last_java_sp`. > > Testing: > > - [x] Tier1-3 (linux-riscv64, release) Looks fine. Thanks for the cleanup! ------------- Marked as reviewed by fyang (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18875#pullrequestreview-2013678102 From dholmes at openjdk.org Mon Apr 22 05:31:27 2024 From: dholmes at openjdk.org (David Holmes) Date: Mon, 22 Apr 2024 05:31:27 GMT Subject: RFR: 8324577: [REDO] - [IMPROVE] OPEN_MAX is no longer the max limit on macOS >= 10.6 for RLIMIT_NOFILE In-Reply-To: <1rubknQG6ntQ32o_dCF64U97R3jfyiyNZOms5-_k14g=.fc79bdea-da14-40cb-a35f-1290ec7e11d7@github.com> References: <1rubknQG6ntQ32o_dCF64U97R3jfyiyNZOms5-_k14g=.fc79bdea-da14-40cb-a35f-1290ec7e11d7@github.com> Message-ID: On Wed, 17 Apr 2024 16:49:25 GMT, Gerard Ziemski wrote: > This is a 3rd attempt of the same fix: > > 1st one had to be pulled out because of a bug in zsh > 2nd one had a workaround for the bug in zsh, but then uncovered an issue in JDWP (JDK-8324668), which was subsequently fixed. > > Tested with MACH5 tier1-9 with no unique or new failures on macOS LGTM2 Thanks ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18821#pullrequestreview-2013784734 From aboldtch at openjdk.org Mon Apr 22 05:36:41 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Mon, 22 Apr 2024 05:36:41 GMT Subject: RFR: 8330253: Skip verify_consistent_lock_order when deoptimizing from monitorenter bytecode. [v5] In-Reply-To: References: Message-ID: > The verification added in [JDK-8329757](https://bugs.openjdk.org/browse/JDK-8329757) will not work when deoptimization occurs on a monitorenter bytecode. The locking may be in a transitional state. This patch will skip the verification when this occurs. > > Currently have only seen this reproduce with JVMTI when deoptimization occurs while a java thread is waiting on a contended monitor. However this could potentially be triggered from a VM entry slow path, so simply checking `current_pending_monitor` could be flaky as well. So instead simply avoid verification. > > Tested Tier 1-8 + Stress testing reproducers.
Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: Use raw bytecode read for previous bytecode ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18782/files - new: https://git.openjdk.org/jdk/pull/18782/files/10d70ea1..8d815e9f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18782&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18782&range=03-04 Stats: 6 lines in 1 file changed: 5 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18782.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18782/head:pull/18782 PR: https://git.openjdk.org/jdk/pull/18782 From aboldtch at openjdk.org Mon Apr 22 05:39:58 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Mon, 22 Apr 2024 05:39:58 GMT Subject: RFR: 8330253: Skip verify_consistent_lock_order when deoptimizing from monitorenter bytecode. [v6] In-Reply-To: References: Message-ID: > The verification added in [JDK-8329757](https://bugs.openjdk.org/browse/JDK-8329757) will not work when deoptimization occurs on a monitorenter bytecode. The locking may be in a transitional state. This patch will skip the verification when this occurs. > > Currently have only seen this reproduce with JVMTI when deoptimization occurs while a java thread is waiting on a contended monitor. However this could potentially be triggered from a VM entry slow path, so simply checking `current_pending_monitor` could be flaky as well. So instead simply avoid verification. > > Tested Tier 1-8 + Stress testing reproducers.
Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: Fix condition ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18782/files - new: https://git.openjdk.org/jdk/pull/18782/files/8d815e9f..578a8322 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18782&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18782&range=04-05 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/18782.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18782/head:pull/18782 PR: https://git.openjdk.org/jdk/pull/18782 From rehn at openjdk.org Mon Apr 22 07:09:33 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Mon, 22 Apr 2024 07:09:33 GMT Subject: RFR: 8330156: RISC-V: Range check auipc + signed 12 imm instruction [v5] In-Reply-To: References: Message-ID: On Thu, 18 Apr 2024 14:19:29 GMT, Robbin Ehn wrote: >> Hi please consider! >> >> Today we check if the distance fits in a signed 32-bit value. >> As the second instruction has a sign bit + 11 bits, the maximum reach of such a pair is shorter. >> >> Sanity tested > > Robbin Ehn has updated the pull request incrementally with two additional commits since the last revision: > > - Added comment > - Rename /back-port jdk17u-dev ------------- PR Comment: https://git.openjdk.org/jdk/pull/18755#issuecomment-2068640873 From dholmes at openjdk.org Mon Apr 22 08:00:29 2024 From: dholmes at openjdk.org (David Holmes) Date: Mon, 22 Apr 2024 08:00:29 GMT Subject: RFR: 8330578: The VM creates instance of abstract class VirtualMachineError [v5] In-Reply-To: <02nN9BVn9G0PSYzT9hfYOcFpKbsOskhalVBC1t8xGFw=.31ad8d16-efcd-426c-a67c-97d2c261e4e3@github.com> References: <02nN9BVn9G0PSYzT9hfYOcFpKbsOskhalVBC1t8xGFw=.31ad8d16-efcd-426c-a67c-97d2c261e4e3@github.com> Message-ID: On Fri, 19 Apr 2024 15:30:27 GMT, Coleen Phillimore wrote: >> It's a bug that the VM creates an instance of the abstract class VirtualMachineError.
In the cases where we throw VME, we should throw OOM or StackOverflowError instead. >> >> Tested with tier1-4. > > Coleen Phillimore has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: > > - Merge branch 'master' into vme > - Throw preallocated SOE object. > - We don't need to link and initialize VirtualMachineError class because the lines just below it that link and initialize and create an instance of StackOverflowError will do that, since VME is a subclass of SOE. > - Remove newline > - 8330578: The VM creates instance of abstract class VirtualMachineError There are a couple of uses of NPE that may be better handled by `InternalError` - so you could just replace the `_virtual_machine_error` bits with `_internal_error` instead. Thanks. src/hotspot/share/utilities/exceptions.cpp line 124: > 122: exc_value, message ? ": " : "", message ? message : "", > 123: p2i(h_exception()), file, line, p2i(thread), > 124: Universe::null_ptr_exception_instance()->print_value_string()); I think this will look odd. Throwing `InternalError` may be more appropriate. src/hotspot/share/utilities/exceptions.cpp line 127: > 125: // We do not care what kind of exception we get for a thread which > 126: // is compiling. We just install a dummy exception object > 127: thread->set_pending_exception(Universe::null_ptr_exception_instance(), file, line); Ditto - throw `InternalError`. 
------------- PR Review: https://git.openjdk.org/jdk/pull/18847#pullrequestreview-2013977402 PR Review Comment: https://git.openjdk.org/jdk/pull/18847#discussion_r1574265874 PR Review Comment: https://git.openjdk.org/jdk/pull/18847#discussion_r1574266066 From tschatzl at openjdk.org Mon Apr 22 08:12:34 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 22 Apr 2024 08:12:34 GMT Subject: RFR: 8330694: Rename 'HeapRegion' to 'G1HeapRegion' [v3] In-Reply-To: References: <3IdWn9VGEERd8v9RcH2E_LzjVo0L8nMfi5jGWmhgVuM=.6b5b3be4-bfbd-4376-9580-48d78d75665c@github.com> Message-ID: On Sat, 20 Apr 2024 08:44:45 GMT, Lei Zaakjyu wrote: >> follow up 8267941 > > Lei Zaakjyu has updated the pull request incrementally with one additional commit since the last revision: > > also tidy up Indentation issues. I will run the higher tier SA tests just for verification. There are still more classes related to `HeapRegion` that need the G1 prefix. Not entirely sure they should be changed with this change (probably not exhaustive) but it would be desirable to change them as well rather sooner than later: HeapRegionClosure HeapRegionRange HeapRegionIndexClosure HeapRegionRemSet HeapRegionSetBase HeapRegionSetChecker HeapRegionSet FreeRegionListIterator FreeRegionList src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 157: > 155: > 156: G1HeapRegion* G1CollectedHeap::new_heap_region(uint hrs_index, > 157: MemRegion mr) { Indentation src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 166: > 164: HeapRegionType type, > 165: bool do_expand, > 166: uint node_index) { Indentation src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 999: > 997: size_t aligned_expand_bytes = ReservedSpace::page_align_size_up(expand_bytes); > 998: aligned_expand_bytes = align_up(aligned_expand_bytes, > 999: G1HeapRegion::GrainBytes); Just use a single line, otherwise indent correctly. 
src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 1044: > 1042: ReservedSpace::page_align_size_down(shrink_bytes); > 1043: aligned_shrink_bytes = align_down(aligned_shrink_bytes, > 1044: G1HeapRegion::GrainBytes); Indentation/use single line. src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 2862: > 2860: HeapRegionType::Eden, > 2861: false /* do_expand */, > 2862: node_index); Indentation src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 2916: > 2914: type, > 2915: true /* do_expand */, > 2916: node_index); Indentation src/hotspot/share/gc/g1/g1CollectedHeap.hpp line 396: > 394: HeapRegionType type, > 395: bool do_expand, > 396: uint node_index = G1NUMA::AnyNodeIndex); Indentation src/hotspot/share/gc/g1/g1CollectionSetChooser.hpp line 43: > 41: public: > 42: static size_t mixed_gc_live_threshold_bytes() { > 43: return G1HeapRegion::GrainBytes * (size_t) G1MixedGCLiveThresholdPercent / 100; Suggestion: return G1HeapRegion::GrainBytes * (size_t)G1MixedGCLiveThresholdPercent / 100; Pre-existing src/hotspot/share/gc/g1/g1HeapRegionManager.cpp line 537: > 535: // committed, expand at that index. 
> 536: for (uint curr = reserved_length(); curr-- > 0;) { > 537: G1HeapRegion *hr = _regions.get_by_index(curr); Suggestion: G1HeapRegion* hr = _regions.get_by_index(curr); Pre-existing src/hotspot/share/gc/g1/g1HeapRegionManager.cpp line 805: > 803: FreeRegionList *free_list = worker_freelist(worker_id); > 804: for (uint i = start; i < end; i++) { > 805: G1HeapRegion *region = _hrm->at_or_null(i); Suggestion: G1HeapRegion* region = _hrm->at_or_null(i); src/hotspot/share/gc/g1/g1HeapRegionSet.hpp line 248: > 246: private: > 247: FreeRegionList* _list; > 248: G1HeapRegion* _curr; Suggestion: G1HeapRegion* _curr; src/hotspot/share/gc/g1/g1HeapVerifier.cpp line 201: > 199: G1CollectedHeap* _g1h; > 200: size_t _live_bytes; > 201: G1HeapRegion *_hr; Suggestion: G1HeapRegion* _hr; pre-existing src/hotspot/share/gc/g1/g1HeapVerifier.cpp line 205: > 203: > 204: public: > 205: VerifyObjsInRegionClosure(G1HeapRegion *hr, VerifyOption vo) Suggestion: VerifyObjsInRegionClosure(G1HeapRegion* hr, VerifyOption vo) src/hotspot/share/gc/g1/vmStructs_g1.hpp line 38: > 36: \ > 37: static_field(G1HeapRegion, GrainBytes, size_t) \ > 38: static_field(G1HeapRegion, LogOfHRGrainBytes, uint) \ Suggestion: static_field(G1HeapRegion, GrainBytes, size_t) \ static_field(G1HeapRegion, LogOfHRGrainBytes, uint) \ src/hotspot/share/gc/g1/vmStructs_g1.hpp line 44: > 42: nonstatic_field(G1HeapRegion, _top, HeapWord* volatile) \ > 43: nonstatic_field(G1HeapRegion, _end, HeapWord* const) \ > 44: volatile_nonstatic_field(G1HeapRegion, _pinned_object_count, size_t) \ Suggestion: nonstatic_field(G1HeapRegion, _type, HeapRegionType) \ nonstatic_field(G1HeapRegion, _bottom, HeapWord* const) \ nonstatic_field(G1HeapRegion, _top, HeapWord* volatile) \ nonstatic_field(G1HeapRegion, _end, HeapWord* const) \ volatile_nonstatic_field(G1HeapRegion, _pinned_object_count, size_t) \ src/hotspot/share/gc/g1/vmStructs_g1.hpp line 96: > 94: declare_type(G1CollectedHeap, CollectedHeap) \ > 95: \ > 96: 
declare_toplevel_type(G1HeapRegion) \ Suggestion: declare_toplevel_type(G1HeapRegion) \ src/hotspot/share/gc/g1/vmStructs_g1.hpp line 106: > 104: \ > 105: declare_toplevel_type(G1CollectedHeap*) \ > 106: declare_toplevel_type(G1HeapRegion*) \ Suggestion: declare_toplevel_type(G1HeapRegion*) \ src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/gc/g1/G1HeapRegionTable.java line 93: > 91: } > 92: > 93: private class HeapRegionIterator implements Iterator { Should probably be `G1HeapRegionIterator` src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/tools/HeapSummary.java line 90: > 88: printValMB("MaxMetaspaceSize = ", getFlagValue("MaxMetaspaceSize", flagMap)); > 89: if (heap instanceof G1CollectedHeap) { > 90: printValMB("G1HeapRegionSize = ", G1HeapRegion.grainBytes()); Suggestion: printValMB("G1HeapRegionSize = ", G1HeapRegion.grainBytes()); ------------- Changes requested by tschatzl (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18871#pullrequestreview-2013971784 PR Review: https://git.openjdk.org/jdk/pull/18871#pullrequestreview-2014068153 PR Review Comment: https://git.openjdk.org/jdk/pull/18871#discussion_r1574262994 PR Review Comment: https://git.openjdk.org/jdk/pull/18871#discussion_r1574263232 PR Review Comment: https://git.openjdk.org/jdk/pull/18871#discussion_r1574264505 PR Review Comment: https://git.openjdk.org/jdk/pull/18871#discussion_r1574264972 PR Review Comment: https://git.openjdk.org/jdk/pull/18871#discussion_r1574266207 PR Review Comment: https://git.openjdk.org/jdk/pull/18871#discussion_r1574266517 PR Review Comment: https://git.openjdk.org/jdk/pull/18871#discussion_r1574267075 PR Review Comment: https://git.openjdk.org/jdk/pull/18871#discussion_r1574271446 PR Review Comment: https://git.openjdk.org/jdk/pull/18871#discussion_r1574277670 PR Review Comment: https://git.openjdk.org/jdk/pull/18871#discussion_r1574278045 PR Review Comment: https://git.openjdk.org/jdk/pull/18871#discussion_r1574280865 PR Review Comment: 
https://git.openjdk.org/jdk/pull/18871#discussion_r1574281852 PR Review Comment: https://git.openjdk.org/jdk/pull/18871#discussion_r1574282056 PR Review Comment: https://git.openjdk.org/jdk/pull/18871#discussion_r1574293124 PR Review Comment: https://git.openjdk.org/jdk/pull/18871#discussion_r1574293556 PR Review Comment: https://git.openjdk.org/jdk/pull/18871#discussion_r1574293917 PR Review Comment: https://git.openjdk.org/jdk/pull/18871#discussion_r1574294107 PR Review Comment: https://git.openjdk.org/jdk/pull/18871#discussion_r1574295929 PR Review Comment: https://git.openjdk.org/jdk/pull/18871#discussion_r1574297949 From duke at openjdk.org Mon Apr 22 08:43:52 2024 From: duke at openjdk.org (kuaiwei) Date: Mon, 22 Apr 2024 08:43:52 GMT Subject: RFR: 8325821: [REDO] use "dmb.ishst+dmb.ishld" for release barrier [v8] In-Reply-To: References: Message-ID: > The original patch for https://bugs.openjdk.org/browse/JDK-8324186 has 2 issues: > 1 It shows a regression on some platforms, such as Apple silicon on macOS > 2 It cannot handle instruction sequences like "dmb.ishld; dmb.ishst; dmb.ishld; dmb.ishld" > > It can be fixed by: > 1 Enabling AlwaysMergeDMB by default, and disabling it only on architectures where we can see a performance improvement (N1 or N2) > 2 Checking for the special pattern and merging the subsequent dmb. > > It also fixes a bug where st/ld/dmb cannot be merged while the code buffer is expanding. I added unit tests for these. > > This patch still has an unhandled case. For instructions like "dmb.ishld; dmb.ishst; dmb.ish", it will merge the last 2 instructions and cannot merge all three, because merging all previous dmbs when emitting dmb.ish would shrink the code buffer. I think that may break some assumptions, and I think it's not a common pattern. > > - Update: > After discussion, I made a new implementation based on a finite state machine for merging instructions. A mergeable instruction stays pending in the FSM until the next unmergeable instruction.
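The "pending until the next unmergeable instruction" idea in the update above can be sketched as a toy model. This is only an illustration under stated assumptions, not the code in the pull request: `BarrierMerger`, the `Dmb` enum, the `always_merge` constructor flag (loosely standing in for AlwaysMergeDMB) and the textual instruction strings are all hypothetical, and the merging rule shown (a pending load barrier plus a pending store barrier folding into one full `dmb ish` when `always_merge` is set) is a simplification of the behaviour described in the message.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy model of barrier merging; not HotSpot code.
enum class Dmb { None, Ld, St, LdSt, Full };

class BarrierMerger {
 public:
  explicit BarrierMerger(bool always_merge) : _always_merge(always_merge) {}

  // A barrier is never emitted right away; it is folded into the pending state.
  void dmb(Dmb kind) { _pending = join(_pending, kind); }

  // Any other (unmergeable) instruction first flushes the pending barriers.
  void emit(const std::string& insn) {
    flush();
    _code.push_back(insn);
  }

  // Materialize the pending state as one or two dmb instructions.
  void flush() {
    switch (_pending) {
      case Dmb::None: break;
      case Dmb::Ld:   _code.push_back("dmb ishld"); break;
      case Dmb::St:   _code.push_back("dmb ishst"); break;
      case Dmb::Full: _code.push_back("dmb ish");   break;
      case Dmb::LdSt:
        if (_always_merge) {
          _code.push_back("dmb ish");    // one stronger barrier
        } else {
          _code.push_back("dmb ishld");  // keep the two cheaper barriers
          _code.push_back("dmb ishst");
        }
        break;
    }
    _pending = Dmb::None;
  }

  const std::vector<std::string>& code() const { return _code; }

 private:
  // State transition: combine the pending barrier with a newly requested one.
  static Dmb join(Dmb a, Dmb b) {
    if (a == Dmb::None) return b;
    if (b == Dmb::None) return a;
    if (a == Dmb::Full || b == Dmb::Full) return Dmb::Full;
    if (a == b) return a;
    return Dmb::LdSt;  // any mix of Ld, St and LdSt
  }

  bool _always_merge;
  Dmb _pending = Dmb::None;
  std::vector<std::string> _code;
};
```

In this toy model, feeding the sequence "dmb.ishld; dmb.ishst; dmb.ishld; dmb.ishld" followed by an ordinary instruction produces `dmb ishld`, `dmb ishst` and then that instruction when `always_merge` is false, and a single `dmb ish` before it when `always_merge` is true. Because nothing is written until the flush, the buffer never has to shrink.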
kuaiwei has updated the pull request incrementally with one additional commit since the last revision: Rollback change in Assembler::offset() and resolve conflict ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18467/files - new: https://git.openjdk.org/jdk/pull/18467/files/4bd183fb..1e9fb025 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18467&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18467&range=06-07 Stats: 4 lines in 2 files changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/18467.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18467/head:pull/18467 PR: https://git.openjdk.org/jdk/pull/18467 From tschatzl at openjdk.org Mon Apr 22 09:42:28 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 22 Apr 2024 09:42:28 GMT Subject: RFR: 8330463: Rename invalidate() to write_region() in ModRefBarrierSet In-Reply-To: References: Message-ID: <2QQD4CWpZdQoWX_YOB22RH0OSjL7musjPCATTcXE5XA=.1df8ab99-e54d-436a-8372-f0ab9bd7edaf@github.com> On Wed, 17 Apr 2024 06:52:48 GMT, Albert Mingkun Yang wrote: > Simple renaming of a barrier-set API. Marked as reviewed by tschatzl (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18808#pullrequestreview-2014276908 From eosterlund at openjdk.org Mon Apr 22 09:42:42 2024 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Mon, 22 Apr 2024 09:42:42 GMT Subject: RFR: 8329088: Stack chunk thawing races with concurrent GC stack iteration [v4] In-Reply-To: References: Message-ID: > When we thaw the last frame from a stack chunk, we non-atomically set the stack pointer (sp), and set its argsize to 0. Unfortunately, GC threads may iterate over the frames of the stack chunk concurrently. When initializing their stack frame iterator, they read the sp and argsize racingly. 
Since there is no synchronization between the threads, we may observe inconsistent pairs of sp and argsize, for example the updated sp with a stale argsize, or the updated argsize with a stale sp. > > At the core of the problem, the stack chunks define sp and argsize. The argsize is used to calculate where the bottom of the stack chunk is, which is required to determine if it is empty or not. This patch proposes to switch things around and store the bottom directly in the chunk, instead of argsize. Instead, argsize is calculated from the bottom. By changing the relationship of which property is stored and which property is calculated, we can simplify this code quite a bit. > > In the new model, is_empty() is true iff sp and bottom are exactly the same. Bottom is only set during freezing, never during thawing. The bottom is initialized whenever the bottom frame is frozen, and left untouched during thawing. Unlike thawing, the freeze operation does not race with the GC by design. Hence we have moved one of the racy mutations to the operation that doesn't race with the GC. The GC is now only exposed to changing sp(). It doesn't matter if it observes the old or new sp(), now that we have removed the only source of inconsistency describing said frame (racing argsize).
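The stored-versus-derived swap described above can be made concrete with a small self-contained sketch. It is not the HotSpot stack chunk code: `ToyChunk`, its field names and the `metadata_words` constant are hypothetical, and the formula relating bottom and argsize is an assumption made for illustration. What it does show is the shape of the argument: thawing writes only `sp`, so a concurrent reader decides emptiness from a single load of `sp` compared against a `bottom` that thawing never touches.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical toy model of a stack chunk; not the HotSpot data structure.
struct ToyChunk {
  static constexpr int metadata_words = 2;  // assumed value, for illustration only

  int stack_size = 0;       // total words in the chunk
  int bottom = 0;           // stored directly; written only while freezing
  std::atomic<int> sp{0};   // the only field that thawing changes

  // Freeze: record where the bottom frame ends, then where the top frame starts.
  // Freezing is assumed not to run concurrently with readers.
  void freeze(int size, int argsize, int frames_words) {
    stack_size = size;
    bottom = size - argsize - metadata_words;
    sp.store(bottom - frames_words, std::memory_order_release);
  }

  // Thaw everything: only sp moves, bottom is left untouched.
  void thaw_all() { sp.store(bottom, std::memory_order_release); }

  // Empty iff sp and bottom are exactly the same; one load of sp suffices.
  bool is_empty() const { return sp.load(std::memory_order_acquire) == bottom; }

  // argsize is derived from bottom and is only meaningful for a non-empty chunk.
  int argsize() const {
    assert(!is_empty());
    return stack_size - bottom - metadata_words;
  }
};
```

Whether a reader observes the old or the new `sp`, it pairs that value with the same `bottom`, so there is no second field that could be stale relative to the first.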
> > Testing: tier1-5, manual testing of test/jdk/jdk/internal/vm/Continuation Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: Patricio comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18643/files - new: https://git.openjdk.org/jdk/pull/18643/files/170b3184..17b155d2 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18643&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18643&range=02-03 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/18643.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18643/head:pull/18643 PR: https://git.openjdk.org/jdk/pull/18643 From eosterlund at openjdk.org Mon Apr 22 09:42:42 2024 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Mon, 22 Apr 2024 09:42:42 GMT Subject: RFR: 8329088: Stack chunk thawing races with concurrent GC stack iteration [v2] In-Reply-To: References: <-GWGR7FPUMDBs4qvaDfnDR6Jfq9QKsQxNorjln_n-Ns=.16653cde-4cc5-4978-a385-41ebcc8e49c2@github.com> Message-ID: On Mon, 8 Apr 2024 15:01:01 GMT, Patricio Chilano Mateo wrote: >>> In the new model, is_empty() is true iff sp and bottom are exactly the same. Bottom is only set during freezing, never during thawing. The bottom is initialized whenever the bottom frame is frozen, and left untouched during thawing. Unlike thawing, the freeze operation does not race with the GC by design. Hence we have moved one of the racy mutations to the operation that doesn't race with the GC. The GC is now only exposed to changing sp(). It doesn't matter if it observes the old or new sp(), now that we have removed the only source of inconsistency describing said frame (racing argsize). >>> >> So if the race happens only when resetting the stackChunk values when thawing the last frame, wouldn't it be enough to avoid clearing the argsize there?
Because if we read the new sp when creating the stack frame iterator, regardless of the argsize value read, is_done() will be true so we won't iterate any frame. I'm trying to understand if the new model is needed to fix the race or that is part of a cleanup/refactoring. > >> Unlike thawing, the freeze operation does not race with the GC by design. >> > Is this with the changes in the allocation code in this patch or even before those there was no race? Thanks for the detailed review @pchilano! I made the suggested updates. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18643#issuecomment-2068944427 From eosterlund at openjdk.org Mon Apr 22 09:42:42 2024 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Mon, 22 Apr 2024 09:42:42 GMT Subject: RFR: 8329088: Stack chunk thawing races with concurrent GC stack iteration [v2] In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 00:34:17 GMT, Patricio Chilano Mateo wrote: >> I currently have an assert that checks that you shouldn't be asking for the argsize() if the chunk is empty, because it is so error prone. I think I'd like to keep the assert though - it was quite useful. > > We should be okay since _cont.argsize gets it from the ContinuationEntry. I tested it a bit and we would also need to update _fast_freeze_size to be cont_size() in the chunk empty case before calling freeze_fast_copy() otherwise we hit an assert there. But I can do this in another RFR if you want. That would be great, thank you! 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18643#discussion_r1574450463 From eosterlund at openjdk.org Mon Apr 22 09:42:42 2024 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Mon, 22 Apr 2024 09:42:42 GMT Subject: RFR: 8329088: Stack chunk thawing races with concurrent GC stack iteration [v3] In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 00:40:29 GMT, Patricio Chilano Mateo wrote: >> Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: >> >> Partricio fixes > > src/hotspot/share/runtime/continuationFreezeThaw.cpp line 624: > >> 622: freeze_fast_copy(chunk, chunk_start_sp CONT_JFR_ONLY(COMMA false)); >> 623: } else { // the chunk is empty >> 624: const int chunk_start_sp = chunk->stack_size() - frame::metadata_words_at_top; > > Do we need the minus frame::metadata_words_at_top? Since the chunk is empty I would expect the chunk_start_sp to be the same as for the new chunk case which is just chunk->stack_size(). I think you are right; I changed it to chunk->stack_size() as you propose, and adjusted the bottom to remove the metadata part instead. I think this looks nicer. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18643#discussion_r1574452267 From ayang at openjdk.org Mon Apr 22 10:08:32 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 22 Apr 2024 10:08:32 GMT Subject: RFR: 8330463: Rename invalidate() to write_region() in ModRefBarrierSet In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 06:52:48 GMT, Albert Mingkun Yang wrote: > Simple renaming of a barrier-set API. Thanks for the review.
------------- PR Comment: https://git.openjdk.org/jdk/pull/18808#issuecomment-2068994224 From ayang at openjdk.org Mon Apr 22 10:08:32 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 22 Apr 2024 10:08:32 GMT Subject: Integrated: 8330463: Rename invalidate() to write_region() in ModRefBarrierSet In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 06:52:48 GMT, Albert Mingkun Yang wrote: > Simple renaming of a barrier-set API. This pull request has now been integrated. Changeset: f889797e Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/f889797e1fa6bc3824d97912643a33696d367af3 Stats: 18 lines in 7 files changed: 0 ins; 6 del; 12 mod 8330463: Rename invalidate() to write_region() in ModRefBarrierSet Reviewed-by: gli, tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/18808 From duke at openjdk.org Mon Apr 22 11:13:41 2024 From: duke at openjdk.org (kuaiwei) Date: Mon, 22 Apr 2024 11:13:41 GMT Subject: RFR: 8325821: [REDO] use "dmb.ishst+dmb.ishld" for release barrier [v9] In-Reply-To: References: Message-ID: > The original patch for https://bugs.openjdk.org/browse/JDK-8324186 has 2 issues: > 1 It shows a regression on some platforms, such as Apple silicon on macOS > 2 It cannot handle instruction sequences like "dmb.ishld; dmb.ishst; dmb.ishld; dmb.ishld" > > It can be fixed by: > 1 Enabling AlwaysMergeDMB by default, and disabling it only on architectures where we can see a performance improvement (N1 or N2) > 2 Checking for the special pattern and merging the subsequent dmb. > > It also fixes a bug where st/ld/dmb cannot be merged while the code buffer is expanding. I added unit tests for these. > > This patch still has an unhandled case. For instructions like "dmb.ishld; dmb.ishst; dmb.ish", it will merge the last 2 instructions and cannot merge all three, because merging all previous dmbs when emitting dmb.ish would shrink the code buffer. I think that may break some assumptions, and I think it's not a common pattern.
> > - Update: > After discussion, I made a new implementation based on a finite state machine for merging instructions. A mergeable instruction stays pending in the FSM until the next unmergeable instruction. kuaiwei has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains ten commits: - Merge master - Rollback change in Assembler::offset() and resolve conflict - Fix arm build error - Cleanup unused _last_label_code - Simplify code - Fix cross build error - Move fsm to CodeBuffer - Add fsm for merging - 8328876: Rework [AArch64] Use "dmb.ishst + dmb.ishld" for release barrier ------------- Changes: https://git.openjdk.org/jdk/pull/18467/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18467&range=08 Stats: 622 lines in 21 files changed: 590 ins; 5 del; 27 mod Patch: https://git.openjdk.org/jdk/pull/18467.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18467/head:pull/18467 PR: https://git.openjdk.org/jdk/pull/18467 From duke at openjdk.org Mon Apr 22 11:18:29 2024 From: duke at openjdk.org (kuaiwei) Date: Mon, 22 Apr 2024 11:18:29 GMT Subject: RFR: 8325821: [REDO] use "dmb.ishst+dmb.ishld" for release barrier [v7] In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 02:08:58 GMT, kuaiwei wrote: > > Argh, I found it. It happens because C2 calls `masm->offset()` from `PhaseOutput::fill_buffer()` after every node is emitted. So that trick isn't going to work. > > It was worth a try, but given that C2 expects offset() to be correct after every node, I think we're stuck. Maybe the last idea you had is the best possible without C2 tinkering. > > Got it. I will check whether we can make offset() work with fill_buffer. Otherwise I will roll back the change to offset(). I rolled back the offset() change.
------------- PR Comment: https://git.openjdk.org/jdk/pull/18467#issuecomment-2069129064 From coleenp at openjdk.org Mon Apr 22 11:33:30 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 22 Apr 2024 11:33:30 GMT Subject: RFR: 8330578: The VM creates instance of abstract class VirtualMachineError [v5] In-Reply-To: References: <02nN9BVn9G0PSYzT9hfYOcFpKbsOskhalVBC1t8xGFw=.31ad8d16-efcd-426c-a67c-97d2c261e4e3@github.com> Message-ID: On Mon, 22 Apr 2024 07:35:06 GMT, David Holmes wrote: >> Coleen Phillimore has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: >> >> - Merge branch 'master' into vme >> - Throw preallocated SOE object. >> - We don't need to link and initialize VirtualMachineError class because the lines just below it that link and initialize and create an instance of StackOverflowError will do that, since VME is a subclass of SOE. >> - Remove newline >> - 8330578: The VM creates instance of abstract class VirtualMachineError > > src/hotspot/share/utilities/exceptions.cpp line 124: > >> 122: exc_value, message ? ": " : "", message ? message : "", >> 123: p2i(h_exception()), file, line, p2i(thread), >> 124: Universe::null_ptr_exception_instance()->print_value_string()); > > I think this will look odd. Throwing `InternalError` may be more appropriate. Unfortunately, we don't pre-allocate an instance of InternalError. // We do not care what kind of exception we get for a thread which // is compiling. 
We just install a dummy exception object thread->set_pending_exception(Universe::null_ptr_exception_instance(), file, line); ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18847#discussion_r1574601083 From coleenp at openjdk.org Mon Apr 22 11:39:28 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 22 Apr 2024 11:39:28 GMT Subject: RFR: 8330578: The VM creates instance of abstract class VirtualMachineError [v5] In-Reply-To: References: <02nN9BVn9G0PSYzT9hfYOcFpKbsOskhalVBC1t8xGFw=.31ad8d16-efcd-426c-a67c-97d2c261e4e3@github.com> Message-ID: On Mon, 22 Apr 2024 11:30:42 GMT, Coleen Phillimore wrote: >> src/hotspot/share/utilities/exceptions.cpp line 124: >> >>> 122: exc_value, message ? ": " : "", message ? message : "", >>> 123: p2i(h_exception()), file, line, p2i(thread), >>> 124: Universe::null_ptr_exception_instance()->print_value_string()); >> >> I think this will look odd. Throwing `InternalError` may be more appropriate. > > Unfortunately, we don't pre-allocate an instance of InternalError. > > // We do not care what kind of exception we get for a thread which > // is compiling. We just install a dummy exception object > thread->set_pending_exception(Universe::null_ptr_exception_instance(), file, line); I have to put all this code back to create, store for CDS and use InternalError. 
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18847#discussion_r1574605176

From aph at openjdk.org Mon Apr 22 12:32:28 2024
From: aph at openjdk.org (Andrew Haley)
Date: Mon, 22 Apr 2024 12:32:28 GMT
Subject: RFR: 8322770: Implement C2 VectorizedHashCode on AArch64
In-Reply-To: <2VKOC-rT0vOyMcXUX2gs3sOrbZ5H79KBIo50sOOVmyI=.1936f78e-794c-4f54-af3c-b1b97e5fafa8@github.com>
References: <2VKOC-rT0vOyMcXUX2gs3sOrbZ5H79KBIo50sOOVmyI=.1936f78e-794c-4f54-af3c-b1b97e5fafa8@github.com>
Message-ID:

On Tue, 26 Mar 2024 13:59:12 GMT, Mikhail Ablakatov wrote:

> Hello,
>
> Please review the following PR for [JDK-8322770 Implement C2 VectorizedHashCode on AArch64](https://bugs.openjdk.org/browse/JDK-8322770). It follows previous work done in https://github.com/openjdk/jdk/pull/16629 and https://github.com/openjdk/jdk/pull/10847 for RISC-V and x86 respectively.
>
> The code to calculate a hash code consists of two parts: a vectorized loop of Neon instructions that processes 4 or 8 elements per iteration depending on the data type, and a fully unrolled scalar "loop" that processes up to 7 tail elements.
>
> At the time of writing this I don't see potential benefits from providing an SVE/SVE2 implementation, but it could be added as a follow-up or independently later if required.
>
> # Performance
>
> ## Neoverse N1
>
>
> --------------------------------------------------------------------------------------------
> Version                                               Baseline          This patch
> --------------------------------------------------------------------------------------------
> Benchmark                 (size)  Mode  Cnt      Score   Error     Score     Error  Units
> --------------------------------------------------------------------------------------------
> ArraysHashCode.bytes           1  avgt   15      1.249 ± 0.060     1.247 ±   0.062  ns/op
> ArraysHashCode.bytes          10  avgt   15      8.754 ± 0.028     4.387 ±   0.015  ns/op
> ArraysHashCode.bytes         100  avgt   15     98.596 ± 0.051    26.655 ±   0.097  ns/op
> ArraysHashCode.bytes       10000  avgt   15  10150.578 ± 1.352  2649.962 ± 216.744  ns/op
> ArraysHashCode.chars           1  avgt   15      1.286 ± 0.062     1.246 ±   0.054  ns/op
> ArraysHashCode.chars          10  avgt   15      8.731 ± 0.002     5.344 ±   0.003  ns/op
> ArraysHashCode.chars         100  avgt   15     98.632 ± 0.048    23.023 ±   0.142  ns/op
> ArraysHashCode.chars       10000  avgt   15  10150.658 ± 3.374  2410.504 ±   8.872  ns/op
> ArraysHashCode.ints            1  avgt   15      1.189 ± 0.005     1.187 ±   0.001  ns/op
> ArraysHashCode.ints           10  avgt   15      8.730 ± 0.002     5.676 ±   0.001  ns/op
> ArraysHashCode.ints          100  avgt   15     98.559 ± 0.016    24.378 ±   0.006  ns/op
> ArraysHashCode.ints        10000  avgt   15  10148.752 ± 1.336  2419.015 ±   0.492  ns/op
> ArraysHashCode.multibytes      1  avgt   15      1.037 ± 0.001     1.037 ±   0.001  ns/op
> ArraysHashCode.multibytes     10  avgt   15      5.4...

You only need one load, add, and multiply per iteration.
You don't need to add across columns until the end.

This is an example of how to do it. The full thing is at https://gist.github.com/theRealAph/cbc85299d6cd24101d46a06c12a97ce6.

    public static int vectorizedHashCode(int result, int[] a, int fromIndex, int length) {
        if (length < WIDTH) {
            return hashCode(result, a, fromIndex, length);
        }
        int offset = fromIndex;
        int[] sum = new int[WIDTH];
        sum[WIDTH - 1] = result;
        int[] temp = new int[WIDTH];
        int remaining = length;
        // Bulk loop: one multiply, one load and one add per iteration.
        while (remaining >= WIDTH * 2) {
            vmult(sum, sum, n31powerWIDTH);
            vload(temp, a, offset);
            vadd(sum, sum, temp);
            offset += WIDTH;
            remaining -= WIDTH;
        }
        // Last full block, then weight each lane by its power of 31.
        vmult(sum, sum, n31powerWIDTH);
        vload(temp, a, offset);
        vadd(sum, sum, temp);
        vmult(sum, sum, n31powersToWIDTH);
        offset += WIDTH;
        remaining -= WIDTH;
        // Add across lanes only once, at the end.
        result = vadd(sum);
        // offset is already an absolute index (it started at fromIndex).
        return hashCode(result, a, offset, remaining);
    }

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18487#issuecomment-2069274761

From aph at openjdk.org Mon Apr 22 12:42:34 2024
From: aph at openjdk.org (Andrew Haley)
Date: Mon, 22 Apr 2024 12:42:34 GMT
Subject: RFR: 8322770: Implement C2 VectorizedHashCode on AArch64
In-Reply-To: <2VKOC-rT0vOyMcXUX2gs3sOrbZ5H79KBIo50sOOVmyI=.1936f78e-794c-4f54-af3c-b1b97e5fafa8@github.com>
References:
 <2VKOC-rT0vOyMcXUX2gs3sOrbZ5H79KBIo50sOOVmyI=.1936f78e-794c-4f54-af3c-b1b97e5fafa8@github.com>
Message-ID: <8fxyjDemTASbl-nmYr9xJrftsv1XOCc4BvtsNdwCs7E=.8e360802-c753-4f81-80e8-50692951f908@github.com>

On Tue, 26 Mar 2024 13:59:12 GMT, Mikhail Ablakatov wrote:

> Hello,
>
> Please review the following PR for [JDK-8322770 Implement C2 VectorizedHashCode on AArch64](https://bugs.openjdk.org/browse/JDK-8322770). It follows previous work done in https://github.com/openjdk/jdk/pull/16629 and https://github.com/openjdk/jdk/pull/10847 for RISC-V and x86 respectively.
>
> The code to calculate a hash code consists of two parts: a vectorized loop of Neon instructions that processes 4 or 8 elements per iteration depending on the data type, and a fully unrolled scalar "loop" that processes up to 7 tail elements.
>
> At the time of writing this I don't see potential benefits from providing an SVE/SVE2 implementation, but it could be added as a follow-up or independently later if required.
>
> # Performance
>
> ## Neoverse N1
>
>
> --------------------------------------------------------------------------------------------
> Version                                               Baseline          This patch
> --------------------------------------------------------------------------------------------
> Benchmark                 (size)  Mode  Cnt      Score   Error     Score     Error  Units
> --------------------------------------------------------------------------------------------
> ArraysHashCode.bytes           1  avgt   15      1.249 ± 0.060     1.247 ±   0.062  ns/op
> ArraysHashCode.bytes          10  avgt   15      8.754 ± 0.028     4.387 ±   0.015  ns/op
> ArraysHashCode.bytes         100  avgt   15     98.596 ± 0.051    26.655 ±   0.097  ns/op
> ArraysHashCode.bytes       10000  avgt   15  10150.578 ± 1.352  2649.962 ± 216.744  ns/op
> ArraysHashCode.chars           1  avgt   15      1.286 ± 0.062     1.246 ±   0.054  ns/op
> ArraysHashCode.chars          10  avgt   15      8.731 ± 0.002     5.344 ±   0.003  ns/op
> ArraysHashCode.chars         100  avgt   15     98.632 ± 0.048    23.023 ±   0.142  ns/op
> ArraysHashCode.chars       10000  avgt   15  10150.658 ± 3.374  2410.504 ±   8.872  ns/op
> ArraysHashCode.ints            1  avgt   15      1.189 ± 0.005     1.187 ±   0.001  ns/op
> ArraysHashCode.ints           10  avgt   15      8.730 ± 0.002     5.676 ±   0.001  ns/op
> ArraysHashCode.ints          100  avgt   15     98.559 ± 0.016    24.378 ±   0.006  ns/op
> ArraysHashCode.ints        10000  avgt   15  10148.752 ± 1.336  2419.015 ±   0.492  ns/op
> ArraysHashCode.multibytes      1  avgt   15      1.037 ± 0.001     1.037 ±   0.001  ns/op
> ArraysHashCode.multibytes     10  avgt   15      5.4...

In addition, doing only one vector per iteration is very wasteful. A high-performance AArch64 implementation can issue four multiply-accumulate vector instructions _per cycle_, with a 3-clock latency. By only issuing a single multiply-accumulate per iteration you're leaving a lot of performance on the table. I'd try to make the bulk width 16, and measure from there.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18487#issuecomment-2069294467

From aph at openjdk.org Mon Apr 22 12:53:34 2024
From: aph at openjdk.org (Andrew Haley)
Date: Mon, 22 Apr 2024 12:53:34 GMT
Subject: RFR: 8322770: Implement C2 VectorizedHashCode on AArch64
In-Reply-To: <2VKOC-rT0vOyMcXUX2gs3sOrbZ5H79KBIo50sOOVmyI=.1936f78e-794c-4f54-af3c-b1b97e5fafa8@github.com>
References: <2VKOC-rT0vOyMcXUX2gs3sOrbZ5H79KBIo50sOOVmyI=.1936f78e-794c-4f54-af3c-b1b97e5fafa8@github.com>
Message-ID: <2stGKhgwZPG0HXj65IZioZBlOud2FMcTqGe89_ggCzs=.088f733d-f156-4178-8020-0b7b84c8764d@github.com>

On Tue, 26 Mar 2024 13:59:12 GMT, Mikhail Ablakatov wrote:

> Hello,
>
> Please review the following PR for [JDK-8322770 Implement C2 VectorizedHashCode on AArch64](https://bugs.openjdk.org/browse/JDK-8322770). It follows previous work done in https://github.com/openjdk/jdk/pull/16629 and https://github.com/openjdk/jdk/pull/10847 for RISC-V and x86 respectively.
>
> The code to calculate a hash code consists of two parts: a vectorized loop of Neon instructions that processes 4 or 8 elements per iteration depending on the data type, and a fully unrolled scalar "loop" that processes up to 7 tail elements.
>
> At the time of writing this I don't see potential benefits from providing an SVE/SVE2 implementation, but it could be added as a follow-up or independently later if required.
>
> # Performance
>
> ## Neoverse N1
>
>
> --------------------------------------------------------------------------------------------
> Version                                               Baseline          This patch
> --------------------------------------------------------------------------------------------
> Benchmark                 (size)  Mode  Cnt      Score   Error     Score     Error  Units
> --------------------------------------------------------------------------------------------
> ArraysHashCode.bytes           1  avgt   15      1.249 ± 0.060     1.247 ±   0.062  ns/op
> ArraysHashCode.bytes          10  avgt   15      8.754 ± 0.028     4.387 ±   0.015  ns/op
> ArraysHashCode.bytes         100  avgt   15     98.596 ± 0.051    26.655 ±   0.097  ns/op
> ArraysHashCode.bytes       10000  avgt   15  10150.578 ± 1.352  2649.962 ± 216.744  ns/op
> ArraysHashCode.chars           1  avgt   15      1.286 ± 0.062     1.246 ±   0.054  ns/op
> ArraysHashCode.chars          10  avgt   15      8.731 ± 0.002     5.344 ±   0.003  ns/op
> ArraysHashCode.chars         100  avgt   15     98.632 ± 0.048    23.023 ±   0.142  ns/op
> ArraysHashCode.chars       10000  avgt   15  10150.658 ± 3.374  2410.504 ±   8.872  ns/op
> ArraysHashCode.ints            1  avgt   15      1.189 ± 0.005     1.187 ±   0.001  ns/op
> ArraysHashCode.ints           10  avgt   15      8.730 ± 0.002     5.676 ±   0.001  ns/op
> ArraysHashCode.ints          100  avgt   15     98.559 ± 0.016    24.378 ±   0.006  ns/op
> ArraysHashCode.ints        10000  avgt   15  10148.752 ± 1.336  2419.015 ±   0.492  ns/op
> ArraysHashCode.multibytes      1  avgt   15      1.037 ± 0.001     1.037 ±   0.001  ns/op
> ArraysHashCode.multibytes     10  avgt   15      5.4...

You only need one load, add, and multiply per iteration.
You don't need to add across columns until the end.

This is an example of how to do it. The full thing is at https://gist.github.com/theRealAph/cbc85299d6cd24101d46a06c12a97ce6.

    public static int vectorizedHashCode(int result, int[] a, int fromIndex, int length) {
        if (length < WIDTH) {
            return hashCode(result, a, fromIndex, length);
        }
        int offset = fromIndex;
        int[] sum = new int[WIDTH];
        sum[WIDTH - 1] = result;
        int[] temp = new int[WIDTH];
        int remaining = length;
        // Bulk loop: one multiply, one load and one add per iteration.
        while (remaining >= WIDTH * 2) {
            vmult(sum, sum, n31powerWIDTH);
            vload(temp, a, offset);
            vadd(sum, sum, temp);
            offset += WIDTH;
            remaining -= WIDTH;
        }
        // Last full block, then weight each lane by its power of 31.
        vmult(sum, sum, n31powerWIDTH);
        vload(temp, a, offset);
        vadd(sum, sum, temp);
        vmult(sum, sum, n31powersToWIDTH);
        offset += WIDTH;
        remaining -= WIDTH;
        // Add across lanes only once, at the end.
        result = vadd(sum);
        // offset is already an absolute index (it started at fromIndex).
        return hashCode(result, a, offset, remaining);
    }

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18487#issuecomment-2069318463

From jsjolen at openjdk.org Mon Apr 22 12:55:36 2024
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Mon, 22 Apr 2024 12:55:36 GMT
Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v13]
In-Reply-To:
References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com>
Message-ID:

On Fri, 19 Apr 2024 09:49:33 GMT, Afshin Zafari wrote:

>> `MEMFLAGS flag` is used to hold/show the type of the memory regions in NMT. Each call of NMT API requires a search through the list of memory regions.
>> The Hotspot code reserves/commits/uncommits memory regions and later calls explicitly NMT API with a specific memory type (e.g., `mtGC`, `mtJavaHeap`) for that region. Therefore, there are two searches in the list of regions per reserve/commit/uncommit operation, one for the operation and another for setting the type of the region.
>> When the memory type is passed in during reserve/commit/uncommit operations, NMT can use it and avoid the extra search for setting the memory type.
>>
>> Tests: tiers1-5 passed on linux-x64, macosx-aarch64 and windows-x64 for debug and non-debug builds.
>
> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision:
>
>   removed extra blank line.

A couple of questions, looks good overall.

src/hotspot/os/windows/os_windows.cpp line 5108:

> 5106:
> 5107:   base = (char*) virtualAlloc(addr, bytes, MEM_COMMIT | MEM_RESERVE,
> 5108:                               PAGE_READWRITE);

Why is this removed?

src/hotspot/share/memory/virtualspace.cpp line 45:

> 43: // Dummy constructor
> 44: ReservedSpace::ReservedSpace() : _base(nullptr), _size(0), _noaccess_prefix(0),
> 45:     _alignment(0), _fd_for_heap(-1), _special(false), _executable(false), _nmt_flag(mtNone) {

Isn't just `_flag` or `_memflag` sufficient as a name for `ReservedSpace`? We don't use `nmt_flag` anywhere else in the codebase.

-------------

PR Review: https://git.openjdk.org/jdk/pull/18745#pullrequestreview-2014652979
PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1574692818
PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1574703805

From ayang at openjdk.org Mon Apr 22 13:47:52 2024
From: ayang at openjdk.org (Albert Mingkun Yang)
Date: Mon, 22 Apr 2024 13:47:52 GMT
Subject: RFR: 8330822: Remove ModRefBarrierSet::write_ref_array_work
Message-ID:

Simply merging a protected API into another method.
------------- Commit messages: - remove-write-ref-array-work Changes: https://git.openjdk.org/jdk/pull/18887/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18887&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8330822 Stats: 16 lines in 6 files changed: 0 ins; 15 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18887.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18887/head:pull/18887 PR: https://git.openjdk.org/jdk/pull/18887 From sgibbons at openjdk.org Mon Apr 22 13:54:51 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Mon, 22 Apr 2024 13:54:51 GMT Subject: RFR: 8330821: Rename UnsafeCopyMemory Message-ID: Renaming UnsafeCopyMemory to UnsafeMemoryAccess since this class is now being used for Unsafe::setMemory. This is a pure rename only. ------------- Commit messages: - Rename UnsafeCopyMemory to UnsafeMemoryAccess Changes: https://git.openjdk.org/jdk/pull/18889/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18889&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8330821 Stats: 159 lines in 18 files changed: 0 ins; 0 del; 159 mod Patch: https://git.openjdk.org/jdk/pull/18889.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18889/head:pull/18889 PR: https://git.openjdk.org/jdk/pull/18889 From jkratochvil at openjdk.org Mon Apr 22 13:59:30 2024 From: jkratochvil at openjdk.org (Jan Kratochvil) Date: Mon, 22 Apr 2024 13:59:30 GMT Subject: RFR: 8261242: [Linux] OSContainer::is_containerized() returns true when run outside a container In-Reply-To: References: Message-ID: On Fri, 19 Apr 2024 05:18:51 GMT, Laurence Cable wrote: > I think (I am agreeing with you Severin) that the goal of the heuristic is to inform the JVM (and any associated serviceability tools) that the JVM is in a resource constrained/managed execution context... "resource constrained" (my patch) vs. "managed" (this patch) is the difference of the two patches being discussed. 
Anyway in this patch one could unify naming across variables/parameters, the same value is called `_is_ro`, `is_read_only`, `ro_opt`, `read_only`, `ro`.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18201#issuecomment-2069537759

From sgibbons at openjdk.org Mon Apr 22 14:16:06 2024
From: sgibbons at openjdk.org (Scott Gibbons)
Date: Mon, 22 Apr 2024 14:16:06 GMT
Subject: RFR: 8320448: Accelerate IndexOf using AVX2 [v16]
In-Reply-To:
References:
Message-ID: <6UnFG26aCrqCe5egk5hKsogxeOBNNdUuHfGveP82n_4=.7be96c8b-a2b9-4bdd-90f1-65ff60b5ae7f@github.com>

> Re-write the IndexOf code without the use of the pcmpestri instruction, only using AVX2 instructions. This change accelerates String.IndexOf on average 1.3x for AVX2. The benchmark numbers:
>
>
> Benchmark                                   Score     Latest
> StringIndexOf.advancedWithMediumSub       343.573    317.934  0.925375393x
> StringIndexOf.advancedWithShortSub1      1039.081    1053.96  1.014319384x
> StringIndexOf.advancedWithShortSub2        55.828    110.541  1.980027943x
> StringIndexOf.constantPattern               9.361     11.906  1.271872663x
> StringIndexOf.searchCharLongSuccess         4.216      4.218  1.000474383x
> StringIndexOf.searchCharMediumSuccess       3.133      3.216  1.02649218x
> StringIndexOf.searchCharShortSuccess         3.76      3.761  1.000265957x
> StringIndexOf.success                       9.186      9.713  1.057369911x
> StringIndexOf.successBig                   14.341     46.343  3.231504079x
> StringIndexOfChar.latin1_AVX2_String     6220.918   12154.52  1.953814533x
> StringIndexOfChar.latin1_AVX2_char       5503.556   5540.044  1.006629895x
> StringIndexOfChar.latin1_SSE4_String     6978.854   6818.689  0.977049957x
> StringIndexOfChar.latin1_SSE4_char       5657.499   5474.624  0.967675646x
> StringIndexOfChar.latin1_Short_String    7132.541   6863.359  0.962260014x
> StringIndexOfChar.latin1_Short_char     16013.389  16162.437  1.009307711x
> StringIndexOfChar.latin1_mixed_String    7386.123  14771.622  1.999915517x
> StringIndexOfChar.latin1_mixed_char      9901.671   9782.245  0.987938803

Scott Gibbons has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains 48 commits: - Merge branch 'openjdk:master' into indexof - Remove infinite loop (used for debugging) - Merge branch 'openjdk:master' into indexof - Cleaned up, ready for review - Pre-cleanup code - Add JMH. Add 16-byte compares to arrays_equals - Better method for mask creation - Merge branch 'openjdk:master' into indexof - Most cleanup done. - Remove header dependency - ... and 38 more: https://git.openjdk.org/jdk/compare/3e65d90b...8e0ce70a ------------- Changes: https://git.openjdk.org/jdk/pull/16753/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16753&range=15 Stats: 4903 lines in 19 files changed: 4549 ins; 241 del; 113 mod Patch: https://git.openjdk.org/jdk/pull/16753.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16753/head:pull/16753 PR: https://git.openjdk.org/jdk/pull/16753 From coleenp at openjdk.org Mon Apr 22 14:40:44 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 22 Apr 2024 14:40:44 GMT Subject: RFR: 8330578: The VM creates instance of abstract class VirtualMachineError [v6] In-Reply-To: References: Message-ID: > It's a bug that the VM creates an instance of the abstract class VirtualMachineError. In the cases where we throw VME, we should throw OOM or StackOverflowError instead. > > Tested with tier1-4. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Add to comment why NPE is installed in special_exception. 
-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/18847/files
  - new: https://git.openjdk.org/jdk/pull/18847/files/43805500..e4e1f8d3

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=18847&range=05
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18847&range=04-05

  Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/18847.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18847/head:pull/18847

PR: https://git.openjdk.org/jdk/pull/18847

From duke at openjdk.org Mon Apr 22 14:45:29 2024
From: duke at openjdk.org (Mikhail Ablakatov)
Date: Mon, 22 Apr 2024 14:45:29 GMT
Subject: RFR: 8322770: Implement C2 VectorizedHashCode on AArch64
In-Reply-To: <2stGKhgwZPG0HXj65IZioZBlOud2FMcTqGe89_ggCzs=.088f733d-f156-4178-8020-0b7b84c8764d@github.com>
References: <2VKOC-rT0vOyMcXUX2gs3sOrbZ5H79KBIo50sOOVmyI=.1936f78e-794c-4f54-af3c-b1b97e5fafa8@github.com>
 <2stGKhgwZPG0HXj65IZioZBlOud2FMcTqGe89_ggCzs=.088f733d-f156-4178-8020-0b7b84c8764d@github.com>
Message-ID:

On Mon, 22 Apr 2024 12:51:06 GMT, Andrew Haley wrote:

> You only need one load, add, and multiply per iteration.
> You don't need to add across columns until the end.
>
> This is an example of how to do it. The full thing is at https://gist.github.com/theRealAph/cbc85299d6cd24101d46a06c12a97ce6.

Looks reasonable, thank you for providing the listing! I'll get back to you on this once I have updated performance numbers.
-------------

PR Comment: https://git.openjdk.org/jdk/pull/18487#issuecomment-2069715012

From duke at openjdk.org Mon Apr 22 14:48:28 2024
From: duke at openjdk.org (Mikhail Ablakatov)
Date: Mon, 22 Apr 2024 14:48:28 GMT
Subject: RFR: 8322770: Implement C2 VectorizedHashCode on AArch64
In-Reply-To: <2VKOC-rT0vOyMcXUX2gs3sOrbZ5H79KBIo50sOOVmyI=.1936f78e-794c-4f54-af3c-b1b97e5fafa8@github.com>
References: <2VKOC-rT0vOyMcXUX2gs3sOrbZ5H79KBIo50sOOVmyI=.1936f78e-794c-4f54-af3c-b1b97e5fafa8@github.com>
Message-ID:

On Tue, 26 Mar 2024 13:59:12 GMT, Mikhail Ablakatov wrote:

> Hello,
>
> Please review the following PR for [JDK-8322770 Implement C2 VectorizedHashCode on AArch64](https://bugs.openjdk.org/browse/JDK-8322770). It follows previous work done in https://github.com/openjdk/jdk/pull/16629 and https://github.com/openjdk/jdk/pull/10847 for RISC-V and x86 respectively.
>
> The code to calculate a hash code consists of two parts: a vectorized loop of Neon instructions that processes 4 or 8 elements per iteration depending on the data type, and a fully unrolled scalar "loop" that processes up to 7 tail elements.
>
> At the time of writing this I don't see potential benefits from providing an SVE/SVE2 implementation, but it could be added as a follow-up or independently later if required.
>
> # Performance
>
> ## Neoverse N1
>
>
> --------------------------------------------------------------------------------------------
> Version                                               Baseline          This patch
> --------------------------------------------------------------------------------------------
> Benchmark                 (size)  Mode  Cnt      Score   Error     Score     Error  Units
> --------------------------------------------------------------------------------------------
> ArraysHashCode.bytes           1  avgt   15      1.249 ± 0.060     1.247 ±   0.062  ns/op
> ArraysHashCode.bytes          10  avgt   15      8.754 ± 0.028     4.387 ±   0.015  ns/op
> ArraysHashCode.bytes         100  avgt   15     98.596 ± 0.051    26.655 ±   0.097  ns/op
> ArraysHashCode.bytes       10000  avgt   15  10150.578 ± 1.352  2649.962 ± 216.744  ns/op
> ArraysHashCode.chars           1  avgt   15      1.286 ± 0.062     1.246 ±   0.054  ns/op
> ArraysHashCode.chars          10  avgt   15      8.731 ± 0.002     5.344 ±   0.003  ns/op
> ArraysHashCode.chars         100  avgt   15     98.632 ± 0.048    23.023 ±   0.142  ns/op
> ArraysHashCode.chars       10000  avgt   15  10150.658 ± 3.374  2410.504 ±   8.872  ns/op
> ArraysHashCode.ints            1  avgt   15      1.189 ± 0.005     1.187 ±   0.001  ns/op
> ArraysHashCode.ints           10  avgt   15      8.730 ± 0.002     5.676 ±   0.001  ns/op
> ArraysHashCode.ints          100  avgt   15     98.559 ± 0.016    24.378 ±   0.006  ns/op
> ArraysHashCode.ints        10000  avgt   15  10148.752 ± 1.336  2419.015 ±   0.492  ns/op
> ArraysHashCode.multibytes      1  avgt   15      1.037 ± 0.001     1.037 ±   0.001  ns/op
> ArraysHashCode.multibytes     10  avgt   15      5.4...

> A high-performance AArch64 implementation can issue four multiply-accumulate vector instructions per cycle, with a 3-clock latency.

@theRealAph, hmph, could you elaborate on which spec you are referring to here?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18487#issuecomment-2069727954

From jsjolen at openjdk.org Mon Apr 22 14:56:49 2024
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Mon, 22 Apr 2024 14:56:49 GMT
Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v45]
In-Reply-To:
References:
Message-ID:

> Hi,
>
> This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocate memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix.
>
> ## `MemoryFileTracker`
>
> The `MemoryFileTracker` adds the ability to add new virtual memory address spaces to NMT and to commit memory to them. The basic API is:
>
> ```c++
>     static MemoryFile* make_device(const char* descriptive_name);
>     static void free_device(MemoryFile* device);
>
>     static void allocate_memory(MemoryFile* device, size_t offset, size_t size,
>                                 MEMFLAGS flag, const NativeCallStack& stack);
>     static void free_memory(MemoryFile* device, size_t offset, size_t size);
> ```
>
> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does:
>
> ```c++
> void ZNMT::reserve(zaddress_unsafe start, size_t size) {
>   MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap);
> }
> void ZNMT::commit(zoffset offset, size_t size) {
>   MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC);
> }
> void ZNMT::uncommit(zoffset offset, size_t size) {
>   MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size);
> }
>
> void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) {
>   // NMT doesn't track mappings at the moment.
> }
> void ZNMT::unmap(zaddress_unsafe addr, size_t size) {
>   // NMT doesn't track mappings at the moment.
> }
> ```
>
> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory; the device-allocated memory is reported separately. When performing summary reporting, any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory.
>
> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are:
>
> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this...

Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision:

 - Sort the ordering
 - Other way around test

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/18289/files
  - new: https://git.openjdk.org/jdk/pull/18289/files/0175c0e6..e668b569

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=44
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=43-44

  Stats: 44 lines in 2 files changed: 39 ins; 2 del; 3 mod
  Patch: https://git.openjdk.org/jdk/pull/18289.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289

PR: https://git.openjdk.org/jdk/pull/18289

From jsjolen at openjdk.org Mon Apr 22 14:56:49 2024
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Mon, 22 Apr 2024 14:56:49 GMT
Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v44]
In-Reply-To:
References:
Message-ID:

On Sun, 21 Apr 2024 10:35:47 GMT, Johan Sjölen wrote:

>> Hi,
>>
>> This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocate memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix.
>>
>> ## `MemoryFileTracker`
>>
>> The `MemoryFileTracker` adds the ability to add new virtual memory address spaces to NMT and to commit memory to them. The basic API is:
>>
>> ```c++
>>     static MemoryFile* make_device(const char* descriptive_name);
>>     static void free_device(MemoryFile* device);
>>
>>     static void allocate_memory(MemoryFile* device, size_t offset, size_t size,
>>                                 MEMFLAGS flag, const NativeCallStack& stack);
>>     static void free_memory(MemoryFile* device, size_t offset, size_t size);
>> ```
>>
>> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does:
>>
>> ```c++
>> void ZNMT::reserve(zaddress_unsafe start, size_t size) {
>>   MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap);
>> }
>> void ZNMT::commit(zoffset offset, size_t size) {
>>   MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC);
>> }
>> void ZNMT::uncommit(zoffset offset, size_t size) {
>>   MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size);
>> }
>>
>> void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) {
>>   // NMT doesn't track mappings at the moment.
>> }
>> void ZNMT::unmap(zaddress_unsafe addr, size_t size) {
>>   // NMT doesn't track mappings at the moment.
>> }
>> ```
>>
>> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory; the device-allocated memory is reported separately. When performing summary reporting, any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory.
>>
>> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are:
>>
>> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. Our Treap-based approach in this patch gives a performance bo...
>
> Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision:
>
>   Rename to AddressComparator

Fixed a bug that Afshin found w.r.t. summary accounting. The double-arrow thing really helped out there, so I'm happy with keeping that.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18289#issuecomment-2069757120

From rkennke at openjdk.org Mon Apr 22 15:05:37 2024
From: rkennke at openjdk.org (Roman Kennke)
Date: Mon, 22 Apr 2024 15:05:37 GMT
Subject: RFR: 8330585: Refactor/rename forwardee handling [v2]
In-Reply-To:
References:
Message-ID: <33mfblIBDxgf17cw5S4VbZCQIu5kd-P_0gyHeOy8gf8=.e59ed643-71b0-40a5-9745-414c755fcf18@github.com>

On Fri, 19 Apr 2024 13:55:08 GMT, Roman Kennke wrote:

>> In several places in GCs we use is_marked() where we really mean is_forwarded(), and do weird things like decode forwardee directly from a markWord instead of using a proper helper, etc.
>>
>> This change cleans it up. It introduces a bunch of APIs to facilitate that:
>> - oopDesc::forwardee(markWord): This doesn't have to be in oopDesc right now, but I'd like to put it there in preparation of https://bugs.openjdk.org/browse/JDK-8305898, which requires it to be in oopDesc. Also, it's nice as a non-racy companion of oopDesc::forwardee().
>> - oopDesc::is_forwarded(markWord): It doesn't have to be in oopDesc, either, but I think it's good to have it at the same level of API abstraction as oopDesc::forwardee(markWord).
>>
>> Testing:
>>  - [x] hotspot_gc
>>  - [x] tier1
>
> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
>
>   Don't add API in oopDesc

Thanks!
------------- PR Comment: https://git.openjdk.org/jdk/pull/18863#issuecomment-2069793674 From rkennke at openjdk.org Mon Apr 22 15:05:37 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 22 Apr 2024 15:05:37 GMT Subject: Integrated: 8330585: Refactor/rename forwardee handling In-Reply-To: References: Message-ID: On Fri, 19 Apr 2024 12:25:58 GMT, Roman Kennke wrote: > In several places in GCs we use is_marked() where we really mean is_forwarded(), and do weird things like decode forwardee directly from a markWord instead of using a proper helper, etc. > > This change cleans it up. It introduces a bunch of APIs to facilitate that: > - oopDesc::forwardee(markWord): This doesn't have to be in oopDesc right now, but I'd like to put it there in preparation of https://bugs.openjdk.org/browse/JDK-8305898, which requires it to be in oopDesc. Also, it's nice as a non-racy companion of oopDesc::forwardee(). > - oopDesc::is_forwarded(markWord): It doesn't have to be in oopDesc, either, but I think it's good to have it at the same level of API abstraction as oopDesc::forwardee(markWord). > > Testing: > - [x] hotspot_gc > - [x] tier1 This pull request has now been integrated. Changeset: 7e421ce9 Author: Roman Kennke URL: https://git.openjdk.org/jdk/commit/7e421ce9d089ce3e36336fca0f603bcbfbbda6c5 Stats: 21 lines in 6 files changed: 7 ins; 3 del; 11 mod 8330585: Refactor/rename forwardee handling Reviewed-by: stefank, ayang ------------- PR: https://git.openjdk.org/jdk/pull/18863 From iklam at openjdk.org Mon Apr 22 15:37:29 2024 From: iklam at openjdk.org (Ioi Lam) Date: Mon, 22 Apr 2024 15:37:29 GMT Subject: RFR: 8330578: The VM creates instance of abstract class VirtualMachineError [v5] In-Reply-To: References: <02nN9BVn9G0PSYzT9hfYOcFpKbsOskhalVBC1t8xGFw=.31ad8d16-efcd-426c-a67c-97d2c261e4e3@github.com> Message-ID: On Mon, 22 Apr 2024 11:34:09 GMT, Coleen Phillimore wrote: >> Unfortunately, we don't pre-allocate an instance of InternalError. 
>> >> // We do not care what kind of exception we get for a thread which >> // is compiling. We just install a dummy exception object >> thread->set_pending_exception(Universe::null_ptr_exception_instance(), file, line); > > I have to put all this code back to create, store for CDS and use InternalError. > Unfortunately, we don't pre-allocate an instance of InternalError. But you just need to replaced the lines that you deleted for VirtualMachineError with InternalError. I think that would be cleaner than having to explain a weird behavior in comments. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18847#discussion_r1574969936 From aph at openjdk.org Mon Apr 22 15:51:28 2024 From: aph at openjdk.org (Andrew Haley) Date: Mon, 22 Apr 2024 15:51:28 GMT Subject: RFR: 8322770: Implement C2 VectorizedHashCode on AArch64 In-Reply-To: References: <2VKOC-rT0vOyMcXUX2gs3sOrbZ5H79KBIo50sOOVmyI=.1936f78e-794c-4f54-af3c-b1b97e5fafa8@github.com> Message-ID: On Mon, 22 Apr 2024 14:45:49 GMT, Mikhail Ablakatov wrote: > > A high-performance AArch64 implementation can issue four multiply-accumulate vector instructions per cycle, with a 3-clock latency. > > @theRealAph , hmph, could you elaborate on what spec you refer to here? That's not so much a spec, more Dougall's measured Apple M1 performance: https://dougallj.github.io/applecpu/measurements/firestorm/UMLAL_v_4S.html. Other high-end AArch64 designs can't do that, but they won't suffer by going wider. We should be able to sustain pipelined 4 int-wide elements/cycle. 
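For readers following the throughput argument: with a multi-cycle multiply-accumulate latency, a single dependent chain `h = 31 * h + a[i]` cannot start a new multiply-accumulate every cycle, so the usual approach is to keep several independent accumulators in flight and combine them at the end. The sketch below is a scalar C++ illustration of that idea only. It is not the AArch64 intrinsic from this PR, and the function names and the choice of four lanes are assumptions made for the example.

```c++
#include <cassert>
#include <cstddef>
#include <cstdint>

// Reference: the classic polynomial hash as one dependent chain.
// Unsigned 32-bit arithmetic wraps modulo 2^32, like Java int arithmetic.
inline uint32_t hash_scalar(const int32_t* a, size_t n) {
  uint32_t h = 0;
  for (size_t i = 0; i < n; i++) {
    h = 31u * h + static_cast<uint32_t>(a[i]);
  }
  return h;
}

// Same result with four independent accumulators (hypothetical lane count).
// Lane k only sees elements whose index is congruent to k modulo 4, so the
// four multiply-adds in one iteration do not depend on each other.
inline uint32_t hash_4lanes(const int32_t* a, size_t n) {
  const uint32_t p1 = 31u, p2 = p1 * p1, p3 = p2 * p1, p4 = p3 * p1;
  uint32_t acc[4] = {0, 0, 0, 0};
  size_t i = 0;
  for (; i + 4 <= n; i += 4) {
    for (int k = 0; k < 4; k++) {
      acc[k] = acc[k] * p4 + static_cast<uint32_t>(a[i + k]);
    }
  }
  // Combine the lanes: lane 0 holds the oldest element of each group of four.
  uint32_t h = acc[0] * p3 + acc[1] * p2 + acc[2] * p1 + acc[3];
  // Scalar tail for the remaining 0..3 elements.
  for (; i < n; i++) {
    h = 31u * h + static_cast<uint32_t>(a[i]);
  }
  return h;
}

// Sanity check helper: both formulations must agree on any input.
inline void check_lanes_agree(const int32_t* a, size_t n) {
  assert(hash_scalar(a, n) == hash_4lanes(a, n));
}
```

Going wider (more lanes, or vector registers holding several lanes each) follows the same pattern with a larger power of 31 per step.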
------------- PR Comment: https://git.openjdk.org/jdk/pull/18487#issuecomment-2069971112 From iklam at openjdk.org Mon Apr 22 15:55:33 2024 From: iklam at openjdk.org (Ioi Lam) Date: Mon, 22 Apr 2024 15:55:33 GMT Subject: RFR: 8314846: Do not store Klass::_secondary_super_cache in CDS archive In-Reply-To: <9dNGixgm_3Wd7oMp_nnIhaxPYcFh9TBZZwCWcvMlQfA=.84222e0f-9f6a-4197-ba37-af95ac3124a3@github.com> References: <9dNGixgm_3Wd7oMp_nnIhaxPYcFh9TBZZwCWcvMlQfA=.84222e0f-9f6a-4197-ba37-af95ac3124a3@github.com> Message-ID: <7HyOyHUtmUn20z8psnkzk59Lz-pi5_qml1p74JjXtKk=.7e371f1f-88fa-4dd8-91ee-e6c332938cab@github.com> On Fri, 19 Apr 2024 06:51:33 GMT, Thomas Stuefe wrote: >> This bug was found during Leyden development. >> >> CDS's `ArchiveBuilder` expects the class metadata to stop mutating while we're inside the CDS dumping safepoint. However, `Klass::_secondary_super_cache` can be updated as a side effect of `Klass::is_subtype_of()`. >> >> Currently, we don't call `Klass::is_subtype_of()`inside the CDS safepoint. However, it's likely that future optimizations will make such calls (as being done in the Leyden prototype). When that happens, the CDS dump will fail with a hard-to-debug failure (some class is found inside `_secondary_super_cache` that `ArchiveBuilder` doesn't know about. >> >> There's no benefit in storing `Klass::_secondary_super_cache` in the CDS archive. So the safest thing to do is to stop scanning it during CDS dumping, and clear it to `nullptr` when the `Klass` is stored in the CDS archive. > > Makes sense. 
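The hazard described in the quoted text can be illustrated in isolation. The following is a minimal sketch, not HotSpot code: `ToyKlass`, its fields, and `prepare_for_archive()` are hypothetical names. It shows a lookup that updates a one-element cache as a side effect, and a cleanup step that clears the cache before the object would be written out, so the archived state does not depend on which queries happened to run.

```c++
#include <cassert>
#include <vector>

// Hypothetical stand-in for a class-metadata object.
struct ToyKlass {
  std::vector<const ToyKlass*> secondary_supers;
  // One-element cache, updated by what looks like a read-only query.
  mutable const ToyKlass* secondary_super_cache = nullptr;

  bool is_subtype_of(const ToyKlass* k) const {
    if (k == this || k == secondary_super_cache) {
      return true;
    }
    for (const ToyKlass* s : secondary_supers) {
      if (s == k) {
        secondary_super_cache = k;  // side effect: metadata mutates here
        return true;
      }
    }
    return false;
  }

  // Clear the cache before archiving so the stored state is deterministic.
  void prepare_for_archive() { secondary_super_cache = nullptr; }
};

// Usage sketch: the query mutates the object, the cleanup step undoes it.
inline void toy_demo() {
  ToyKlass iface, impl;
  impl.secondary_supers.push_back(&iface);
  assert(impl.secondary_super_cache == nullptr);
  assert(impl.is_subtype_of(&iface));
  assert(impl.secondary_super_cache == &iface);
  impl.prepare_for_archive();
  assert(impl.secondary_super_cache == nullptr);
}
```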
Thanks @tstuefe @theRealAph for the review ------------- PR Comment: https://git.openjdk.org/jdk/pull/18848#issuecomment-2069990483 From iklam at openjdk.org Mon Apr 22 15:55:33 2024 From: iklam at openjdk.org (Ioi Lam) Date: Mon, 22 Apr 2024 15:55:33 GMT Subject: Integrated: 8314846: Do not store Klass::_secondary_super_cache in CDS archive In-Reply-To: References: Message-ID: On Thu, 18 Apr 2024 22:19:31 GMT, Ioi Lam wrote: > This bug was found during Leyden development. > > CDS's `ArchiveBuilder` expects the class metadata to stop mutating while we're inside the CDS dumping safepoint. However, `Klass::_secondary_super_cache` can be updated as a side effect of `Klass::is_subtype_of()`. > > Currently, we don't call `Klass::is_subtype_of()`inside the CDS safepoint. However, it's likely that future optimizations will make such calls (as being done in the Leyden prototype). When that happens, the CDS dump will fail with a hard-to-debug failure (some class is found inside `_secondary_super_cache` that `ArchiveBuilder` doesn't know about. > > There's no benefit in storing `Klass::_secondary_super_cache` in the CDS archive. So the safest thing to do is to stop scanning it during CDS dumping, and clear it to `nullptr` when the `Klass` is stored in the CDS archive. This pull request has now been integrated. Changeset: 20be5e09 Author: Ioi Lam URL: https://git.openjdk.org/jdk/commit/20be5e095f85d92215df68bb6eeb621b4ed249a1 Stats: 6 lines in 1 file changed: 5 ins; 1 del; 0 mod 8314846: Do not store Klass::_secondary_super_cache in CDS archive Reviewed-by: stuefe, aph ------------- PR: https://git.openjdk.org/jdk/pull/18848 From jsjolen at openjdk.org Mon Apr 22 16:00:51 2024 From: jsjolen at openjdk.org (Johan Sjölen) Date: Mon, 22 Apr 2024 16:00:51 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v46] In-Reply-To: References: Message-ID: > Hi, > > This PR introduces a new abstraction to NMT, named `MemoryFileTracker`.
Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocate memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix.
>
> ## `MemoryFileTracker`
>
> The `MemoryFileTracker` adds the ability to add new virtual memory address spaces to NMT and to commit memory to them. The basic API is:
>
> ```c++
> static MemoryFile* make_device(const char* descriptive_name);
> static void free_device(MemoryFile* device);
>
> static void allocate_memory(MemoryFile* device, size_t offset, size_t size,
>                             MEMFLAGS flag, const NativeCallStack& stack);
> static void free_memory(MemoryFile* device, size_t offset, size_t size);
> ```
>
> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does:
>
> ```c++
> void ZNMT::reserve(zaddress_unsafe start, size_t size) {
>   MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap);
> }
> void ZNMT::commit(zoffset offset, size_t size) {
>   MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC);
> }
> void ZNMT::uncommit(zoffset offset, size_t size) {
>   MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size);
> }
>
> void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) {
>   // NMT doesn't track mappings at the moment.
> }
> void ZNMT::unmap(zaddress_unsafe addr, size_t size) {
>   // NMT doesn't track mappings at the moment.
> }
> ```
>
> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory; the device-allocated memory is reported separately. When performing summary reporting, any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory.
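To make the summary-accounting behaviour concrete, here is a minimal, self-contained sketch of a per-device tracker. It is an illustration under stated assumptions, not the code in this PR: `ToyMemoryFile`, `ToyFlag`, the fixed 4096-byte granule, and the `committed()` query are all hypothetical, and the bookkeeping uses a plain `std::map` of granules rather than the tree of ranges the patch describes.

```c++
#include <cassert>
#include <cstddef>
#include <map>

// Hypothetical stand-in for MEMFLAGS.
enum class ToyFlag { JavaHeap, Other };

// Toy model of one "device": an address space separate from virtual memory,
// in which offsets can be committed and uncommitted.
class ToyMemoryFile {
public:
  static constexpr size_t granule = 4096;  // assumed tracking granularity

  // Mark [offset, offset + size) as committed under the given flag.
  void allocate_memory(size_t offset, size_t size, ToyFlag flag) {
    assert(offset % granule == 0 && size % granule == 0);
    for (size_t g = offset / granule; g < (offset + size) / granule; g++) {
      _granules[g] = flag;
    }
  }

  // Mark [offset, offset + size) as no longer committed.
  void free_memory(size_t offset, size_t size) {
    assert(offset % granule == 0 && size % granule == 0);
    for (size_t g = offset / granule; g < (offset + size) / granule; g++) {
      _granules.erase(g);
    }
  }

  // Summary view: bytes currently committed under one flag.
  size_t committed(ToyFlag flag) const {
    size_t bytes = 0;
    for (const auto& entry : _granules) {
      if (entry.second == flag) {
        bytes += granule;
      }
    }
    return bytes;
  }

private:
  std::map<size_t, ToyFlag> _granules;  // granule index -> owning flag
};
```

Committing four granules as `ToyFlag::JavaHeap` and then freeing the middle two leaves `committed(ToyFlag::JavaHeap)` at two granules, which mirrors how device-allocated memory would be added to a flag's `committed` total in summary mode.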
> > This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: > > 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this... Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision: - Remove faulty condition after removing merging - Add failing test case ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18289/files - new: https://git.openjdk.org/jdk/pull/18289/files/e668b569..e7f2af9e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=45 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=44-45 Stats: 24 lines in 2 files changed: 16 ins; 2 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/18289.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289 PR: https://git.openjdk.org/jdk/pull/18289 From pchilanomate at openjdk.org Mon Apr 22 16:09:30 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Mon, 22 Apr 2024 16:09:30 GMT Subject: RFR: 8329088: Stack chunk thawing races with concurrent GC stack iteration [v4] In-Reply-To: References: Message-ID: On Mon, 22 Apr 2024 09:42:42 GMT, Erik Österlund wrote: >> When we thaw the last frame from a stack chunk, we non-atomically set the stack pointer (sp), and set its argsize to 0. Unfortunately, GC threads may iterate over the frames of the stack chunk concurrently. When initializing their stack frame iterator, they read the sp and argsize racingly. Since there is no synchronization between the threads, we may observe inconsistent pairs of sp and argsize, for example the updated sp with a stale argsize, or the updated argsize with a stale sp.
>> >> At the core of the problem, the stack chunks define sp and argsize. The argsize is used to calculate where the bottom of the stack chunk is, which is required to determine if it is empty or not. This patch proposes to switch things around and store the bottom directly in the chunk, instead of argsize. Instead, argsize is calculated from the bottom. By changing the relationship of which property is stored and which property is calculated, we can simplify this code quite a bit. >> >> In the new model, is_empty() is true iff sp and bottom are exactly the same. Bottom is only set during freezing, never during thawing. The bottom is initialized whenever the bottom frame is frozen, and left untouched during thawing. Unlike thawing, the freeze operation does not race with the GC by design. Hence we have moved one of the racy mutations to the operation that doesn't race with the GC. The GC is now only exposed to changing sp(). It doesn't matter if it observes the old or new sp(), now that we have removed the only source of inconsistency describing said frame (racing argsize). >> >> Testing: tier1-5, manual testing of test/jdk/jdk/internal/vm/Continuation > > Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: > > Patricio comments Thanks Erik! ------------- Marked as reviewed by pchilanomate (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18643#pullrequestreview-2015221427 From sgibbons at openjdk.org Mon Apr 22 16:27:48 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Mon, 22 Apr 2024 16:27:48 GMT Subject: RFR: 8330844: Add aliases for conditional jumps and additional instruction forms for x86 Message-ID: <-wAKj3RvMqUO3iphA6bA34ilTcM9LkZACKco20ppkE0=.a5d31aa7-9423-477e-9a90-749018d2a12d@github.com> Adding infrastructure for JDK-8320448. Aliasing conditional jump instructions; making arrays_equals accessible from stubs; adding some x86 instructions.
------------- Commit messages: - Add conditional jump aliases; move arrays_equals; add instructions Changes: https://git.openjdk.org/jdk/pull/18893/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18893&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8330844 Stats: 654 lines in 6 files changed: 439 ins; 215 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18893.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18893/head:pull/18893 PR: https://git.openjdk.org/jdk/pull/18893 From sgibbons at openjdk.org Mon Apr 22 16:27:48 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Mon, 22 Apr 2024 16:27:48 GMT Subject: RFR: 8330844: Add aliases for conditional jumps and additional instruction forms for x86 In-Reply-To: <-wAKj3RvMqUO3iphA6bA34ilTcM9LkZACKco20ppkE0=.a5d31aa7-9423-477e-9a90-749018d2a12d@github.com> References: <-wAKj3RvMqUO3iphA6bA34ilTcM9LkZACKco20ppkE0=.a5d31aa7-9423-477e-9a90-749018d2a12d@github.com> Message-ID: On Mon, 22 Apr 2024 16:20:39 GMT, Scott Gibbons wrote: > Adding infrastructure for JDK-8320448. Aliasing conditional jump instructions; making arrays_equals accessible from stubs; adding some x86 instructions. This is a precursor for [JDK-8320448](https://bugs.openjdk.org/browse/JDK-8320448), essentially adding infrastructure requirements for that algorithm. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18893#issuecomment-2070116475 From gli at openjdk.org Mon Apr 22 16:30:39 2024 From: gli at openjdk.org (Guoxiong Li) Date: Mon, 22 Apr 2024 16:30:39 GMT Subject: RFR: 8330155: Serial: Remove TenuredSpace Message-ID: Hi all, This patch removes the class `TenuredSpace` and adjusts its usages. After removing `TenuredSpace`, the file `space.inline.hpp` is empty, so I remove this file and change the included header file to `space.hpp`. The test `make test-tier1_gc` passed locally. Thanks for taking the time to review. 
Best Regards, -- Guoxiong ------------- Commit messages: - JDK-8330155 Changes: https://git.openjdk.org/jdk/pull/18894/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18894&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8330155 Stats: 162 lines in 21 files changed: 10 ins; 127 del; 25 mod Patch: https://git.openjdk.org/jdk/pull/18894.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18894/head:pull/18894 PR: https://git.openjdk.org/jdk/pull/18894 From ayang at openjdk.org Mon Apr 22 16:58:28 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 22 Apr 2024 16:58:28 GMT Subject: RFR: 8330155: Serial: Remove TenuredSpace In-Reply-To: References: Message-ID: On Mon, 22 Apr 2024 16:24:06 GMT, Guoxiong Li wrote: > Hi all, > > This patch removes the class `TenuredSpace` and adjusts its usages. After removing `TenuredSpace`, the file `space.inline.hpp` is empty, so I remove this file and change the included header file to `space.hpp`. > > The test `make test-tier1_gc` passed locally. Thanks for taking the time to review. > > Best Regards, > -- Guoxiong Marked as reviewed by ayang (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18894#pullrequestreview-2015334437 From coleenp at openjdk.org Mon Apr 22 17:46:29 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 22 Apr 2024 17:46:29 GMT Subject: RFR: 8330578: The VM creates instance of abstract class VirtualMachineError [v5] In-Reply-To: References: <02nN9BVn9G0PSYzT9hfYOcFpKbsOskhalVBC1t8xGFw=.31ad8d16-efcd-426c-a67c-97d2c261e4e3@github.com> Message-ID: On Mon, 22 Apr 2024 15:34:53 GMT, Ioi Lam wrote: >> I have to put all this code back to create, store for CDS and use InternalError. > >> Unfortunately, we don't pre-allocate an instance of InternalError. > > But you just need to replaced the lines that you deleted for VirtualMachineError with InternalError. I think that would be cleaner than having to explain a weird behavior in comments. 
Seems like a waste, but I can put the lines back and make them InternalError. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18847#discussion_r1575147655 From coleenp at openjdk.org Mon Apr 22 18:10:56 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 22 Apr 2024 18:10:56 GMT Subject: RFR: 8330578: The VM creates instance of abstract class VirtualMachineError [v7] In-Reply-To: References: Message-ID: > It's a bug that the VM creates an instance of the abstract class VirtualMachineError. In the cases where we throw VME, we should throw OOM or StackOverflowError instead. > > Tested with tier1-4. Coleen Phillimore has updated the pull request incrementally with two additional commits since the last revision: - restore exceptions.cpp comment - Restore VirtualMemoryError as InternalError - a nonabstract class ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18847/files - new: https://git.openjdk.org/jdk/pull/18847/files/e4e1f8d3..a70dc567 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18847&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18847&range=05-06 Stats: 25 lines in 6 files changed: 17 ins; 1 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/18847.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18847/head:pull/18847 PR: https://git.openjdk.org/jdk/pull/18847 From coleenp at openjdk.org Mon Apr 22 18:10:56 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 22 Apr 2024 18:10:56 GMT Subject: RFR: 8330578: The VM creates instance of abstract class VirtualMachineError [v5] In-Reply-To: References: <02nN9BVn9G0PSYzT9hfYOcFpKbsOskhalVBC1t8xGFw=.31ad8d16-efcd-426c-a67c-97d2c261e4e3@github.com> Message-ID: On Mon, 22 Apr 2024 17:43:45 GMT, Coleen Phillimore wrote: >>> Unfortunately, we don't pre-allocate an instance of InternalError. >> >> But you just need to replaced the lines that you deleted for VirtualMachineError with InternalError. 
I think that would be cleaner than having to explain a weird behavior in comments. > > Seems like a waste, but I can put the lines back and make them InternalError. Ok, I made VirtualMachineError into InternalError for this, and restored the CDS code. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18847#discussion_r1575176993 From kvn at openjdk.org Mon Apr 22 18:11:30 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 22 Apr 2024 18:11:30 GMT Subject: RFR: 8330821: Rename UnsafeCopyMemory In-Reply-To: References: Message-ID: <_c23YQF9M64MmzaQW4lw3fGc850YkbpVbu1tiXFrA1k=.524aa4b4-3d3d-4891-9a7d-654689ad4f75@github.com> On Mon, 22 Apr 2024 13:48:41 GMT, Scott Gibbons wrote: > Renaming UnsafeCopyMemory to UnsafeMemoryAccess since this class is now being used for Unsafe::setMemory. This is a pure rename only. src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 1524: > 1522: // UnsafeMemoryAccess page error: continue after ucm > 1523: bool add_entry = !is_oop && (!aligned || sizeof(jlong) == size); > 1524: UnsafeMemoryAccessMark ucmm(this, add_entry, true); May be rename `ucmm` and other related locals too to avoid confusion. Word `ucm` in comments too. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18889#discussion_r1575178012 From coleenp at openjdk.org Mon Apr 22 18:19:30 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 22 Apr 2024 18:19:30 GMT Subject: RFR: 8330532: Improve line-oriented text parsing in HotSpot In-Reply-To: References: Message-ID: On Thu, 18 Apr 2024 03:51:06 GMT, Ioi Lam wrote: > (This PR is an alternative to https://github.com/openjdk/jdk/pull/18669 with a better API for reading lines of text) > > HotSpot has a few cases where information is parsed from a file, or from a memory buffer, one line at a time. 
Example: > > - https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/cds/classListParser.cpp#L169 > - https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/compiler/compilerOracle.cpp#L1059-L1066 > > Common problems: > - They use a fixed buffer for reading a line, so long (but valid) lines will cause errors. > - There's ad-hoc code that deals with `FILE*` differently than from memory. > > This RFE implements a common utility, `inputStream`, for reading lines from different sources of input (see `FileInput` and `MemoryInput`). We fixed only `ClassListParser` and `CompilerOracle` in this RFE, but we can fix other readers in follow-up RFEs. > > The API allows other source of input to be implemented. For example, one could implement a `SocketInput` if there's a use case for it. > > In the future, `inputStream` can be extended (or encapsulated in a higher-level reader class) to read typed input tokens (for example, integers, strings, etc.) > > Credit: > The `inputStream` class and friends are contributed by @rose00 . See https://mail.openjdk.org/pipermail/hotspot-dev/2024-April/087077.html . > > John's original version is in the draft PR https://github.com/openjdk/jdk/pull/18773. In order to minimize the size of this PR, I have kept only the functionalities for reading a line and a time. Other features, such as pushing back contents into the `inputStream`, could be added in follow-up PRs. (These removed features can be found in the commit history of this PR). src/hotspot/share/utilities/istream.hpp line 92: > 90: // Do we need to read more input (NTR)? Did we see EOF already? > 91: // Was there an error getting input or allocating buffer space? > 92: enum IState { NTR_STATE, EOF_STATE, ERR_STATE }; Enum class is preferable. src/hotspot/share/utilities/istream.hpp line 374: > 372: : _fs(_private_fs), _private_fs(arg...) 
> 373: { > 374: } can you put {} on line 372, this doesn't need so much vertical space. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18833#discussion_r1575180789 PR Review Comment: https://git.openjdk.org/jdk/pull/18833#discussion_r1575184407 From coleenp at openjdk.org Mon Apr 22 18:19:30 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 22 Apr 2024 18:19:30 GMT Subject: RFR: 8330532: Improve line-oriented text parsing in HotSpot In-Reply-To: References: Message-ID: On Mon, 22 Apr 2024 18:15:21 GMT, Coleen Phillimore wrote: >> (This PR is an alternative to https://github.com/openjdk/jdk/pull/18669 with a better API for reading lines of text) >> >> HotSpot has a few cases where information is parsed from a file, or from a memory buffer, one line at a time. Example: >> >> - https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/cds/classListParser.cpp#L169 >> - https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/compiler/compilerOracle.cpp#L1059-L1066 >> >> Common problems: >> - They use a fixed buffer for reading a line, so long (but valid) lines will cause errors. >> - There's ad-hoc code that deals with `FILE*` differently than from memory. >> >> This RFE implements a common utility, `inputStream`, for reading lines from different sources of input (see `FileInput` and `MemoryInput`). We fixed only `ClassListParser` and `CompilerOracle` in this RFE, but we can fix other readers in follow-up RFEs. >> >> The API allows other source of input to be implemented. For example, one could implement a `SocketInput` if there's a use case for it. >> >> In the future, `inputStream` can be extended (or encapsulated in a higher-level reader class) to read typed input tokens (for example, integers, strings, etc.) >> >> Credit: >> The `inputStream` class and friends are contributed by @rose00 . 
See https://mail.openjdk.org/pipermail/hotspot-dev/2024-April/087077.html . >> >> John's original version is in the draft PR https://github.com/openjdk/jdk/pull/18773. In order to minimize the size of this PR, I have kept only the functionalities for reading a line and a time. Other features, such as pushing back contents into the `inputStream`, could be added in follow-up PRs. (These removed features can be found in the commit history of this PR). > > src/hotspot/share/utilities/istream.hpp line 374: > >> 372: : _fs(_private_fs), _private_fs(arg...) >> 373: { >> 374: } > > can you put {} on line 372, this doesn't need so much vertical space. Is this ... in template typename parameters allowed by our coding standard (whatever it means?) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18833#discussion_r1575186025 From sgibbons at openjdk.org Mon Apr 22 18:23:40 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Mon, 22 Apr 2024 18:23:40 GMT Subject: RFR: 8330821: Rename UnsafeCopyMemory [v2] In-Reply-To: References: Message-ID: > Renaming UnsafeCopyMemory to UnsafeMemoryAccess since this class is now being used for Unsafe::setMemory. This is a pure rename only. 
Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Address review comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18889/files - new: https://git.openjdk.org/jdk/pull/18889/files/281a1da9..aaa3d416 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18889&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18889&range=00-01 Stats: 67 lines in 6 files changed: 0 ins; 0 del; 67 mod Patch: https://git.openjdk.org/jdk/pull/18889.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18889/head:pull/18889 PR: https://git.openjdk.org/jdk/pull/18889 From sgibbons at openjdk.org Mon Apr 22 18:23:40 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Mon, 22 Apr 2024 18:23:40 GMT Subject: RFR: 8330821: Rename UnsafeCopyMemory [v2] In-Reply-To: <_c23YQF9M64MmzaQW4lw3fGc850YkbpVbu1tiXFrA1k=.524aa4b4-3d3d-4891-9a7d-654689ad4f75@github.com> References: <_c23YQF9M64MmzaQW4lw3fGc850YkbpVbu1tiXFrA1k=.524aa4b4-3d3d-4891-9a7d-654689ad4f75@github.com> Message-ID: On Mon, 22 Apr 2024 18:09:14 GMT, Vladimir Kozlov wrote: >> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: >> >> Address review comment > > src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 1524: > >> 1522: // UnsafeMemoryAccess page error: continue after ucm >> 1523: bool add_entry = !is_oop && (!aligned || sizeof(jlong) == size); >> 1524: UnsafeMemoryAccessMark ucmm(this, add_entry, true); > > May be rename `ucmm` and other related locals too to avoid confusion. Word `ucm` in comments too. Done. Comment says `unsafe access` instead of `ucm` and `umam` instead of `ucmm`. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18889#discussion_r1575191505 From jrose at openjdk.org Mon Apr 22 18:31:30 2024 From: jrose at openjdk.org (John R Rose) Date: Mon, 22 Apr 2024 18:31:30 GMT Subject: RFR: 8330532: Improve line-oriented text parsing in HotSpot In-Reply-To: References: Message-ID: On Mon, 22 Apr 2024 18:16:20 GMT, Coleen Phillimore wrote: >> src/hotspot/share/utilities/istream.hpp line 374: >> >>> 372: : _fs(_private_fs), _private_fs(arg...) >>> 373: { >>> 374: } >> >> can you put {} on line 372, this doesn't need so much vertical space. > > Is this ... in template typename parameters allowed by our coding standard (whatever it means?) Yes. It's called "variadic templates"; see `doc/hotspot-style.md`. The usage appears twice already, in `metaprogramming/logical.hpp` and `asm/register.hpp`. The usage here is pretty straightforward IMO: It just forwards constructor argument lists unchanged. This is useful for wrappers (where you construct the wrapped thing as a field) or subclass factoring (where you construct the super). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18833#discussion_r1575200336 From kvn at openjdk.org Mon Apr 22 18:37:31 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 22 Apr 2024 18:37:31 GMT Subject: RFR: 8330844: Add aliases for conditional jumps and additional instruction forms for x86 In-Reply-To: <-wAKj3RvMqUO3iphA6bA34ilTcM9LkZACKco20ppkE0=.a5d31aa7-9423-477e-9a90-749018d2a12d@github.com> References: <-wAKj3RvMqUO3iphA6bA34ilTcM9LkZACKco20ppkE0=.a5d31aa7-9423-477e-9a90-749018d2a12d@github.com> Message-ID: On Mon, 22 Apr 2024 16:20:39 GMT, Scott Gibbons wrote: > making arrays_equals accessible from stubs I am not sure I understand why you need to move it. Your changes for JDK-8320448 shows that new code is used only by C2. You can move your new code in stubGenerator_x86_64.cpp into the part under `#ifdef COMPILER2`.
And code in `stubGenerator_x86_64_string.cpp` could be put under this `#ifdef` too. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18893#issuecomment-2070578582 From sviswanathan at openjdk.org Mon Apr 22 18:37:34 2024 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Mon, 22 Apr 2024 18:37:34 GMT Subject: RFR: 8330821: Rename UnsafeCopyMemory [v2] In-Reply-To: References: Message-ID: On Mon, 22 Apr 2024 18:23:40 GMT, Scott Gibbons wrote: >> Renaming UnsafeCopyMemory to UnsafeMemoryAccess since this class is now being used for Unsafe::setMemory. This is a pure rename only. > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Address review comment src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 1627: > 1625: { > 1626: // Add set memory mark to protect against unsafe accesses faulting > 1627: UnsafeMemoryAccessMark usmm(this, ((t == T_BYTE) && !aligned), true); usmm -> umam to be consistent. 
src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2631: > 2629: { > 2630: Label L_wordsTail, L_wordsLoop, L_wordsTailLoop; > 2631: UnsafeMemoryAccessMark usmm(this, true, true); usmm -> umam src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2645: > 2643: { > 2644: Label L_qwordLoop, L_qwordsTail, L_qwordsTailLoop; > 2645: UnsafeMemoryAccessMark usmm(this, true, true); usmm -> umam src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2662: > 2660: { > 2661: Label L_dwordLoop, L_dwordsTail, L_dwordsTailLoop; > 2662: UnsafeMemoryAccessMark usmm(this, true, true); usmm -> umam ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18889#discussion_r1575205497 PR Review Comment: https://git.openjdk.org/jdk/pull/18889#discussion_r1575205741 PR Review Comment: https://git.openjdk.org/jdk/pull/18889#discussion_r1575205908 PR Review Comment: https://git.openjdk.org/jdk/pull/18889#discussion_r1575205988 From sgibbons at openjdk.org Mon Apr 22 18:41:53 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Mon, 22 Apr 2024 18:41:53 GMT Subject: RFR: 8330821: Rename UnsafeCopyMemory [v3] In-Reply-To: References: Message-ID: > Renaming UnsafeCopyMemory to UnsafeMemoryAccess since this class is now being used for Unsafe::setMemory. This is a pure rename only. 
Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Missed a couple ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18889/files - new: https://git.openjdk.org/jdk/pull/18889/files/aaa3d416..e8b86eee Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18889&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18889&range=01-02 Stats: 4 lines in 1 file changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/18889.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18889/head:pull/18889 PR: https://git.openjdk.org/jdk/pull/18889 From sgibbons at openjdk.org Mon Apr 22 18:41:53 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Mon, 22 Apr 2024 18:41:53 GMT Subject: RFR: 8330821: Rename UnsafeCopyMemory [v2] In-Reply-To: References: Message-ID: On Mon, 22 Apr 2024 18:23:40 GMT, Scott Gibbons wrote: >> Renaming UnsafeCopyMemory to UnsafeMemoryAccess since this class is now being used for Unsafe::setMemory. This is a pure rename only. > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Address review comment `usmm` => `umam` for consistency. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18889#issuecomment-2070593309 From dnsimon at openjdk.org Mon Apr 22 18:58:37 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Mon, 22 Apr 2024 18:58:37 GMT Subject: RFR: JDK-8316991: Reduce nullable allocation merges [v10] In-Reply-To: References: Message-ID: <3udXwUbOsBIAaNZt36nDchiLeLvzHoe6LPYc9R4oLTQ=.c3574196-a3fb-47e9-b5f6-84514c184916@github.com> On Wed, 10 Apr 2024 17:47:36 GMT, Tom Rodriguez wrote: >> Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: >> >> Addressing Ivanov's PR feedback. 
> > src/hotspot/share/opto/escape.cpp line 560: > >> 558: const Type* cast_t = _igvn->type(use); >> 559: if (cast_t == nullptr || cast_t->make_ptr()->isa_instptr() == nullptr) { >> 560: NOT_PRODUCT(use->dump();) > > This dump should be guarded by TraceReduceAllocationMerges as should the one at line 574 I opened https://bugs.openjdk.org/browse/JDK-8330850 for this. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15825#discussion_r1575229245 From kvn at openjdk.org Mon Apr 22 19:18:29 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 22 Apr 2024 19:18:29 GMT Subject: RFR: 8330821: Rename UnsafeCopyMemory [v3] In-Reply-To: References: Message-ID: On Mon, 22 Apr 2024 18:41:53 GMT, Scott Gibbons wrote: >> Renaming UnsafeCopyMemory to UnsafeMemoryAccess since this class is now being used for Unsafe::setMemory. This is a pure rename only. > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Missed a couple Looks good. ------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18889#pullrequestreview-2015617576 From sgibbons at openjdk.org Mon Apr 22 19:18:29 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Mon, 22 Apr 2024 19:18:29 GMT Subject: RFR: 8330821: Rename UnsafeCopyMemory [v3] In-Reply-To: References: Message-ID: On Mon, 22 Apr 2024 18:41:53 GMT, Scott Gibbons wrote: >> Renaming UnsafeCopyMemory to UnsafeMemoryAccess since this class is now being used for Unsafe::setMemory. This is a pure rename only. > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Missed a couple Thanks! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18889#issuecomment-2070719119 From never at openjdk.org Mon Apr 22 19:18:37 2024 From: never at openjdk.org (Tom Rodriguez) Date: Mon, 22 Apr 2024 19:18:37 GMT Subject: RFR: JDK-8316991: Reduce nullable allocation merges [v10] In-Reply-To: <3udXwUbOsBIAaNZt36nDchiLeLvzHoe6LPYc9R4oLTQ=.c3574196-a3fb-47e9-b5f6-84514c184916@github.com> References: <3udXwUbOsBIAaNZt36nDchiLeLvzHoe6LPYc9R4oLTQ=.c3574196-a3fb-47e9-b5f6-84514c184916@github.com> Message-ID: <_bfVUuPblUzxxhjy1JV-EdgpuIwyMyft2mpfvJwrIfQ=.34e5360d-2fb1-4b43-a614-f239fc5f1413@github.com> On Mon, 22 Apr 2024 18:55:40 GMT, Doug Simon wrote: >> src/hotspot/share/opto/escape.cpp line 560: >> >>> 558: const Type* cast_t = _igvn->type(use); >>> 559: if (cast_t == nullptr || cast_t->make_ptr()->isa_instptr() == nullptr) { >>> 560: NOT_PRODUCT(use->dump();) >> >> This dump should be guarded by TraceReduceAllocationMerges as should the one at line 574 > > I opened https://bugs.openjdk.org/browse/JDK-8330850 for this. Sorry I didn't include this here but I filed https://bugs.openjdk.org/browse/JDK-8330277 for it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15825#discussion_r1575250288 From cjplummer at openjdk.org Mon Apr 22 19:56:27 2024 From: cjplummer at openjdk.org (Chris Plummer) Date: Mon, 22 Apr 2024 19:56:27 GMT Subject: RFR: 8330155: Serial: Remove TenuredSpace In-Reply-To: References: Message-ID: On Mon, 22 Apr 2024 16:24:06 GMT, Guoxiong Li wrote: > Hi all, > > This patch removes the class `TenuredSpace` and adjusts its usages. After removing `TenuredSpace`, the file `space.inline.hpp` is empty, so I remove this file and change the included header file to `space.hpp`. > > The test `make test-tier1_gc` passed locally. Thanks for taking the time to review. > > Best Regards, > -- Guoxiong SA changes look good. ------------- Marked as reviewed by cjplummer (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18894#pullrequestreview-2015689218 From sviswanathan at openjdk.org Mon Apr 22 19:57:31 2024 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Mon, 22 Apr 2024 19:57:31 GMT Subject: RFR: 8330821: Rename UnsafeCopyMemory [v3] In-Reply-To: References: Message-ID: On Mon, 22 Apr 2024 18:41:53 GMT, Scott Gibbons wrote: >> Renaming UnsafeCopyMemory to UnsafeMemoryAccess since this class is now being used for Unsafe::setMemory. This is a pure rename only. > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Missed a couple Looks good. ------------- Marked as reviewed by sviswanathan (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18889#pullrequestreview-2015690658 From iklam at openjdk.org Mon Apr 22 20:00:28 2024 From: iklam at openjdk.org (Ioi Lam) Date: Mon, 22 Apr 2024 20:00:28 GMT Subject: RFR: 8330578: The VM creates instance of abstract class VirtualMachineError [v7] In-Reply-To: References: Message-ID: On Mon, 22 Apr 2024 18:10:56 GMT, Coleen Phillimore wrote: >> It's a bug that the VM creates an instance of the abstract class VirtualMachineError. In the cases where we throw VME, we should throw OOM or StackOverflowError instead. >> >> Tested with tier1-4. > > Coleen Phillimore has updated the pull request incrementally with two additional commits since the last revision: > > - restore exceptions.cpp comment > - Restore VirtualMemoryError as InternalError - a nonabstract class Marked as reviewed by iklam (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/18847#pullrequestreview-2015695532 From gziemski at openjdk.org Mon Apr 22 20:27:38 2024 From: gziemski at openjdk.org (Gerard Ziemski) Date: Mon, 22 Apr 2024 20:27:38 GMT Subject: RFR: 8324577: [REDO] - [IMPROVE] OPEN_MAX is no longer the max limit on macOS >= 10.6 for RLIMIT_NOFILE [v2] In-Reply-To: <1rubknQG6ntQ32o_dCF64U97R3jfyiyNZOms5-_k14g=.fc79bdea-da14-40cb-a35f-1290ec7e11d7@github.com> References: <1rubknQG6ntQ32o_dCF64U97R3jfyiyNZOms5-_k14g=.fc79bdea-da14-40cb-a35f-1290ec7e11d7@github.com> Message-ID: > This is a 3rd attempt of the same fix: > > 1st one had to be pulled out because of a bug in zsh > 2nd one had a workaround for the bug in zsh, but then uncovered an issue in JWDP (JDK-8324668), which was subsequently fixed. > > Tested with MACH5 tier1-9 with no unique or new failures on macOS Gerard Ziemski has updated the pull request incrementally with one additional commit since the last revision: reduce number of changes for easier backports ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18821/files - new: https://git.openjdk.org/jdk/pull/18821/files/1376278e..6502f845 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18821&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18821&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18821.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18821/head:pull/18821 PR: https://git.openjdk.org/jdk/pull/18821 From gziemski at openjdk.org Mon Apr 22 20:27:39 2024 From: gziemski at openjdk.org (Gerard Ziemski) Date: Mon, 22 Apr 2024 20:27:39 GMT Subject: RFR: 8324577: [REDO] - [IMPROVE] OPEN_MAX is no longer the max limit on macOS >= 10.6 for RLIMIT_NOFILE [v2] In-Reply-To: References: <1rubknQG6ntQ32o_dCF64U97R3jfyiyNZOms5-_k14g=.fc79bdea-da14-40cb-a35f-1290ec7e11d7@github.com> Message-ID: On Sat, 20 Apr 2024 14:38:33 GMT, Daniel D. 
Daugherty wrote: >> Gerard Ziemski has updated the pull request incrementally with one additional commit since the last revision: >> >> reduce number of changes for easier backports > > src/hotspot/os/bsd/os_bsd.cpp line 2136: > >> 2134: >> 2135: if (MaxFDLimit) { >> 2136: // Set the number of file descriptors to max. print out error > > You dropped the other editorial fixes, but only kept part of this one. In the previous > version of patch (8300088), you also fixed: s/print/Print/ > > I would drop this editorial fix also. Done. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18821#discussion_r1575318390 From gziemski at openjdk.org Mon Apr 22 20:30:30 2024 From: gziemski at openjdk.org (Gerard Ziemski) Date: Mon, 22 Apr 2024 20:30:30 GMT Subject: RFR: 8324577: [REDO] - [IMPROVE] OPEN_MAX is no longer the max limit on macOS >= 10.6 for RLIMIT_NOFILE [v2] In-Reply-To: References: <1rubknQG6ntQ32o_dCF64U97R3jfyiyNZOms5-_k14g=.fc79bdea-da14-40cb-a35f-1290ec7e11d7@github.com> Message-ID: On Sat, 20 Apr 2024 14:41:20 GMT, Daniel D. Daugherty wrote: >> Gerard Ziemski has updated the pull request incrementally with one additional commit since the last revision: >> >> reduce number of changes for easier backports > > I compared this patch to the previous patch (8300088) and it is the > same in the core part of the fix. All but one of the editorial changes > from 8300088 have been dropped which is good for a backport. > > It would be good if you revived all of the editorial fixes from 8300088 > and integrated them into the main line using a separate RFE. > > Thanks for documenting your testing. @dcubed-ojdk @dholmes-ora re-requested reviews. For such a trivial change (1 letter from capital to small in a comment) not sure we need it, but it's a protocol, so I'm going to wait for your re-reviews. Thanks! 
-------------

PR Comment: https://git.openjdk.org/jdk/pull/18821#issuecomment-2070895200

From sgibbons at openjdk.org Mon Apr 22 20:47:52 2024
From: sgibbons at openjdk.org (Scott Gibbons)
Date: Mon, 22 Apr 2024 20:47:52 GMT
Subject: RFR: 8330844: Add aliases for conditional jumps and additional instruction forms for x86 [v2]
In-Reply-To: <-wAKj3RvMqUO3iphA6bA34ilTcM9LkZACKco20ppkE0=.a5d31aa7-9423-477e-9a90-749018d2a12d@github.com>
References: <-wAKj3RvMqUO3iphA6bA34ilTcM9LkZACKco20ppkE0=.a5d31aa7-9423-477e-9a90-749018d2a12d@github.com>
Message-ID:

> Adding infrastructure for JDK-8320448. Aliasing conditional jump instructions; making arrays_equals accessible from stubs; adding some x86 instructions.

Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:

  Undo move of arrays_equals

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/18893/files
  - new: https://git.openjdk.org/jdk/pull/18893/files/74d47302..0b95b3af

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=18893&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18893&range=00-01

  Stats: 564 lines in 4 files changed: 282 ins; 282 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/18893.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18893/head:pull/18893

PR: https://git.openjdk.org/jdk/pull/18893

From jsjolen at openjdk.org Mon Apr 22 20:50:29 2024
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Mon, 22 Apr 2024 20:50:29 GMT
Subject: RFR: 8330532: Improve line-oriented text parsing in HotSpot
In-Reply-To:
References:
Message-ID: <7_vqERb2aC7gU-sSV3vmxJEgrYd3KuhncMLga4rP6rE=.ff63cadf-6764-459b-8764-4ceb66611ec4@github.com>

On Mon, 22 Apr 2024 18:11:56 GMT, Coleen Phillimore wrote:

>> (This PR is an alternative to https://github.com/openjdk/jdk/pull/18669 with a better API for reading lines of text)
>>
>> HotSpot has a few cases where information is parsed from a file, or from a memory buffer, one line
at a time. Example: >> >> - https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/cds/classListParser.cpp#L169 >> - https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/compiler/compilerOracle.cpp#L1059-L1066 >> >> Common problems: >> - They use a fixed buffer for reading a line, so long (but valid) lines will cause errors. >> - There's ad-hoc code that deals with `FILE*` differently than from memory. >> >> This RFE implements a common utility, `inputStream`, for reading lines from different sources of input (see `FileInput` and `MemoryInput`). We fixed only `ClassListParser` and `CompilerOracle` in this RFE, but we can fix other readers in follow-up RFEs. >> >> The API allows other source of input to be implemented. For example, one could implement a `SocketInput` if there's a use case for it. >> >> In the future, `inputStream` can be extended (or encapsulated in a higher-level reader class) to read typed input tokens (for example, integers, strings, etc.) >> >> Credit: >> The `inputStream` class and friends are contributed by @rose00 . See https://mail.openjdk.org/pipermail/hotspot-dev/2024-April/087077.html . >> >> John's original version is in the draft PR https://github.com/openjdk/jdk/pull/18773. In order to minimize the size of this PR, I have kept only the functionalities for reading a line and a time. Other features, such as pushing back contents into the `inputStream`, could be added in follow-up PRs. (These removed features can be found in the commit history of this PR). > > src/hotspot/share/utilities/istream.hpp line 92: > >> 90: // Do we need to read more input (NTR)? Did we see EOF already? >> 91: // Was there an error getting input or allocating buffer space? >> 92: enum IState { NTR_STATE, EOF_STATE, ERR_STATE }; > > Enum class is preferable. This is a nit, but if enum class is used can we then also change to PascalCase for the enum cases? 
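For illustration only (not taken from the PR): a minimal sketch of what the scoped-enum, PascalCase spelling suggested above might look like. The enumerator names `NeedToRead`, `EndOfFile` and `Error` and the helper `istate_name` are hypothetical; the PR itself uses the unscoped enumerators `NTR_STATE`, `EOF_STATE` and `ERR_STATE`.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical scoped-enum version of the three stream states.
// The PR's code spells these NTR_STATE, EOF_STATE and ERR_STATE.
enum class IState { NeedToRead, EndOfFile, Error };

// Illustrative helper (an assumption, not part of the PR): map a state
// to a printable name. Scoped enumerators must be qualified (IState::...)
// and do not convert implicitly to int.
inline const char* istate_name(IState s) {
  switch (s) {
    case IState::NeedToRead: return "NeedToRead";
    case IState::EndOfFile:  return "EndOfFile";
    case IState::Error:      return "Error";
  }
  return "unknown";
}
```

With a scoped enum, a state can no longer be compared with or assigned from a plain integer by accident, which is the usual argument for preferring it.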
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18833#discussion_r1575341532 From sgibbons at openjdk.org Mon Apr 22 20:51:28 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Mon, 22 Apr 2024 20:51:28 GMT Subject: RFR: 8330844: Add aliases for conditional jumps and additional instruction forms for x86 [v2] In-Reply-To: References: <-wAKj3RvMqUO3iphA6bA34ilTcM9LkZACKco20ppkE0=.a5d31aa7-9423-477e-9a90-749018d2a12d@github.com> Message-ID: On Mon, 22 Apr 2024 20:47:52 GMT, Scott Gibbons wrote: >> Adding infrastructure for JDK-8320448. Aliasing conditional jump instructions; making arrays_equals accessible from stubs; adding some x86 instructions. > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Undo move of arrays_equals Adding the `#ifdef COMPILER2` in `stubGenerator_x86_64_string.cpp` allows for good compilation for JDK-8320448, so I can undo the move. Thanks for spotting that. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18893#issuecomment-2070926706 From sgibbons at openjdk.org Mon Apr 22 21:00:45 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Mon, 22 Apr 2024 21:00:45 GMT Subject: RFR: 8320448: Accelerate IndexOf using AVX2 [v17] In-Reply-To: References: Message-ID: <05mD2dSduIgyzdnDqUHlh6CEqjWDkJ3wa_XK58tJy4Y=.a1d85af3-17e1-471f-a665-66d0693fda25@github.com> > Re-write the IndexOf code without the use of the pcmpestri instruction, only using AVX2 instructions. This change accelerates String.IndexOf on average 1.3x for AVX2. 
The benchmark numbers:
> 
> 
> Benchmark                              Score      Latest     Latest/Score
> StringIndexOf.advancedWithMediumSub    343.573    317.934    0.925375393x
> StringIndexOf.advancedWithShortSub1    1039.081   1053.96    1.014319384x
> StringIndexOf.advancedWithShortSub2    55.828     110.541    1.980027943x
> StringIndexOf.constantPattern          9.361      11.906     1.271872663x
> StringIndexOf.searchCharLongSuccess    4.216      4.218      1.000474383x
> StringIndexOf.searchCharMediumSuccess  3.133      3.216      1.02649218x
> StringIndexOf.searchCharShortSuccess   3.76       3.761      1.000265957x
> StringIndexOf.success                  9.186      9.713      1.057369911x
> StringIndexOf.successBig               14.341     46.343     3.231504079x
> StringIndexOfChar.latin1_AVX2_String   6220.918   12154.52   1.953814533x
> StringIndexOfChar.latin1_AVX2_char     5503.556   5540.044   1.006629895x
> StringIndexOfChar.latin1_SSE4_String   6978.854   6818.689   0.977049957x
> StringIndexOfChar.latin1_SSE4_char     5657.499   5474.624   0.967675646x
> StringIndexOfChar.latin1_Short_String  7132.541   6863.359   0.962260014x
> StringIndexOfChar.latin1_Short_char    16013.389  16162.437  1.009307711x
> StringIndexOfChar.latin1_mixed_String  7386.123   14771.622  1.999915517x
> StringIndexOfChar.latin1_mixed_char    9901.671   9782.245   0.987938803

Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:

  Move arrays_equals back to c2_MacroAssembler

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/16753/files
  - new: https://git.openjdk.org/jdk/pull/16753/files/8e0ce70a..1d141fde

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=16753&range=16
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16753&range=15-16

  Stats: 576 lines in 5 files changed: 288 ins; 282 del; 6 mod
  Patch: https://git.openjdk.org/jdk/pull/16753.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/16753/head:pull/16753

PR: https://git.openjdk.org/jdk/pull/16753

From matsaave at openjdk.org Mon Apr 22 21:31:28 2024
From: matsaave at openjdk.org (Matias Saavedra Silva)
Date: Mon, 22 Apr 2024 21:31:28 GMT
Subject: RFR: 8330532:
Improve line-oriented text parsing in HotSpot In-Reply-To: References: Message-ID: <13Ig3ZYmT_MXs2Ok6K2ecyirPfenLJAbdHQZ3EA6gG0=.b3cbf91a-5e9a-4063-8c89-06d1e54db001@github.com> On Thu, 18 Apr 2024 03:51:06 GMT, Ioi Lam wrote: > (This PR is an alternative to https://github.com/openjdk/jdk/pull/18669 with a better API for reading lines of text) > > HotSpot has a few cases where information is parsed from a file, or from a memory buffer, one line at a time. Example: > > - https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/cds/classListParser.cpp#L169 > - https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/compiler/compilerOracle.cpp#L1059-L1066 > > Common problems: > - They use a fixed buffer for reading a line, so long (but valid) lines will cause errors. > - There's ad-hoc code that deals with `FILE*` differently than from memory. > > This RFE implements a common utility, `inputStream`, for reading lines from different sources of input (see `FileInput` and `MemoryInput`). We fixed only `ClassListParser` and `CompilerOracle` in this RFE, but we can fix other readers in follow-up RFEs. > > The API allows other source of input to be implemented. For example, one could implement a `SocketInput` if there's a use case for it. > > In the future, `inputStream` can be extended (or encapsulated in a higher-level reader class) to read typed input tokens (for example, integers, strings, etc.) > > Credit: > The `inputStream` class and friends are contributed by @rose00 . See https://mail.openjdk.org/pipermail/hotspot-dev/2024-April/087077.html . > > John's original version is in the draft PR https://github.com/openjdk/jdk/pull/18773. In order to minimize the size of this PR, I have kept only the functionalities for reading a line and a time. Other features, such as pushing back contents into the `inputStream`, could be added in follow-up PRs. 
(These removed features can be found in the commit history of this PR). I have minor comments and considerations but otherwise, LGTM! src/hotspot/share/utilities/istream.hpp line 108: > 106: void* _must_free; // unless null, a malloc pointer which we must free > 107: size_t _line_count; // increasing non-resettable count of lines read > 108: char _small_buffer[SMALL_SIZE]; // buffer for holding lines maybe this should be called line_buffer instead? test/hotspot/gtest/utilities/test_istream.cpp line 290: > 288: } > 289: for (int lelen = 1; lelen <= 2; lelen++) { // try both kinds of newline > 290: //if (ncols > 0) ncols = (ncols == 1) ? (2*patlen)/3 : patlen; Leftover comment? ------------- Marked as reviewed by matsaave (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18833#pullrequestreview-2015691626 PR Review Comment: https://git.openjdk.org/jdk/pull/18833#discussion_r1575342897 PR Review Comment: https://git.openjdk.org/jdk/pull/18833#discussion_r1575369159 From matsaave at openjdk.org Mon Apr 22 21:31:29 2024 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Mon, 22 Apr 2024 21:31:29 GMT Subject: RFR: 8330532: Improve line-oriented text parsing in HotSpot In-Reply-To: References: Message-ID: <6M6ejuQG1HdfbGptBf1hBrBmyvYl8AMI3WYi5eVeiMo=.3c6d3278-3d33-40f5-8586-c0c241e97186@github.com> On Thu, 18 Apr 2024 11:59:51 GMT, Coleen Phillimore wrote: >> (This PR is an alternative to https://github.com/openjdk/jdk/pull/18669 with a better API for reading lines of text) >> >> HotSpot has a few cases where information is parsed from a file, or from a memory buffer, one line at a time. 
Example: >> >> - https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/cds/classListParser.cpp#L169 >> - https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/compiler/compilerOracle.cpp#L1059-L1066 >> >> Common problems: >> - They use a fixed buffer for reading a line, so long (but valid) lines will cause errors. >> - There's ad-hoc code that deals with `FILE*` differently than from memory. >> >> This RFE implements a common utility, `inputStream`, for reading lines from different sources of input (see `FileInput` and `MemoryInput`). We fixed only `ClassListParser` and `CompilerOracle` in this RFE, but we can fix other readers in follow-up RFEs. >> >> The API allows other source of input to be implemented. For example, one could implement a `SocketInput` if there's a use case for it. >> >> In the future, `inputStream` can be extended (or encapsulated in a higher-level reader class) to read typed input tokens (for example, integers, strings, etc.) >> >> Credit: >> The `inputStream` class and friends are contributed by @rose00 . See https://mail.openjdk.org/pipermail/hotspot-dev/2024-April/087077.html . >> >> John's original version is in the draft PR https://github.com/openjdk/jdk/pull/18773. In order to minimize the size of this PR, I have kept only the functionalities for reading a line and a time. Other features, such as pushing back contents into the `inputStream`, could be added in follow-up PRs. (These removed features can be found in the commit history of this PR). > > src/hotspot/share/cds/classListParser.cpp line 436: > >> 434: } >> 435: >> 436: void ClassListParser::check_class_name(const char* class_name) { > > Did we not already have code to check the length of the class name for the class list parser? There's similar code in systemDictionary. 
There is `SystemDictionary::class_name_symbol()` which checks if a class name is valid but it creates a new symbol instead of solely checking the name. Maybe it's worth abstracting the check into its own method and using it here and in SystemDictionary?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18833#discussion_r1575291232

From gziemski at openjdk.org Mon Apr 22 21:32:32 2024
From: gziemski at openjdk.org (Gerard Ziemski)
Date: Mon, 22 Apr 2024 21:32:32 GMT
Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v46]
In-Reply-To:
References:
Message-ID:

On Mon, 22 Apr 2024 16:00:51 GMT, Johan Sjölen wrote:

>> Hi,
>>
>> This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix.
>>
>> ## `MemoryFileTracker`
>>
>> The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is:
>>
>> ```c++
>> static MemoryFile* make_device(const char* descriptive_name);
>> static void free_device(MemoryFile* device);
>>
>> static void allocate_memory(MemoryFile* device, size_t offset, size_t size,
>>                             MEMFLAGS flag, const NativeCallStack& stack);
>> static void free_memory(MemoryFile* device, size_t offset, size_t size);
>> ```
>>
>> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does:
>>
>> ```c++
>> void ZNMT::reserve(zaddress_unsafe start, size_t size) {
>>   MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap);
>> }
>> void ZNMT::commit(zoffset offset, size_t size) {
>>   MemTracker::allocate_memory_in(ZNMT::_device, static_cast(offset), size, mtJavaHeap, CALLER_PC);
>> }
>> void ZNMT::uncommit(zoffset offset, size_t size) {
>>   MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size);
>> }
>>
>> void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) {
>>   // NMT doesn't track mappings at the moment.
>> }
>> void ZNMT::unmap(zaddress_unsafe addr, size_t size) {
>>   // NMT doesn't track mappings at the moment.
>> }
>> ```
>>
>> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory.
>>
>> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are:
>>
>> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it.
Our Treap-based approach in this patch gives a performance bo...
>
> Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision:
>
>  - Remove faulty condition after removing merging
>  - Add failing test case

Taking a look...

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18289#issuecomment-2070985166

From dlong at openjdk.org Mon Apr 22 21:36:32 2024
From: dlong at openjdk.org (Dean Long)
Date: Mon, 22 Apr 2024 21:36:32 GMT
Subject: RFR: 8330253: Skip verify_consistent_lock_order when deoptimizing from monitorenter bytecode. [v6]
In-Reply-To: <2QpelVQltaWXS_Yf-d0Uuu2j2mtiXoLhb8TJRliA3pk=.282244ed-1907-4362-a19d-4cdf6895af49@github.com>
References: <2QpelVQltaWXS_Yf-d0Uuu2j2mtiXoLhb8TJRliA3pk=.282244ed-1907-4362-a19d-4cdf6895af49@github.com>
Message-ID:

On Thu, 18 Apr 2024 19:16:32 GMT, Patricio Chilano Mateo wrote:

>> src/hotspot/share/runtime/deoptimization.cpp line 451:
>>
>>> 449: if (!is_syncronized_entry && bc != Bytecodes::Code::_monitorenter) {
>>> 450: deoptee_thread->lock_stack().verify_consistent_lock_order(lock_order, exec_mode != Deoptimization::Unpack_none);
>>> 451: }
>>
>> The above checks would also hit the follow false positives:
>> 1. deopt in counter overflow in prologue, not in monitorenter
>> 2. monitorenter at bci 0 when raw_bci is -1 (assuming it got past the verifier)
>> but seems mostly harmless to skip checks in those cases.
>
> I thought the original check was fine. Could you elaborate on these 2 cases, I didn't really get them.

Deoptimization is already expensive, and this edge case is rare, so I think it would be better to compute the actual previous bytecode here, and not use bci - 1.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18782#discussion_r1575382083 From kvn at openjdk.org Mon Apr 22 21:37:29 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 22 Apr 2024 21:37:29 GMT Subject: RFR: 8330844: Add aliases for conditional jumps and additional instruction forms for x86 [v2] In-Reply-To: References: <-wAKj3RvMqUO3iphA6bA34ilTcM9LkZACKco20ppkE0=.a5d31aa7-9423-477e-9a90-749018d2a12d@github.com> Message-ID: On Mon, 22 Apr 2024 20:47:52 GMT, Scott Gibbons wrote: >> Adding infrastructure for JDK-8320448. Aliasing conditional jump instructions; making arrays_equals accessible from stubs; adding some x86 instructions. > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Undo move of arrays_equals Can you also remove changes in `arrays_equals` from this PR? It is fine to have them in JDK-8320448 changes. src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4503: > 4501: > 4502: assert((!expand_ary2) || ((expand_ary2) && (UseAVX == 2)), > 4503: "Expansion only implemented for AVX2"); BTW, the check in assert could be simplified: `(!expand_ary2 || UseAVX == 2)` ------------- PR Comment: https://git.openjdk.org/jdk/pull/18893#issuecomment-2070990618 PR Review Comment: https://git.openjdk.org/jdk/pull/18893#discussion_r1575383066 From coleenp at openjdk.org Mon Apr 22 21:38:28 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 22 Apr 2024 21:38:28 GMT Subject: RFR: 8330388: Remove invokedynamic cache index encoding [v2] In-Reply-To: References: Message-ID: On Thu, 18 Apr 2024 16:22:31 GMT, Matias Saavedra Silva wrote: >> Before [JDK-8307190](https://bugs.openjdk.org/browse/JDK-8307190), [JDK-8309673](https://bugs.openjdk.org/browse/JDK-8309673), and [JDK-8301995](https://bugs.openjdk.org/browse/JDK-8301995), invokedynamic operands needed to be rewritten to encoded values to better distinguish indy entries from other cp cache entries. 
The above changes now distinguish between entries with `to_cp_index()` using the bytecode, which is now propagated by the callers. >> >> The encoding flips the bits of the index so the encoded index is always negative, leading to access errors if there is no matching decode call. These calls are removed with some methods adjusted to distinguish between indices with the bytecode. Verified with tier 1-5 tests. The changes show no issues when tested against libgraal. > > Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: > > Dean and Gilles comments Still good! ------------- Marked as reviewed by coleenp (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18819#pullrequestreview-2015844122 From dlong at openjdk.org Mon Apr 22 21:41:31 2024 From: dlong at openjdk.org (Dean Long) Date: Mon, 22 Apr 2024 21:41:31 GMT Subject: RFR: 8330253: Skip verify_consistent_lock_order when deoptimizing from monitorenter bytecode. [v6] In-Reply-To: References: <2QpelVQltaWXS_Yf-d0Uuu2j2mtiXoLhb8TJRliA3pk=.282244ed-1907-4362-a19d-4cdf6895af49@github.com> Message-ID: On Mon, 22 Apr 2024 21:33:49 GMT, Dean Long wrote: >> I thought the original check was fine. Could you elaborate on these 2 cases, I didn't really get them. > > Deoptimization is already expensive, and this edge case is rare, so I think it would be better to compute the actual previous bytecode here, and not use bci - 1. Other debug checks in deoptimization compute oop maps, which have to iterate all the bytecodes, so doing it here also wouldn't be so bad. Or how about not checking bytecodes and instead checking a flag on JavaThread that says we are in monitor enter native code? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18782#discussion_r1575385975 From sgibbons at openjdk.org Mon Apr 22 21:45:28 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Mon, 22 Apr 2024 21:45:28 GMT Subject: RFR: 8330844: Add aliases for conditional jumps and additional instruction forms for x86 [v2] In-Reply-To: References: <-wAKj3RvMqUO3iphA6bA34ilTcM9LkZACKco20ppkE0=.a5d31aa7-9423-477e-9a90-749018d2a12d@github.com> Message-ID: <1aaBuJuA_GcJXid7SX-6ZJzw-KJQS-_yB5xMKcHawYQ=.9c4fa086-94d9-48e0-90b6-6b3401bc4f2b@github.com> On Mon, 22 Apr 2024 20:47:52 GMT, Scott Gibbons wrote: >> Adding infrastructure for JDK-8320448. Aliasing conditional jump instructions; making arrays_equals accessible from stubs; adding some x86 instructions. > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Undo move of arrays_equals A large part of this PR was to lessen the burden of reviewing JDK-8320448 changes. Am I hearing you say that this approach is not desired? The other PR is a big review and I was hoping to piecemeal some non-core algorithm changes in to make the review easier. It is, of course, trivial to revert the change to arrays_equals. Please let me know your final decision. Thanks. 
-------------

PR Comment: https://git.openjdk.org/jdk/pull/18893#issuecomment-2071001776

From jsjolen at openjdk.org Mon Apr 22 21:45:30 2024
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Mon, 22 Apr 2024 21:45:30 GMT
Subject: RFR: 8330532: Improve line-oriented text parsing in HotSpot
In-Reply-To:
References:
Message-ID: <-khtXmDIMykUvQPloHfqXLN3myblFuJ0lhOlwf7zd5I=.8dbf1a69-daf8-4920-8d1c-2854f053bafc@github.com>

On Thu, 18 Apr 2024 03:51:06 GMT, Ioi Lam wrote:

> (This PR is an alternative to https://github.com/openjdk/jdk/pull/18669 with a better API for reading lines of text)
>
> HotSpot has a few cases where information is parsed from a file, or from a memory buffer, one line at a time. Example:
>
> - https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/cds/classListParser.cpp#L169
> - https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/compiler/compilerOracle.cpp#L1059-L1066
>
> Common problems:
> - They use a fixed buffer for reading a line, so long (but valid) lines will cause errors.
> - There's ad-hoc code that deals with `FILE*` differently than from memory.
>
> This RFE implements a common utility, `inputStream`, for reading lines from different sources of input (see `FileInput` and `MemoryInput`). We fixed only `ClassListParser` and `CompilerOracle` in this RFE, but we can fix other readers in follow-up RFEs.
>
> The API allows other source of input to be implemented. For example, one could implement a `SocketInput` if there's a use case for it.
>
> In the future, `inputStream` can be extended (or encapsulated in a higher-level reader class) to read typed input tokens (for example, integers, strings, etc.)
>
> Credit:
> The `inputStream` class and friends are contributed by @rose00 . See https://mail.openjdk.org/pipermail/hotspot-dev/2024-April/087077.html .
>
> John's original version is in the draft PR https://github.com/openjdk/jdk/pull/18773.
> In order to minimize the size of this PR, I have kept only the functionalities for reading a line at a time. Other features, such as pushing back contents into the `inputStream`, could be added in follow-up PRs. (These removed features can be found in the commit history of this PR).

Hi,

The general concept of an `inputStream` is very good and I think this is useful, so I'm very happy that this is being added to HotSpot. There's a design issue that I would like to be fixed, however.

Consider the interface that `inputStream::Input` exposes, specifically the `close` operation:

```c++
// Rewind so that the position appears to be the given one.
// Return the new position, or else (size_t)-1 if the request fails.
virtual size_t set_position(size_t position) { return -1; }
// If it is backed by a resource that needs closing, do so.
virtual void close() { }
```

This function:

```c++
void inputStream::set_input(inputStream::Input* input) {
  clear_buffer();
  if (_input != nullptr && _input != input) {
    _input->close();
  }
  _input = input;
  _input_state = NTR_STATE;
}
```

and finally this:

```c++
// class inputStream
Input* _input;   // where the input comes from or else nullptr
```

First, I'd like to see `set_input` removed: instead of changing the input of an existing `inputStream`, a new one can be created (replacing the old one, if need be). Second, I'd like to see the `close` method removed, as `inputStream` would then no longer have any code calling it. Third, I'd like to see the `_input` field go from being a pointer to being a reference, as there is no point in an `inputStream` not having an input, and that input should not change.

Closing the resource is the responsibility of the owner, and this preferably happens in its destructor. If an explicit close method is required, then the owner will know what that method is, as the input object will be "concrete" (such as `FileInput`).
In `MemoryInput`, `close()`-ing the resource ought to be the same as `free()`-ing it, but here that is ignored and `close()` seems unimplemented?

Consider the following code; I'd be very surprised at this:

```c++
{
  FileInput fi("myfile");
  {
    inputStream(&fi);
    // ... Let's say I figure out that the file has at least 8 bytes
  }
  char bytearr[8];
  fi.set_position(0);
  int read_bytes = fi.read(&bytearr, 8);
  // read_bytes is 0 !??!
}
```

More generally, the class `inputStream::Input` should exist to satisfy the needs of `inputStream` and **not** as a general interface for any and all inputs. Any optional method cannot be depended upon anyway, so two paths will have to be written if any function takes an `inputStream::Input&` as input. I'd like to call upon "YAGNI" and "KISS" here, as I believe we're overcomplicating it without any actual need for the complexity.

src/hotspot/share/utilities/istream.hpp line 150:

> 148:     assert(_buffer_size == 0 || _next <= _buffer_size, "");
> 149:     return true;
> 150:   }

Please add message, even if only "invariant".

src/hotspot/share/utilities/istream.hpp line 207:

> 205:       const_cast<inputStream*>(this)->fill_buffer();
> 206:     }
> 207:   }

Why `const_cast` and mark this method `const` when it clearly is not? This shouldn't be `const`, is my point.

src/hotspot/share/utilities/istream.hpp line 249:

> 247:
> 248:   // Discards any previous input and sets the given input source.
> 249:   void set_input(Input* input);

Why is being able to change the source input important? I'd think you should just create a new `inputStream`. This would then let us make the `_input` field a reference instead.
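
For illustration only, here is a minimal sketch of the reference-based design proposed in the review above. The names `Input`, `MemoryInput`, and `LineStream` are hypothetical, simplified stand-ins and are not the classes in `istream.hpp`; the sketch also uses `std::string` for brevity, which HotSpot code itself would not. It assumes the stream borrows its input by reference, never rebinds it, and leaves any closing or freeing to whoever owns the input.

```c++
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

using std::size_t;

// Hypothetical, simplified input interface: only what the stream needs.
class Input {
 public:
  virtual ~Input() = default;
  // Copy up to `size` bytes into `buf`; return the number copied (0 at EOF).
  virtual size_t read(char* buf, size_t size) = 0;
};

// Reads from caller-owned memory; it has nothing to close or free itself.
class MemoryInput : public Input {
  const char* _base;
  size_t _limit;
  size_t _offset = 0;
 public:
  MemoryInput(const char* base, size_t size) : _base(base), _limit(size) {}
  size_t read(char* buf, size_t size) override {
    size_t n = _limit - _offset;
    if (n > size) n = size;
    std::memcpy(buf, _base + _offset, n);
    _offset += n;
    return n;
  }
};

// The stream holds a reference: the input is never null and never changes.
class LineStream {
  Input& _input;
  std::string _pending;  // bytes read from the input but not yet returned
  bool _eof = false;
 public:
  explicit LineStream(Input& input) : _input(input) {}
  // Store the next line (without '\n') in `line`; false when input is exhausted.
  bool next_line(std::string& line) {
    for (;;) {
      size_t nl = _pending.find('\n');
      if (nl != std::string::npos) {
        line = _pending.substr(0, nl);
        _pending.erase(0, nl + 1);
        return true;
      }
      if (_eof) {
        if (_pending.empty()) return false;
        line = _pending;  // last line had no trailing newline
        _pending.clear();
        return true;
      }
      char buf[16];
      size_t n = _input.read(buf, sizeof(buf));
      if (n == 0) _eof = true;
      else _pending.append(buf, n);
    }
  }
};
```

In this shape the stream cannot observe or alter the lifetime of its input, which is the property the review asks for. The trade-off, raised later in the thread, is that a stream cannot be re-pointed at a different source once constructed.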
------------- PR Review: https://git.openjdk.org/jdk/pull/18833#pullrequestreview-2015784011 PR Review Comment: https://git.openjdk.org/jdk/pull/18833#discussion_r1575345973 PR Review Comment: https://git.openjdk.org/jdk/pull/18833#discussion_r1575353275 PR Review Comment: https://git.openjdk.org/jdk/pull/18833#discussion_r1575350511 From sgibbons at openjdk.org Mon Apr 22 21:45:29 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Mon, 22 Apr 2024 21:45:29 GMT Subject: RFR: 8330844: Add aliases for conditional jumps and additional instruction forms for x86 [v2] In-Reply-To: References: <-wAKj3RvMqUO3iphA6bA34ilTcM9LkZACKco20ppkE0=.a5d31aa7-9423-477e-9a90-749018d2a12d@github.com> Message-ID: <8JGuV7337PZtod10hBOBzADjhnGvHDEka7_L2a_KUio=.a7ddf05b-6a4b-4178-9d3f-33d060872f28@github.com> On Mon, 22 Apr 2024 21:35:05 GMT, Vladimir Kozlov wrote: >> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: >> >> Undo move of arrays_equals > > src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4503: > >> 4501: >> 4502: assert((!expand_ary2) || ((expand_ary2) && (UseAVX == 2)), >> 4503: "Expansion only implemented for AVX2"); > > BTW, the check in assert could be simplified: `(!expand_ary2 || UseAVX == 2)` I thought this would make the intent explicitly clear. 
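
As a small check of the simplification suggested above, the sketch below compares the two forms of the assert condition over all four input combinations. The function names and the `bool` parameter standing in for `UseAVX == 2` are assumptions made for this example only.

```c++
#include <cassert>

// Form used in the patch: (!a) || (a && b).
constexpr bool verbose_form(bool expand_ary2, bool avx2) {
  return (!expand_ary2) || (expand_ary2 && avx2);
}

// Form suggested in the review: !a || b.
constexpr bool short_form(bool expand_ary2, bool avx2) {
  return !expand_ary2 || avx2;
}

// The two forms agree for every combination of inputs.
static_assert(verbose_form(false, false) == short_form(false, false), "");
static_assert(verbose_form(false, true)  == short_form(false, true),  "");
static_assert(verbose_form(true,  false) == short_form(true,  false), "");
static_assert(verbose_form(true,  true)  == short_form(true,  true),  "");
```

The forms are logically equivalent, so the choice between them is purely about which one reads more clearly.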
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18893#discussion_r1575388290 From kvn at openjdk.org Mon Apr 22 21:55:28 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 22 Apr 2024 21:55:28 GMT Subject: RFR: 8330844: Add aliases for conditional jumps and additional instruction forms for x86 [v2] In-Reply-To: <1aaBuJuA_GcJXid7SX-6ZJzw-KJQS-_yB5xMKcHawYQ=.9c4fa086-94d9-48e0-90b6-6b3401bc4f2b@github.com> References: <-wAKj3RvMqUO3iphA6bA34ilTcM9LkZACKco20ppkE0=.a5d31aa7-9423-477e-9a90-749018d2a12d@github.com> <1aaBuJuA_GcJXid7SX-6ZJzw-KJQS-_yB5xMKcHawYQ=.9c4fa086-94d9-48e0-90b6-6b3401bc4f2b@github.com> Message-ID: On Mon, 22 Apr 2024 21:42:27 GMT, Scott Gibbons wrote: > A large part of this PR was to lessen the burden of reviewing JDK-8320448 changes. Am I hearing you say that this approach is not desired? The other PR is a big review and I was hoping to piecemeal some non-core algorithm changes in to make the review easier. > > It is, of course, trivial to revert the change to arrays_equals. Please let me know your final decision. Thanks. I am in favor of splitting big PRs where possible, and there is no limit on how many self-contained sub-PRs you can create. But each PR should address one issue, for ease of review and testing. I think this PR should address what is in its title: aliases for jump instructions, plus the missing cmp/jump instructions (which are related). Changes to unrelated code, like arrays_equals, do not belong here. They could go in a separate sub-PR or even a follow-up PR.
------------- PR Comment: https://git.openjdk.org/jdk/pull/18893#issuecomment-2071014632 From dlong at openjdk.org Mon Apr 22 22:00:29 2024 From: dlong at openjdk.org (Dean Long) Date: Mon, 22 Apr 2024 22:00:29 GMT Subject: RFR: 8330578: The VM creates instance of abstract class VirtualMachineError [v7] In-Reply-To: References: Message-ID: On Mon, 22 Apr 2024 18:10:56 GMT, Coleen Phillimore wrote: >> It's a bug that the VM creates an instance of the abstract class VirtualMachineError. In the cases where we throw VME, we should throw OOM or StackOverflowError instead. >> >> Tested with tier1-4. > > Coleen Phillimore has updated the pull request incrementally with two additional commits since the last revision: > > - restore exceptions.cpp comment > - Restore VirtualMemoryError as InternalError - a nonabstract class src/hotspot/share/classfile/verifier.cpp line 258: > 256: // to infinitely recurse when we try to initialize the exception. > 257: // So bail out here by throwing the preallocated VM error. > 258: THROW_OOP_(Universe::class_init_stack_overflow_error(), false); Should this be InternalError now? That seems better than StackOverflow. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18847#discussion_r1575400037 From jrose at openjdk.org Mon Apr 22 22:06:28 2024 From: jrose at openjdk.org (John R Rose) Date: Mon, 22 Apr 2024 22:06:28 GMT Subject: RFR: 8330532: Improve line-oriented text parsing in HotSpot In-Reply-To: <13Ig3ZYmT_MXs2Ok6K2ecyirPfenLJAbdHQZ3EA6gG0=.b3cbf91a-5e9a-4063-8c89-06d1e54db001@github.com> References: <13Ig3ZYmT_MXs2Ok6K2ecyirPfenLJAbdHQZ3EA6gG0=.b3cbf91a-5e9a-4063-8c89-06d1e54db001@github.com> Message-ID: On Mon, 22 Apr 2024 20:48:58 GMT, Matias Saavedra Silva wrote: >> (This PR is an alternative to https://github.com/openjdk/jdk/pull/18669 with a better API for reading lines of text) >> >> HotSpot has a few cases where information is parsed from a file, or from a memory buffer, one line at a time. 
Example: >> >> - https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/cds/classListParser.cpp#L169 >> - https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/compiler/compilerOracle.cpp#L1059-L1066 >> >> Common problems: >> - They use a fixed buffer for reading a line, so long (but valid) lines will cause errors. >> - There's ad-hoc code that deals with `FILE*` differently than from memory. >> >> This RFE implements a common utility, `inputStream`, for reading lines from different sources of input (see `FileInput` and `MemoryInput`). We fixed only `ClassListParser` and `CompilerOracle` in this RFE, but we can fix other readers in follow-up RFEs. >> >> The API allows other source of input to be implemented. For example, one could implement a `SocketInput` if there's a use case for it. >> >> In the future, `inputStream` can be extended (or encapsulated in a higher-level reader class) to read typed input tokens (for example, integers, strings, etc.) >> >> Credit: >> The `inputStream` class and friends are contributed by @rose00 . See https://mail.openjdk.org/pipermail/hotspot-dev/2024-April/087077.html . >> >> John's original version is in the draft PR https://github.com/openjdk/jdk/pull/18773. In order to minimize the size of this PR, I have kept only the functionalities for reading a line and a time. Other features, such as pushing back contents into the `inputStream`, could be added in follow-up PRs. (These removed features can be found in the commit history of this PR). > > src/hotspot/share/utilities/istream.hpp line 108: > >> 106: void* _must_free; // unless null, a malloc pointer which we must free >> 107: size_t _line_count; // increasing non-resettable count of lines read >> 108: char _small_buffer[SMALL_SIZE]; // buffer for holding lines > > maybe this should be called line_buffer instead? 
No, it's the small buffer that is the initial estimate of the line buffer, which in general must grow by heap allocation. (BTW, the presence of small_buffer is the reason your other suggestion about `set_input` is wrong; will explain there...) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18833#discussion_r1575403639 From jrose at openjdk.org Mon Apr 22 22:06:29 2024 From: jrose at openjdk.org (John R Rose) Date: Mon, 22 Apr 2024 22:06:29 GMT Subject: RFR: 8330532: Improve line-oriented text parsing in HotSpot In-Reply-To: <-khtXmDIMykUvQPloHfqXLN3myblFuJ0lhOlwf7zd5I=.8dbf1a69-daf8-4920-8d1c-2854f053bafc@github.com> References: <-khtXmDIMykUvQPloHfqXLN3myblFuJ0lhOlwf7zd5I=.8dbf1a69-daf8-4920-8d1c-2854f053bafc@github.com> Message-ID: On Mon, 22 Apr 2024 20:52:13 GMT, Johan Sjölen wrote: >> (This PR is an alternative to https://github.com/openjdk/jdk/pull/18669 with a better API for reading lines of text) >> >> HotSpot has a few cases where information is parsed from a file, or from a memory buffer, one line at a time. Example: >> >> - https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/cds/classListParser.cpp#L169 >> - https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/compiler/compilerOracle.cpp#L1059-L1066 >> >> Common problems: >> - They use a fixed buffer for reading a line, so long (but valid) lines will cause errors. >> - There's ad-hoc code that deals with `FILE*` differently than from memory. >> >> This RFE implements a common utility, `inputStream`, for reading lines from different sources of input (see `FileInput` and `MemoryInput`). We fixed only `ClassListParser` and `CompilerOracle` in this RFE, but we can fix other readers in follow-up RFEs. >> >> The API allows other source of input to be implemented. For example, one could implement a `SocketInput` if there's a use case for it.
>> >> In the future, `inputStream` can be extended (or encapsulated in a higher-level reader class) to read typed input tokens (for example, integers, strings, etc.) >> >> Credit: >> The `inputStream` class and friends are contributed by @rose00 . See https://mail.openjdk.org/pipermail/hotspot-dev/2024-April/087077.html . >> >> John's original version is in the draft PR https://github.com/openjdk/jdk/pull/18773. In order to minimize the size of this PR, I have kept only the functionalities for reading a line at a time. Other features, such as pushing back contents into the `inputStream`, could be added in follow-up PRs. (These removed features can be found in the commit history of this PR). > > src/hotspot/share/utilities/istream.hpp line 150: > >> 148: assert(_buffer_size == 0 || _next <= _buffer_size, ""); >> 149: return true; >> 150: } > > Please add message, even if only "invariant". No, that's not necessary. There are many, many empty assert strings in HotSpot. If there's no message, it means "check the code logic here". You don't need to say "invariant" or "sanity" or "must be" as a redundant means of conveying that message. Although, some authors do this. But if there is a long string of asserts, saying "invariant" that many times is simply noise. > src/hotspot/share/utilities/istream.hpp line 207: > >> 205: const_cast<inputStream*>(this)->fill_buffer(); >> 206: } >> 207: } > > Why `const_cast` and assign this method `const` when it clearly is not? This shouldn't be `const`, is my point. The method is const because the logical state of the stream is invariant, as visible to the API user. If the implementation needs an invisible internal state change, it needs a const-cast (or mutable field, sometimes, will work). If `preload` were made non-const as you suggest, we'd need to move the const-casting elsewhere, and it would be less clear that `preload` preserves API-visible state.
So the code, as it is, is the most convenient place to put the const-cast, as an internal implementation decision. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18833#discussion_r1575402794 PR Review Comment: https://git.openjdk.org/jdk/pull/18833#discussion_r1575401801 From sgibbons at openjdk.org Mon Apr 22 22:10:56 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Mon, 22 Apr 2024 22:10:56 GMT Subject: RFR: 8330844: Add aliases for conditional jumps and additional instruction forms for x86 [v3] In-Reply-To: <-wAKj3RvMqUO3iphA6bA34ilTcM9LkZACKco20ppkE0=.a5d31aa7-9423-477e-9a90-749018d2a12d@github.com> References: <-wAKj3RvMqUO3iphA6bA34ilTcM9LkZACKco20ppkE0=.a5d31aa7-9423-477e-9a90-749018d2a12d@github.com> Message-ID: > Adding infrastructure for JDK-8320448. Aliasing conditional jump instructions; making arrays_equals accessible from stubs; adding some x86 instructions. Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Revert changes to arrays_equals ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18893/files - new: https://git.openjdk.org/jdk/pull/18893/files/0b95b3af..f7d7f7de Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18893&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18893&range=01-02 Stats: 90 lines in 2 files changed: 0 ins; 67 del; 23 mod Patch: https://git.openjdk.org/jdk/pull/18893.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18893/head:pull/18893 PR: https://git.openjdk.org/jdk/pull/18893 From sgibbons at openjdk.org Mon Apr 22 22:10:56 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Mon, 22 Apr 2024 22:10:56 GMT Subject: RFR: 8330844: Add aliases for conditional jumps and additional instruction forms for x86 [v2] In-Reply-To: References: <-wAKj3RvMqUO3iphA6bA34ilTcM9LkZACKco20ppkE0=.a5d31aa7-9423-477e-9a90-749018d2a12d@github.com> Message-ID: On Mon, 22 Apr 2024 20:47:52 GMT, Scott Gibbons wrote: >> 
Adding infrastructure for JDK-8320448. Aliasing conditional jump instructions; making arrays_equals accessible from stubs; adding some x86 instructions. > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Undo move of arrays_equals OK. arrays_equals changes reverted. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18893#issuecomment-2071033103 From jrose at openjdk.org Mon Apr 22 22:15:28 2024 From: jrose at openjdk.org (John R Rose) Date: Mon, 22 Apr 2024 22:15:28 GMT Subject: RFR: 8330532: Improve line-oriented text parsing in HotSpot In-Reply-To: <-khtXmDIMykUvQPloHfqXLN3myblFuJ0lhOlwf7zd5I=.8dbf1a69-daf8-4920-8d1c-2854f053bafc@github.com> References: <-khtXmDIMykUvQPloHfqXLN3myblFuJ0lhOlwf7zd5I=.8dbf1a69-daf8-4920-8d1c-2854f053bafc@github.com> Message-ID: On Mon, 22 Apr 2024 20:57:05 GMT, Johan Sjölen wrote: >> (This PR is an alternative to https://github.com/openjdk/jdk/pull/18669 with a better API for reading lines of text) >> >> HotSpot has a few cases where information is parsed from a file, or from a memory buffer, one line at a time. Example: >> >> - https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/cds/classListParser.cpp#L169 >> - https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/compiler/compilerOracle.cpp#L1059-L1066 >> >> Common problems: >> - They use a fixed buffer for reading a line, so long (but valid) lines will cause errors. >> - There's ad-hoc code that deals with `FILE*` differently than from memory. >> >> This RFE implements a common utility, `inputStream`, for reading lines from different sources of input (see `FileInput` and `MemoryInput`). We fixed only `ClassListParser` and `CompilerOracle` in this RFE, but we can fix other readers in follow-up RFEs. >> >> The API allows other source of input to be implemented.
For example, one could implement a `SocketInput` if there's a use case for it. >> >> In the future, `inputStream` can be extended (or encapsulated in a higher-level reader class) to read typed input tokens (for example, integers, strings, etc.) >> >> Credit: >> The `inputStream` class and friends are contributed by @rose00 . See https://mail.openjdk.org/pipermail/hotspot-dev/2024-April/087077.html . >> >> John's original version is in the draft PR https://github.com/openjdk/jdk/pull/18773. In order to minimize the size of this PR, I have kept only the functionalities for reading a line at a time. Other features, such as pushing back contents into the `inputStream`, could be added in follow-up PRs. (These removed features can be found in the commit history of this PR). > > src/hotspot/share/utilities/istream.hpp line 249: > >> 247: >> 248: // Discards any previous input and sets the given input source. >> 249: void set_input(Input* input); > > Why is being able to change the source input important? I'd think you should just create a new `inputStream`. This would then let us make the `_input` field a reference instead. There may be a better way to design the way streams and inputs connect with each other, but removing `set_input` is a step in the wrong direction. The reason is that a stream is designed to be stack-allocated, while an input source is something that can have a non-stack lifetime. It is supposed to be cheap to create an input stream on stack, and part of the cost model comes from the small buffer, which is also stack allocated, and defers the need for heap allocation. This means it could be a performance problem to "make a new one" as suggested above. It's better to allow the stack allocated one to modify its input source, if the input is either (a) multiplexed from different sources, or (b) determined AFTER the input stream is declared. Yes, the normal use is to determine the input source first and then wrap the i-stream around it.
But if this fails, it is not the right answer to push the i-stream onto the heap; i-streams are strongly associated with the blocks of code that exercise them. (In HotSpot, something named "stream" often has such a stack-bound workflow, and there is often something ELSE that is heap-allocated that contains the stream's data source.) So the right answer, when the input sources are "hard to manage", is `set_input`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18833#discussion_r1575408457 From pchilanomate at openjdk.org Mon Apr 22 22:20:49 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Mon, 22 Apr 2024 22:20:49 GMT Subject: RFR: 8330817: jdk/internal/vm/Continuation/OSRTest.java times out on libgraal Message-ID: Small test fix to prevent inlining of foo/fooBigFrame. Tested with Graal repo and verified timeout doesn't happen anymore. Thanks, Patricio ------------- Commit messages: - v1 Changes: https://git.openjdk.org/jdk/pull/18905/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18905&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8330817 Stats: 7 lines in 1 file changed: 0 ins; 0 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/18905.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18905/head:pull/18905 PR: https://git.openjdk.org/jdk/pull/18905 From kvn at openjdk.org Mon Apr 22 22:33:28 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 22 Apr 2024 22:33:28 GMT Subject: RFR: 8330844: Add aliases for conditional jumps and additional instruction forms for x86 [v3] In-Reply-To: References: <-wAKj3RvMqUO3iphA6bA34ilTcM9LkZACKco20ppkE0=.a5d31aa7-9423-477e-9a90-749018d2a12d@github.com> Message-ID: On Mon, 22 Apr 2024 22:10:56 GMT, Scott Gibbons wrote: >> Adding infrastructure for JDK-8320448. Aliasing conditional jump instructions; making arrays_equals accessible from stubs; adding some x86 instructions.
> > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Revert changes to arrays_equals Good. ------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18893#pullrequestreview-2015899888 From sgibbons at openjdk.org Mon Apr 22 22:33:28 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Mon, 22 Apr 2024 22:33:28 GMT Subject: RFR: 8330844: Add aliases for conditional jumps and additional instruction forms for x86 [v3] In-Reply-To: References: <-wAKj3RvMqUO3iphA6bA34ilTcM9LkZACKco20ppkE0=.a5d31aa7-9423-477e-9a90-749018d2a12d@github.com> Message-ID: On Mon, 22 Apr 2024 22:10:56 GMT, Scott Gibbons wrote: >> Adding infrastructure for JDK-8320448. Aliasing conditional jump instructions; making arrays_equals accessible from stubs; adding some x86 instructions. > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Revert changes to arrays_equals Thanks! ------------- PR Comment: https://git.openjdk.org/jdk/pull/18893#issuecomment-2071059232 From coleenp at openjdk.org Mon Apr 22 22:37:28 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 22 Apr 2024 22:37:28 GMT Subject: RFR: 8330578: The VM creates instance of abstract class VirtualMachineError [v7] In-Reply-To: References: Message-ID: On Mon, 22 Apr 2024 21:58:15 GMT, Dean Long wrote: >> Coleen Phillimore has updated the pull request incrementally with two additional commits since the last revision: >> >> - restore exceptions.cpp comment >> - Restore VirtualMemoryError as InternalError - a nonabstract class > > src/hotspot/share/classfile/verifier.cpp line 258: > >> 256: // to infinitely recurse when we try to initialize the exception. >> 257: // So bail out here by throwing the preallocated VM error. >> 258: THROW_OOP_(Universe::class_init_stack_overflow_error(), false); > > Should this be InternalError now? 
That seems better than StackOverflow. Technically it's a stack overflow. I don't think it's a reachable code path so it doesn't really matter. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18847#discussion_r1575423616 From coleenp at openjdk.org Mon Apr 22 22:43:41 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 22 Apr 2024 22:43:41 GMT Subject: RFR: 8330578: The VM creates instance of abstract class VirtualMachineError [v8] In-Reply-To: References: Message-ID: <_g7eC_1K6WWrGUNHxYyTCDKzSy3Wr06FtEMWWF2Dwrw=.93ab37a4-742d-4502-95de-565e6dad6c01@github.com> > It's a bug that the VM creates an instance of the abstract class VirtualMachineError. In the cases where we throw VME, we should throw OOM or StackOverflowError instead. > > Tested with tier1-4. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: It can be InternalError ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18847/files - new: https://git.openjdk.org/jdk/pull/18847/files/a70dc567..0f6c27fc Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18847&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18847&range=06-07 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18847.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18847/head:pull/18847 PR: https://git.openjdk.org/jdk/pull/18847 From sgibbons at openjdk.org Mon Apr 22 22:57:32 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Mon, 22 Apr 2024 22:57:32 GMT Subject: Integrated: 8330821: Rename UnsafeCopyMemory In-Reply-To: References: Message-ID: On Mon, 22 Apr 2024 13:48:41 GMT, Scott Gibbons wrote: > Renaming UnsafeCopyMemory to UnsafeMemoryAccess since this class is now being used for Unsafe::setMemory. This is a pure rename only. This pull request has now been integrated. 
Changeset: 58ad399d Author: Scott Gibbons Committer: Sandhya Viswanathan URL: https://git.openjdk.org/jdk/commit/58ad399d196bf2dd701df451004b7815b0820675 Stats: 159 lines in 18 files changed: 0 ins; 0 del; 159 mod 8330821: Rename UnsafeCopyMemory Reviewed-by: kvn, sviswanathan ------------- PR: https://git.openjdk.org/jdk/pull/18889 From john.r.rose at oracle.com Mon Apr 22 23:33:43 2024 From: john.r.rose at oracle.com (John Rose) Date: Mon, 22 Apr 2024 16:33:43 -0700 Subject: RFR: 8330532: Improve line-oriented text parsing in HotSpot In-Reply-To: References: <13Ig3ZYmT_MXs2Ok6K2ecyirPfenLJAbdHQZ3EA6gG0=.b3cbf91a-5e9a-4063-8c89-06d1e54db001@github.com> Message-ID: <1B834CAE-64B7-4DFF-A37F-BBD8E0C213CA@oracle.com> On 22 Apr 2024, at 15:06, John R Rose wrote: > On Mon, 22 Apr 2024 20:48:58 GMT, Matias Saavedra Silva wrote: > >>> (This PR is an alternative to https://github.com/openjdk/jdk/pull/18669 with a better API for reading lines of text) >>> >>> HotSpot has a few cases where information is parsed from a file, or from a memory buffer, one line at a time. Example: >>> >>> - https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/cds/classListParser.cpp#L169 >>> - https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/compiler/compilerOracle.cpp#L1059-L1066 >>> >>> Common problems: >>> - They use a fixed buffer for reading a line, so long (but valid) lines will cause errors. >>> - There's ad-hoc code that deals with `FILE*` differently than from memory. >>> >>> This RFE implements a common utility, `inputStream`, for reading lines from different sources of input (see `FileInput` and `MemoryInput`). We fixed only `ClassListParser` and `CompilerOracle` in this RFE, but we can fix other readers in follow-up RFEs. >>> >>> The API allows other source of input to be implemented. For example, one could implement a `SocketInput` if there's a use case for it. 
>>> >>> In the future, `inputStream` can be extended (or encapsulated in a higher-level reader class) to read typed input tokens (for example, integers, strings, etc.) >>> >>> Credit: >>> The `inputStream` class and friends are contributed by @rose00 . See https://mail.openjdk.org/pipermail/hotspot-dev/2024-April/087077.html . >>> >>> John's original version is in the draft PR https://github.com/openjdk/jdk/pull/18773. In order to minimize the size of this PR, I have kept only the functionalities for reading a line at a time. Other features, such as pushing back contents into the `inputStream`, could be added in follow-up PRs. (These removed features can be found in the commit history of this PR). >> >> src/hotspot/share/utilities/istream.hpp line 108: >> >>> 106: void* _must_free; // unless null, a malloc pointer which we must free >>> 107: size_t _line_count; // increasing non-resettable count of lines read >>> 108: char _small_buffer[SMALL_SIZE]; // buffer for holding lines >> >> maybe this should be called line_buffer instead? > > No, it's the small buffer that is the initial estimate of the line buffer, which in general must grow by heap allocation. > > (BTW, the presence of small_buffer is the reason your other suggestion about `set_input` is wrong; will explain there...) Correction, that other suggestion wasn't yours! Also, thanks for the review. From john.r.rose at oracle.com Mon Apr 22 23:40:00 2024 From: john.r.rose at oracle.com (John Rose) Date: Mon, 22 Apr 2024 16:40:00 -0700 Subject: RFR: 8330532: Improve line-oriented text parsing in HotSpot In-Reply-To: References: Message-ID: <0D5B973C-9B12-4CA6-8633-480EE7749B11@oracle.com> On 18 Apr 2024, at 5:11, Coleen Phillimore wrote: >> 1: /* >> 2: * Copyright (c) 2023, 2024, Oracle and/or its affiliates. All rights reserved. > > These new files should only say 2024 in the copyright. > Yes, that's right. I did not publicize the proposed changes until this year, and they won't be committed until this year.
FTR they were visible last year, and bore last year's date in their draft form. But they were only visible if you knew which branch to look at, in my private GitHub repo. From john.r.rose at oracle.com Mon Apr 22 23:49:23 2024 From: john.r.rose at oracle.com (John Rose) Date: Mon, 22 Apr 2024 16:49:23 -0700 Subject: RFR: 8330532: Improve line-oriented text parsing in HotSpot In-Reply-To: <7_vqERb2aC7gU-sSV3vmxJEgrYd3KuhncMLga4rP6rE=.ff63cadf-6764-459b-8764-4ceb66611ec4@github.com> References: <7_vqERb2aC7gU-sSV3vmxJEgrYd3KuhncMLga4rP6rE=.ff63cadf-6764-459b-8764-4ceb66611ec4@github.com> Message-ID: <28AEF4DC-427A-43ED-A2A2-FECE7A08585B@oracle.com> On 22 Apr 2024, at 13:50, Johan Sjölen wrote: > On Mon, 22 Apr 2024 18:11:56 GMT, Coleen Phillimore wrote: > >>> (This PR is an alternative to https://github.com/openjdk/jdk/pull/18669 with a better API for reading lines of text) >>> >>> HotSpot has a few cases where information is parsed from a file, or from a memory buffer, one line at a time. Example: >>> >>> - https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/cds/classListParser.cpp#L169 >>> - https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/compiler/compilerOracle.cpp#L1059-L1066 >>> >>> Common problems: >>> - They use a fixed buffer for reading a line, so long (but valid) lines will cause errors. >>> - There's ad-hoc code that deals with `FILE*` differently than from memory. >>> >>> This RFE implements a common utility, `inputStream`, for reading lines from different sources of input (see `FileInput` and `MemoryInput`). We fixed only `ClassListParser` and `CompilerOracle` in this RFE, but we can fix other readers in follow-up RFEs. >>> >>> The API allows other source of input to be implemented. For example, one could implement a `SocketInput` if there's a use case for it.
>>> >>> In the future, `inputStream` can be extended (or encapsulated in a higher-level reader class) to read typed input tokens (for example, integers, strings, etc.) >>> >>> Credit: >>> The `inputStream` class and friends are contributed by @rose00 . See https://mail.openjdk.org/pipermail/hotspot-dev/2024-April/087077.html . >>> >>> John's original version is in the draft PR https://github.com/openjdk/jdk/pull/18773. In order to minimize the size of this PR, I have kept only the functionalities for reading a line at a time. Other features, such as pushing back contents into the `inputStream`, could be added in follow-up PRs. (These removed features can be found in the commit history of this PR). >> >> src/hotspot/share/utilities/istream.hpp line 92: >> >>> 90: // Do we need to read more input (NTR)? Did we see EOF already? >>> 91: // Was there an error getting input or allocating buffer space? >>> 92: enum IState { NTR_STATE, EOF_STATE, ERR_STATE }; >> >> Enum class is preferable. > > This is a nit, but if enum class is used can we then also change to PascalCase for the enum cases? If you grep for "enum class" in our code base, you find a number of UPPER_CASE member names, plus lower_case and _pre_hyphen_lower_case and Capitalized. Not many PascalCase. Am I missing a reason to prefer PascalCase here? The style guide says: > Constant names may be upper-case or mixed-case, according to > historical necessity. (Note: There are many examples of constants > with lowercase names.) I chose UPPER_CASE because the enum members function like global constants, and this is low-level code. I guess my precedent was JavaThreadStatus, which has members named like that.
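
To make the `enum class` suggestion concrete, the following is a hedged sketch of what a scoped version of the enum might look like if the UPPER_CASE enumerator names were kept. The helper `istate_name` and its label strings are invented for this example and are not part of the PR.

```c++
#include <cassert>
#include <cstring>
#include <type_traits>

// Scoped variant of the enum under discussion, keeping the UPPER_CASE names.
enum class IState { NTR_STATE, EOF_STATE, ERR_STATE };

// Unlike a plain enum, a scoped enum does not convert implicitly to int.
static_assert(!std::is_convertible<IState, int>::value,
              "scoped enumerators need an explicit cast");

// Hypothetical helper: a short label for each state, e.g. for logging.
inline const char* istate_name(IState s) {
  switch (s) {
    case IState::NTR_STATE: return "need-to-read";
    case IState::EOF_STATE: return "eof";
    case IState::ERR_STATE: return "error";
  }
  return "unknown";
}
```

With a scoped enum every use site must write the `IState::` qualifier, so the enumerator spelling matters less for avoiding name clashes and becomes mostly a readability choice.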
From iklam at openjdk.org Tue Apr 23 00:17:28 2024 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 23 Apr 2024 00:17:28 GMT Subject: RFR: 8330532: Improve line-oriented text parsing in HotSpot In-Reply-To: <6M6ejuQG1HdfbGptBf1hBrBmyvYl8AMI3WYi5eVeiMo=.3c6d3278-3d33-40f5-8586-c0c241e97186@github.com> References: <6M6ejuQG1HdfbGptBf1hBrBmyvYl8AMI3WYi5eVeiMo=.3c6d3278-3d33-40f5-8586-c0c241e97186@github.com> Message-ID: On Mon, 22 Apr 2024 19:55:39 GMT, Matias Saavedra Silva wrote: >> src/hotspot/share/cds/classListParser.cpp line 436: >> >>> 434: } >>> 435: >>> 436: void ClassListParser::check_class_name(const char* class_name) { >> >> Did we not already have code to check the length of the class name for the class list parser? There's similar code in systemDictionary. > > There is `SystemDictionary::class_name_symbol()` which checks if a class name is valid but it creates a new symbol instead of solely checking the name. Maybe it's worth abstracting the check into its own method and using it here and in SystemDictionary? `SystemDictionary::class_name_symbol()` assumes that the input is UTF8, but I cannot make that assumption here, as the `class_name` comes from a text file that can have arbitrary content. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18833#discussion_r1575489524 From iklam at openjdk.org Tue Apr 23 00:28:31 2024 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 23 Apr 2024 00:28:31 GMT Subject: RFR: 8330532: Improve line-oriented text parsing in HotSpot In-Reply-To: <7_vqERb2aC7gU-sSV3vmxJEgrYd3KuhncMLga4rP6rE=.ff63cadf-6764-459b-8764-4ceb66611ec4@github.com> References: <7_vqERb2aC7gU-sSV3vmxJEgrYd3KuhncMLga4rP6rE=.ff63cadf-6764-459b-8764-4ceb66611ec4@github.com> Message-ID: On Mon, 22 Apr 2024 20:47:31 GMT, Johan Sj?len wrote: >> src/hotspot/share/utilities/istream.hpp line 92: >> >>> 90: // Do we need to read more input (NTR)? Did we see EOF already? >>> 91: // Was there an error getting input or allocating buffer space? 
>>> 92: enum IState { NTR_STATE, EOF_STATE, ERR_STATE }; >> >> Enum class is preferable. > > This is a nit, but if enum class is used can we then also change to PascalCase for the enum cases? Is there an adopted style for the enumerators? I couldn't find it in the hotspot style guide. There's quite a variation today. E.g.: enum class vmSymbolID : int { NO_SID = 0, ....}; enum class DefaultsLookupMode { find, skip }; ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18833#discussion_r1575494922 From iklam at openjdk.org Tue Apr 23 00:34:34 2024 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 23 Apr 2024 00:34:34 GMT Subject: RFR: 8330532: Improve line-oriented text parsing in HotSpot In-Reply-To: References: <13Ig3ZYmT_MXs2Ok6K2ecyirPfenLJAbdHQZ3EA6gG0=.b3cbf91a-5e9a-4063-8c89-06d1e54db001@github.com> Message-ID: On Mon, 22 Apr 2024 22:03:42 GMT, John R Rose wrote: >> src/hotspot/share/utilities/istream.hpp line 108: >> >>> 106: void* _must_free; // unless null, a malloc pointer which we must free >>> 107: size_t _line_count; // increasing non-resettable count of lines read >>> 108: char _small_buffer[SMALL_SIZE]; // buffer for holding lines >> >> maybe this should be called line_buffer instead? > > No, it's the small buffer that is the initial estimate of the line buffer, which in general must grow by heap allocation. > > (BTW, the presence of small_buffer is the reason your other suggestion about `set_input` is wrong; will explain there...) I added comments: char _small_buffer[SMALL_SIZE]; // stack-allocated buffer for holding lines; // will switch to C_HEAP allocation when necessary.
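The small-buffer-then-heap idea described in these comments can be sketched roughly as follows. This is a standalone illustration under stated assumptions, not the code in istream.hpp: the class name `LineBuffer`, its methods, and the tiny `SMALL_SIZE` are made up for the example, and plain `malloc`/`free` stand in for HotSpot's C_HEAP allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical line buffer: starts in an inline array and moves to the
// C heap only when a line outgrows it.
class LineBuffer {
  static constexpr std::size_t SMALL_SIZE = 16;  // deliberately tiny for the demo
  char        _small_buffer[SMALL_SIZE];         // inline storage, no allocation
  char*       _buffer;                           // _small_buffer or heap memory
  std::size_t _capacity;                         // bytes available at _buffer
  std::size_t _length;                           // bytes in use, excluding the NUL
  void*       _must_free;                        // unless null, malloc pointer to free

 public:
  LineBuffer()
    : _buffer(_small_buffer), _capacity(SMALL_SIZE), _length(0), _must_free(nullptr) {
    _small_buffer[0] = '\0';
  }
  ~LineBuffer() { std::free(_must_free); }       // free(nullptr) is a no-op
  LineBuffer(const LineBuffer&) = delete;
  LineBuffer& operator=(const LineBuffer&) = delete;

  // Append len bytes; returns false if a needed heap allocation fails.
  bool append(const char* data, std::size_t len) {
    std::size_t needed = _length + len + 1;      // +1 for the trailing NUL
    if (needed > _capacity) {
      std::size_t new_cap = _capacity * 2;       // grow geometrically
      if (new_cap < needed) new_cap = needed;
      char* p = static_cast<char*>(std::malloc(new_cap));
      if (p == nullptr) return false;
      std::memcpy(p, _buffer, _length);          // carry over existing contents
      std::free(_must_free);                     // release any previous heap block
      _must_free = p;
      _buffer = p;
      _capacity = new_cap;
    }
    std::memcpy(_buffer + _length, data, len);
    _length += len;
    _buffer[_length] = '\0';
    return true;
  }

  const char* c_str() const { return _buffer; }
  std::size_t length() const { return _length; }
  bool on_heap() const { return _must_free != nullptr; }
};
```

Short lines never allocate, which is the common case for the class list and compiler directive files mentioned in the PR description; only unusually long lines pay for a heap block.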
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18833#discussion_r1575497542 From iklam at openjdk.org Tue Apr 23 00:58:03 2024 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 23 Apr 2024 00:58:03 GMT Subject: RFR: 8330532: Improve line-oriented text parsing in HotSpot [v2] In-Reply-To: References: Message-ID: > (This PR is an alternative to https://github.com/openjdk/jdk/pull/18669 with a better API for reading lines of text) > > HotSpot has a few cases where information is parsed from a file, or from a memory buffer, one line at a time. Example: > > - https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/cds/classListParser.cpp#L169 > - https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/compiler/compilerOracle.cpp#L1059-L1066 > > Common problems: > - They use a fixed buffer for reading a line, so long (but valid) lines will cause errors. > - There's ad-hoc code that deals with `FILE*` differently than from memory. > > This RFE implements a common utility, `inputStream`, for reading lines from different sources of input (see `FileInput` and `MemoryInput`). We fixed only `ClassListParser` and `CompilerOracle` in this RFE, but we can fix other readers in follow-up RFEs. > > The API allows other source of input to be implemented. For example, one could implement a `SocketInput` if there's a use case for it. > > In the future, `inputStream` can be extended (or encapsulated in a higher-level reader class) to read typed input tokens (for example, integers, strings, etc.) > > Credit: > The `inputStream` class and friends are contributed by @rose00 . See https://mail.openjdk.org/pipermail/hotspot-dev/2024-April/087077.html . > > John's original version is in the draft PR https://github.com/openjdk/jdk/pull/18773. In order to minimize the size of this PR, I have kept only the functionalities for reading a line and a time. 
Other features, such as pushing back contents into the `inputStream`, could be added in follow-up PRs. (These removed features can be found in the commit history of this PR). Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: Comments fro @coleenp and @matias9927 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18833/files - new: https://git.openjdk.org/jdk/pull/18833/files/b7e856a9..85402124 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18833&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18833&range=00-01 Stats: 26 lines in 3 files changed: 3 ins; 2 del; 21 mod Patch: https://git.openjdk.org/jdk/pull/18833.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18833/head:pull/18833 PR: https://git.openjdk.org/jdk/pull/18833 From iklam at openjdk.org Tue Apr 23 00:58:04 2024 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 23 Apr 2024 00:58:04 GMT Subject: RFR: 8330532: Improve line-oriented text parsing in HotSpot [v2] In-Reply-To: References: <-khtXmDIMykUvQPloHfqXLN3myblFuJ0lhOlwf7zd5I=.8dbf1a69-daf8-4920-8d1c-2854f053bafc@github.com> Message-ID: <_H29a3k5H0j1CVZdy6kEc_GlfhVL_aCSrf9x8vxwzL8=.9ae0ed1f-4db3-4e7c-a95a-e7622be81a4c@github.com> On Mon, 22 Apr 2024 22:00:58 GMT, John R Rose wrote: >> src/hotspot/share/utilities/istream.hpp line 207: >> >>> 205: const_cast(this)->fill_buffer(); >>> 206: } >>> 207: } >> >> Why `const_cast` and assign this method `const` when it clearly is not? This shouldn't be `const`, is my point. > > The method is const because the logical state of the stream is invariant, as visible to the API user. If the implementation needs an invisible internal state change, it needs a const-cast (or mutable field, sometimes, will work). If `preload` were made non-const as you suggest, we?d need to move the const-casting elsewhere, and it would be less clear that `preload` preserves API-visible state. 
So the code, as it is, is the most convenient place to put the const-cast, as an internal implementation decision. This page suggests using the `mutable` keyword for this situation (but `const_cast<>` is acceptable as a last resort) https://isocpp.org/wiki/faq/const-correctness#mutable-data-members However, to use `mutable` in this case, we would probably need to declare every field as `mutable`, this means that the meaning of `const` will be very difficult to understand (at least we can't use the C++ compiler to tell us which function can (should) be `const` and which function cannot be `const`.) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18833#discussion_r1575509200 From iklam at openjdk.org Tue Apr 23 00:58:04 2024 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 23 Apr 2024 00:58:04 GMT Subject: RFR: 8330532: Improve line-oriented text parsing in HotSpot [v2] In-Reply-To: <_H29a3k5H0j1CVZdy6kEc_GlfhVL_aCSrf9x8vxwzL8=.9ae0ed1f-4db3-4e7c-a95a-e7622be81a4c@github.com> References: <-khtXmDIMykUvQPloHfqXLN3myblFuJ0lhOlwf7zd5I=.8dbf1a69-daf8-4920-8d1c-2854f053bafc@github.com> <_H29a3k5H0j1CVZdy6kEc_GlfhVL_aCSrf9x8vxwzL8=.9ae0ed1f-4db3-4e7c-a95a-e7622be81a4c@github.com> Message-ID: On Tue, 23 Apr 2024 00:50:00 GMT, Ioi Lam wrote: >> The method is const because the logical state of the stream is invariant, as visible to the API user. If the implementation needs an invisible internal state change, it needs a const-cast (or mutable field, sometimes, will work). If `preload` were made non-const as you suggest, we'd need to move the const-casting elsewhere, and it would be less clear that `preload` preserves API-visible state. So the code, as it is, is the most convenient place to put the const-cast, as an internal implementation decision.
> > This page suggests using the `mutable` keyword for this situation (but `const_cast<>` is acceptable as a last resort) > https://isocpp.org/wiki/faq/const-correctness#mutable-data-members > > However, to use `mutable` in this case, we would probably need to declare every field as `mutable`, this means that the meaning of `const` will be very difficult to understand (at least we can't use the C++ compiler to tell us which function can't (should) be `const` and which function cannot be `const`.) On a separate note, `preload()` can cause data to be read from the input. For non-seekable input (such as sockets), this doesn't seem like a `const` operation to me. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18833#discussion_r1575510055 From iklam at openjdk.org Tue Apr 23 01:08:05 2024 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 23 Apr 2024 01:08:05 GMT Subject: RFR: 8330532: Improve line-oriented text parsing in HotSpot [v3] In-Reply-To: References: Message-ID: > (This PR is an alternative to https://github.com/openjdk/jdk/pull/18669 with a better API for reading lines of text) > > HotSpot has a few cases where information is parsed from a file, or from a memory buffer, one line at a time. Example: > > - https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/cds/classListParser.cpp#L169 > - https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/compiler/compilerOracle.cpp#L1059-L1066 > > Common problems: > - They use a fixed buffer for reading a line, so long (but valid) lines will cause errors. > - There's ad-hoc code that deals with `FILE*` differently than from memory. > > This RFE implements a common utility, `inputStream`, for reading lines from different sources of input (see `FileInput` and `MemoryInput`). We fixed only `ClassListParser` and `CompilerOracle` in this RFE, but we can fix other readers in follow-up RFEs. 
> > The API allows other source of input to be implemented. For example, one could implement a `SocketInput` if there's a use case for it. > > In the future, `inputStream` can be extended (or encapsulated in a higher-level reader class) to read typed input tokens (for example, integers, strings, etc.) > > Credit: > The `inputStream` class and friends are contributed by @rose00 . See https://mail.openjdk.org/pipermail/hotspot-dev/2024-April/087077.html . > > John's original version is in the draft PR https://github.com/openjdk/jdk/pull/18773. In order to minimize the size of this PR, I have kept only the functionalities for reading a line and a time. Other features, such as pushing back contents into the `inputStream`, could be added in follow-up PRs. (These removed features can be found in the commit history of this PR). Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains eight additional commits since the last revision: - Merge branch 'master' of https://github.com/openjdk/jdk into 8330532-improve-line-oriented-text-parsing-in-hotspot - Comments fro @coleenp and @matias9927 - removed more unused code from istream.hpp - Merged ClassFileParser changes from https://github.com/openjdk/jdk/pull/18669 - Removed gtest cases for features removed in the previous commit - Reverted xmlstream.cpp/hpp and removed unused functions from inputStream - fixed builds - Imported @jrose00 changes https://github.com/openjdk/jdk/pull/18773 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18833/files - new: https://git.openjdk.org/jdk/pull/18833/files/85402124..9c10ae56 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18833&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18833&range=01-02 Stats: 50211 lines in 565 files changed: 25164 ins; 22767 del; 2280 mod Patch: https://git.openjdk.org/jdk/pull/18833.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18833/head:pull/18833 PR: https://git.openjdk.org/jdk/pull/18833 From stuefe at openjdk.org Tue Apr 23 05:22:33 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 23 Apr 2024 05:22:33 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v29] In-Reply-To: References: Message-ID: On Tue, 9 Apr 2024 13:41:35 GMT, Thomas Stuefe wrote: >> Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: >> >> Style and copyright fix > >> Right, the refactoring to remove the `friend` declaration has completely fumbled the code. I'll probably force a revert on this to the state before that or do a git bisect to find the bugs. Right now the code is basically borked. >> >> Last good hash: [7445999](https://github.com/openjdk/jdk/commit/7445999ee296872320f91146e1004026ba1133c7) > > God, sorry. Do as you think is best. 
> > I plan to look at this PR, but probably it will not be this week. > > Love your commit messages btw. > Hi @tstuefe, > > Cleaned up Treap significantly, it looks way better now! Thanks for the ideas. Good. Will try to take a look on Thursday or Friday. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18289#issuecomment-2071428357 From aboldtch at openjdk.org Tue Apr 23 05:57:34 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Tue, 23 Apr 2024 05:57:34 GMT Subject: RFR: 8330253: Skip verify_consistent_lock_order when deoptimizing from monitorenter bytecode. [v6] In-Reply-To: References: <2QpelVQltaWXS_Yf-d0Uuu2j2mtiXoLhb8TJRliA3pk=.282244ed-1907-4362-a19d-4cdf6895af49@github.com> Message-ID: On Mon, 22 Apr 2024 21:39:09 GMT, Dean Long wrote: >> Deoptimization is already expensive, and this edge case is rare, so I think it would be better to compute the actual previous bytecode here, and not use bci - 1. > > Other debug checks in deoptimization compute oop maps, which have to iterate all the bytecodes, so doing it here also wouldn't be so bad. Or how about not checking bytecodes and instead checking a flag on JavaThread that says we are in monitor enter native code? > Deoptimization is already expensive, and this edge case is rare, so I think it would be better to compute the actual previous bytecode here, and not use bci - 1. Alright I will give that a go then, unless we think the second option is more appropriate. > Or how about not checking bytecodes and instead checking a flag on JavaThread that says we are in monitor enter native code? What state would that be? Any current state we setup is after transitioning to VM which seems to late. I guess it would be possible to instrument `SharedRuntime::monitor_enter_helper` with some thread local state we can check. But the interpreter has its own entry point. 
But I guess we would never reach this point for the interpreter, as there is no such thing as deoptimizing or unpacking interpreted frames. What is the feeling here? What would be more appropriate? Adding some thread local debug state that we set before transitioning to VM in `SharedRuntime::monitor_enter_helper` seems like the most precise solution, but we need to be sure that there are no earlier safepoint polls before this point within the execution of the monitorenter bytecode. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18782#discussion_r1575676255 From aboldtch at openjdk.org Tue Apr 23 05:57:34 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Tue, 23 Apr 2024 05:57:34 GMT Subject: RFR: 8330253: Skip verify_consistent_lock_order when deoptimizing from monitorenter bytecode. [v6] In-Reply-To: References: <2QpelVQltaWXS_Yf-d0Uuu2j2mtiXoLhb8TJRliA3pk=.282244ed-1907-4362-a19d-4cdf6895af49@github.com> Message-ID: On Tue, 23 Apr 2024 05:52:22 GMT, Axel Boldt-Christmas wrote: >> Other debug checks in deoptimization compute oop maps, which have to iterate all the bytecodes, so doing it here also wouldn't be so bad. Or how about not checking bytecodes and instead checking a flag on JavaThread that says we are in monitor enter native code? > >> Deoptimization is already expensive, and this edge case is rare, so I think it would be better to compute the actual previous bytecode here, and not use bci - 1. > > Alright I will give that a go then, unless we think the second option is more appropriate. > >> Or how about not checking bytecodes and instead checking a flag on JavaThread that says we are in monitor enter native code? > > What state would that be? Any current state we setup is after transitioning to VM which seems to late. I guess it would be possible to instrument `SharedRuntime::monitor_enter_helper` with some thread local state we can check. But the interpreter has its own entry point. 
But I guess we would never reach this point for the interpreter, as there is no such thing as deoptimizing or unpacking interpreted frames. > > > What is the feeling here? What would be more appropriate? Adding some thread local debug state that we set before transitioning to VM in `SharedRuntime::monitor_enter_helper` seems like the most precise solution, but we need to be sure that there are no earlier safepoint polls before this point within the execution of the monitorenter bytecode. I see there is `_pending_monitorenter` but this would only handle synchronized method entry. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18782#discussion_r1575678279 From jrose at openjdk.org Tue Apr 23 06:11:27 2024 From: jrose at openjdk.org (John R Rose) Date: Tue, 23 Apr 2024 06:11:27 GMT Subject: RFR: 8330532: Improve line-oriented text parsing in HotSpot [v3] In-Reply-To: References: <-khtXmDIMykUvQPloHfqXLN3myblFuJ0lhOlwf7zd5I=.8dbf1a69-daf8-4920-8d1c-2854f053bafc@github.com> <_H29a3k5H0j1CVZdy6kEc_GlfhVL_aCSrf9x8vxwzL8=.9ae0ed1f-4db3-4e7c-a95a-e7622be81a4c@github.com> Message-ID: On Tue, 23 Apr 2024 00:51:54 GMT, Ioi Lam wrote: >> This page suggests using the `mutable` keyword for this situation (but `const_cast<>` is acceptable as a last resort) >> https://isocpp.org/wiki/faq/const-correctness#mutable-data-members >> >> However, to use `mutable` in this case, we would probably need to declare every field as `mutable`, this means that the meaning of `const` will be very difficult to understand (at least we can't use the C++ compiler to tell us which function can't (should) be `const` and which function cannot be `const`.) > > On a separate note, `preload()` can cause data to be read from the input. For non-seekable input (such as sockets), this doesn't seem like a `const` operation to me. I could be wrong about this, but it seems like it's completely up to the C++ class to define what aspects of its implementation are fixed for its `const` methods.
The `const` keyword does not mean "everything is immutable about this, at all levels and inside all boundaries". But then it's up to the class author to choose which boundary `const` applies to. The thing wrapped inside the i-stream is, I will claim, implementation which can change state even in `const` functions. If not, the alternative is to have almost no `const` functions at all in the i-stream class. That's less useful, because `const` means something very useful, in the context of the i-stream class. It means that the line will not shift. You can keep on reading, and even writing, the line buffer, as long as you call only `const` functions. The reuse of the line buffer, in this way, is a core part of the i-stream design, and part of its value proposition: You don't get a new allocation on every read-line op (as you would with Java). But in order to draw bright lines around THAT very useful notion of invariance (during which line buffers are reusable), you NEED `preload` to be const, even if it does internal bookkeeping, even if it consults the input source at times. In short, a rigid idea of `const` is inconsistent with the carefully balanced performance characteristics of this design. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18833#discussion_r1575689215 From dholmes at openjdk.org Tue Apr 23 06:34:35 2024 From: dholmes at openjdk.org (David Holmes) Date: Tue, 23 Apr 2024 06:34:35 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v13] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: On Fri, 19 Apr 2024 09:49:33 GMT, Afshin Zafari wrote: >> `MEMFLAGS flag` is used to hold/show the type of the memory regions in NMT. Each call of NMT API requires a search through the list of memory regions.
>> The Hotspot code reserves/commits/uncommits memory regions and later explicitly calls the NMT API with a specific memory type (e.g., `mtGC`, `mtJavaHeap`) for that region. Therefore, there are two searches in the list of regions per reserve/commit/uncommit operation, one for the operation and another for setting the type of the region. >> When the memory type is passed in during reserve/commit/uncommit operations, NMT can use it and avoid the extra search for setting the memory type. >> >> Tests: tiers1-5 passed on linux-x64, macosx-aarch64 and windows-x64 for debug and non-debug builds. > > Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: > > removed extra blank line. This is a big change, but the pattern of the changes is quite easy to follow. I do have a couple of queries below. Thanks src/hotspot/share/cds/metaspaceShared.cpp line 1332: > 1330: // NMT: fix up the space tags > 1331: MemTracker::record_virtual_memory_type(archive_space_rs.base(), mtClassShared); > 1332: MemTracker::record_virtual_memory_type(class_space_rs.base(), mtClass); I assumed these (and others) were removed because the `MemTracker` updates had been pushed down into `ReservedSpace` itself, but I can't find them there - what am I missing? src/hotspot/share/gc/parallel/mutableSpace.cpp line 63: > 61: if (clear_space) { > 62: // Prefer page reallocation to migration. > 63: os::free_memory((char*)start, size, page_size, mtGC); Not at all obvious where the corresponding allocation sets the type as mtGC. ??
------------- PR Review: https://git.openjdk.org/jdk/pull/18745#pullrequestreview-2016320972 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1575693287 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1575697729 From dholmes at openjdk.org Tue Apr 23 06:34:37 2024 From: dholmes at openjdk.org (David Holmes) Date: Tue, 23 Apr 2024 06:34:37 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v7] In-Reply-To: <4p0uq_t37Fkj9fxqD1QC8TOkgAyyW1PVmTknURCquG4=.22b762b8-dea4-4fe3-a19f-d6a3f26c9f27@github.com> References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> <7TW9a7Vmnz0nIKq83rYx_VN13PXM9_9nD5iSMzGDfNw=.127fd0ff-ee60-40cf-9994-9a1e81bb5b27@github.com> <4p0uq_t37Fkj9fxqD1QC8TOkgAyyW1PVmTknURCquG4=.22b762b8-dea4-4fe3-a19f-d6a3f26c9f27@github.com> Message-ID: On Wed, 17 Apr 2024 12:38:23 GMT, Stefan Karlsson wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> alignment in coding style changed. > > src/hotspot/share/nmt/virtualMemoryTracker.cpp line 460: > >> 458: assert(_reserved_regions != nullptr, "Sanity check"); >> 459: >> 460: ReservedMemoryRegion rgn(addr, size, flag); > > I'm not sure about this. `rgn` is just used to find the memory region we want to uncommit. The flag isn't used in the search, and passing it forces the callers to also pass in the flag. > > I understand that this happens after the request to remove the mtNone default value. Is there a way that allows us to skip using mtNone, but still don't have to unnecessarily provide a flag? > > Maybe we could create a helper function `ReservedMemoryRegion rgn = ReservedMemoryRegion::create_find_key(addr, size)`, which sets up a ReserveMemoryRegion with mtNone? Was this comment from Stefan addressed? 
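A rough sketch of the helper Stefan suggests, for readers following the thread. Everything here is a simplified stand-in and an assumption: `MemFlag`, `Region`, and `same_range` are invented for the example and do not mirror the real `ReservedMemoryRegion` or its comparison logic. Only the name `create_find_key` comes from the quoted suggestion.

```cpp
#include <cassert>
#include <cstddef>

// Simplified stand-ins; none of these definitions are the real HotSpot ones.
enum MemFlag { mtNone, mtGC, mtJavaHeap };

class Region {
  const char* _base;
  std::size_t _size;
  MemFlag     _flag;

 public:
  Region(const char* base, std::size_t size, MemFlag flag)
    : _base(base), _size(size), _flag(flag) {
    assert(size > 0 && "a region must cover at least one byte");
  }

  // Build a key used only for lookups. The flag is irrelevant to the search,
  // so callers do not have to supply one; mtNone is filled in here.
  static Region create_find_key(const char* base, std::size_t size) {
    return Region(base, size, mtNone);
  }

  // Assumed lookup rule for this sketch: only the address range is compared.
  bool same_range(const Region& other) const {
    return _base == other._base && _size == other._size;
  }

  MemFlag flag() const { return _flag; }
};
```

The point of the factory is that `mtNone` appears in exactly one place, so the public constructor can keep its mandatory flag argument while uncommit-style callers stay free of a flag they do not need.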
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1575707097 From stuefe at openjdk.org Tue Apr 23 06:49:35 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 23 Apr 2024 06:49:35 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v13] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: On Tue, 23 Apr 2024 06:18:14 GMT, David Holmes wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> removed extra blank line. > > src/hotspot/share/gc/parallel/mutableSpace.cpp line 63: > >> 61: if (clear_space) { >> 62: // Prefer page reallocation to migration. >> 63: os::free_memory((char*)start, size, page_size, mtGC); > > Not at all obvious where the corresponding allocation sets the type as mtGC. ?? We don't, and I am not sure this is right. AFAICS this API is used on java heap, in ParallelGC. So, that should be mtJavaHeap, I think. Note that I would like and plan to simplify this API, if possible remove both the page size and the NMT flags. See https://bugs.openjdk.org/browse/JDK-8330144. 
(the tricky part is to make sure the proposed Linux alternative works with large pages and on old enough kernels) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1575726805 From shade at openjdk.org Tue Apr 23 07:20:51 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 23 Apr 2024 07:20:51 GMT Subject: RFR: 8323497: On x64, use 32-bit immediate moves for narrow klass base if possible [v2] In-Reply-To: <_68ECcw_OokuD26uZFKcvye35P8Y8uMqyS8GNQ8iRNs=.cd727de1-f0ed-47ce-8b2e-0957dfa8f390@github.com> References: <_68ECcw_OokuD26uZFKcvye35P8Y8uMqyS8GNQ8iRNs=.cd727de1-f0ed-47ce-8b2e-0957dfa8f390@github.com> Message-ID: On Tue, 20 Feb 2024 06:28:20 GMT, Thomas Stuefe wrote: >> Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: >> >> - Merge branch 'openjdk:master' into use-32bit-immediate-moves-on-x64-for-klass-encoding-base >> - remove obsolete comment >> - use-32bit-immediate-moves-on-x64-for-klass-encoding-base > > @shipilev , @merykitty ? Could you please review? @tstuefe, do you want to restart this? Code density is important :) ------------- PR Comment: https://git.openjdk.org/jdk/pull/17340#issuecomment-2071598178 From stuefe at openjdk.org Tue Apr 23 07:21:49 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 23 Apr 2024 07:21:49 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v13] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: On Mon, 22 Apr 2024 12:51:33 GMT, Johan Sjölen wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> removed extra blank line.
> > src/hotspot/share/memory/virtualspace.cpp line 45: > >> 43: // Dummy constructor >> 44: ReservedSpace::ReservedSpace() : _base(nullptr), _size(0), _noaccess_prefix(0), >> 45: _alignment(0), _fd_for_heap(-1), _special(false), _executable(false), _nmt_flag(mtNone) { > > Isn't just `_flag` or `_memflag` sufficient as a name for `ReservedSpace`? We don't use `nmt_flag` anywhere else in the codebase. Yes, I would keep consistency with existing code, and maybe later rename all in one followup change. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1575764827 From stuefe at openjdk.org Tue Apr 23 07:21:48 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 23 Apr 2024 07:21:48 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v13] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: On Fri, 19 Apr 2024 09:49:33 GMT, Afshin Zafari wrote: >> `MEMFLAGS flag` is used to hold/show the type of the memory regions in NMT. Each call of NMT API requires a search through the list of memory regions. >> The Hotspot code reserves/commits/uncommits memory regions and later explicitly calls the NMT API with a specific memory type (e.g., `mtGC`, `mtJavaHeap`) for that region. Therefore, there are two searches in the list of regions per reserve/commit/uncommit operation, one for the operation and another for setting the type of the region. >> When the memory type is passed in during reserve/commit/uncommit operations, NMT can use it and avoid the extra search for setting the memory type. >> >> Tests: tiers1-5 passed on linux-x64, macosx-aarch64 and windows-x64 for debug and non-debug builds. > > Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: > > removed extra blank line.
src/hotspot/share/memory/metaspace/testHelpers.cpp line 81: > 79: if (reserve_limit > 0) { > 80: // have reserve limit -> non-expandable context > 81: _rs = ReservedSpace(reserve_limit * BytesPerWord, Metaspace::reserve_alignment(), os::vm_page_size(), mtMetaspace); I would make this mtTest. This should not increase the metaspace counters in NMT src/hotspot/share/memory/metaspace/virtualSpaceNode.cpp line 112: > 110: > 111: // Commit... > 112: if (os::commit_memory((char*)p, word_size * BytesPerWord, !ExecMem, mtMetaspace) == false) { Not sure if I suggested something different in my first review, but thinking this over, this is wrong. Please don't hardwire mtMetaspace; take the flag from the ReservedSpace member of VirtualSpaceNode. The reason is that metaspace can be used for at least two different flags, and may later be expanded for more. src/hotspot/share/memory/metaspace/virtualSpaceNode.cpp line 191: > 189: > 190: // Uncommit... > 191: if (os::uncommit_memory((char*)p, word_size * BytesPerWord, !ExecMem, mtMetaspace) == false) { Same here. src/hotspot/share/memory/metaspace/virtualSpaceNode.cpp line 262: > 260: vm_exit_out_of_memory(word_size * BytesPerWord, OOM_MMAP_ERROR, "Failed to reserve memory for metaspace"); > 261: } > 262: MemTracker::record_virtual_memory_type(rs.base(), mtMetaspace); Looking at this, I don't particularly like it, but it is pre-existing. The fact that we hard-wire mtMetaspace works now relies on the fact that mtClass and mtMetaspace (as of now, the only two flags that are being used) are using different allocation paths. Long term we should change this. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1575752104 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1575747557 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1575747677 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1575763725 From jsjolen at openjdk.org Tue Apr 23 07:43:33 2024 From: jsjolen at openjdk.org (Johan Sjölen) Date: Tue, 23 Apr 2024 07:43:33 GMT Subject: RFR: 8330532: Improve line-oriented text parsing in HotSpot [v3] In-Reply-To: References: <-khtXmDIMykUvQPloHfqXLN3myblFuJ0lhOlwf7zd5I=.8dbf1a69-daf8-4920-8d1c-2854f053bafc@github.com> Message-ID: On Mon, 22 Apr 2024 22:02:29 GMT, John R Rose wrote: >> src/hotspot/share/utilities/istream.hpp line 150: >> >>> 148: assert(_buffer_size == 0 || _next <= _buffer_size, ""); >>> 149: return true; >>> 150: } >> >> Please add message, even if only "invariant". > > No, that's not necessary. There are many, many empty assert strings in HotSpot. If there's no message, it means "check the code logic here". You don't need to say "invariant" or "sanity" or "must be" as a redundant means of conveying that message. Although, some authors do this. But if there are a long string of asserts, saying "invariant" that many times is simply noise. I was not aware of that, thanks! Let's skip this change then.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18833#discussion_r1575793996 From stefank at openjdk.org Tue Apr 23 07:49:39 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 23 Apr 2024 07:49:39 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v13] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: On Fri, 19 Apr 2024 09:49:33 GMT, Afshin Zafari wrote: >> `MEMFLAGS flag` is used to hold/show the type of the memory regions in NMT. Each call of NMT API requires a search through the list of memory regions. >> The Hotspot code reserves/commits/uncommits memory regions and later calls explicitly NMT API with a specific memory type (e.g., `mtGC`, `mtJavaHeap`) for that region. Therefore, there are two search in the list of regions per reserve/commit/uncommit operations, one for the operation and another for setting the type of the region. >> When the memory type is passed in during reserve/commit/uncommit operations, NMT can use it and avoid the extra search for setting the memory type. >> >> Tests: tiers1-5 passed on linux-x64, macosx-aarch64 and windows-x64 for debug and non-debug builds. > > Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: > > removed extra blank line. Changes requested by stefank (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/18745#pullrequestreview-2016503659 From stefank at openjdk.org Tue Apr 23 07:49:40 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 23 Apr 2024 07:49:40 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v7] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> <7TW9a7Vmnz0nIKq83rYx_VN13PXM9_9nD5iSMzGDfNw=.127fd0ff-ee60-40cf-9994-9a1e81bb5b27@github.com> <4p0uq_t37Fkj9fxqD1QC8TOkgAyyW1PVmTknURCquG4=.22b762b8-dea4-4fe3-a19f-d6a3f26c9f27@github.com> Message-ID: On Tue, 23 Apr 2024 06:28:31 GMT, David Holmes wrote: >> src/hotspot/share/nmt/virtualMemoryTracker.cpp line 460: >> >>> 458: assert(_reserved_regions != nullptr, "Sanity check"); >>> 459: >>> 460: ReservedMemoryRegion rgn(addr, size, flag); >> >> I'm not sure about this. `rgn` is just used to find the memory region we want to uncommit. The flag isn't used in the search, and passing it forces the callers to also pass in the flag. >> >> I understand that this happens after the request to remove the mtNone default value. Is there a way that allows us to skip using mtNone, but still don't have to unnecessarily provide a flag? >> >> Maybe we could create a helper function `ReservedMemoryRegion rgn = ReservedMemoryRegion::create_find_key(addr, size)`, which sets up a ReserveMemoryRegion with mtNone? > > Was this comment from Stefan addressed? David is right, this comment wasn't addressed. The code here went back and forth and we settled on hiding `ReservedMemoryRegion(address base, size_t size)` in a separate RFE. This means we probably should revert the usage of `flag` here and all the places that passes down `flag` just to reach this function. 
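The `create_find_key` idea quoted above can be sketched roughly as follows. This is only an illustration under stated assumptions: `Region`, `RegionTable`, and `MemTag` are hypothetical types, and the lookup uses a `std::map` rather than whatever list structure NMT actually uses. The point is only that a search key needs an address range and nothing else, so callers that merely look up a region do not have to supply a flag.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical memory types; None plays the role of mtNone.
enum class MemTag { None, GC, JavaHeap };

struct Region {
  uintptr_t base;
  size_t    size;
  MemTag    tag;

  // Build a key for lookups only. The tag is never consulted by find(),
  // so it is fixed to None here instead of being demanded from callers.
  static Region create_find_key(uintptr_t base, size_t size) {
    return Region{base, size, MemTag::None};
  }

  uintptr_t end() const { return base + size; }
};

class RegionTable {
  std::map<uintptr_t, Region> _by_base;  // disjoint regions, keyed by base
 public:
  void add(const Region& r) {
    assert(r.size > 0 && "empty regions are not tracked");
    _by_base[r.base] = r;
  }

  // Return the stored region that fully contains the key's range, or nullptr.
  const Region* find(const Region& key) const {
    auto it = _by_base.upper_bound(key.base);   // first region starting after key
    if (it == _by_base.begin()) return nullptr; // nothing starts at or before key
    --it;
    const Region& r = it->second;
    return (key.end() <= r.end()) ? &r : nullptr;
  }
};
```

The stored region still carries its real tag, so a caller that does want to compare a requested flag against the recorded one can do that on the result of `find()`.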
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1575801191 From dholmes at openjdk.org Tue Apr 23 07:53:32 2024 From: dholmes at openjdk.org (David Holmes) Date: Tue, 23 Apr 2024 07:53:32 GMT Subject: RFR: 8330578: The VM creates instance of abstract class VirtualMachineError [v8] In-Reply-To: <_g7eC_1K6WWrGUNHxYyTCDKzSy3Wr06FtEMWWF2Dwrw=.93ab37a4-742d-4502-95de-565e6dad6c01@github.com> References: <_g7eC_1K6WWrGUNHxYyTCDKzSy3Wr06FtEMWWF2Dwrw=.93ab37a4-742d-4502-95de-565e6dad6c01@github.com> Message-ID: On Mon, 22 Apr 2024 22:43:41 GMT, Coleen Phillimore wrote: >> It's a bug that the VM creates an instance of the abstract class VirtualMachineError. In the cases where we throw VME, we should throw OOM or StackOverflowError instead. >> >> Tested with tier1-4. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > It can be InternalError Looks good! Thanks for making the updates to use InternalError! ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18847#pullrequestreview-2016512085 From fjiang at openjdk.org Tue Apr 23 08:02:35 2024 From: fjiang at openjdk.org (Feilong Jiang) Date: Tue, 23 Apr 2024 08:02:35 GMT Subject: RFR: 8330735: RISC-V: No need to move sp to tmp register in set_last_Java_frame In-Reply-To: <8fd33mO9lVD6h6KrzRzNeiZqz8-v8a6Fr-4LshUu2l0=.a7ebd433-e79e-4aef-ac6a-7b2718a0dbf2@github.com> References: <8fd33mO9lVD6h6KrzRzNeiZqz8-v8a6Fr-4LshUu2l0=.a7ebd433-e79e-4aef-ac6a-7b2718a0dbf2@github.com> Message-ID: <1DAe_60TMZS5A5o9z79F29GO5p6R5G5SvpGTnxNM8GM=.13235df5-aeea-4c42-a660-57db7cd93520@github.com> On Mon, 22 Apr 2024 02:49:20 GMT, Fei Yang wrote: >> Hi, please review this refactoring to remove the unnecessary move from sp to temp register. >> >> There is no restriction for riscv when using `sp` as an operand in instructions. 
So we do not have to move the sp register to a temp register before we store `last_java_sp`. >> >> Testing: >> >> - [x] Tier1-3 (linux-riscv64, release) > > Looks fine. Thanks for the cleanup! @RealFYang -- Thanks! ------------- PR Comment: https://git.openjdk.org/jdk/pull/18875#issuecomment-2071667441 From fjiang at openjdk.org Tue Apr 23 08:02:36 2024 From: fjiang at openjdk.org (Feilong Jiang) Date: Tue, 23 Apr 2024 08:02:36 GMT Subject: Integrated: 8330735: RISC-V: No need to move sp to tmp register in set_last_Java_frame In-Reply-To: References: Message-ID: On Sat, 20 Apr 2024 12:42:20 GMT, Feilong Jiang wrote: > Hi, please review this refactoring to remove the unnecessary move from sp to temp register. > > There is no restriction for riscv when using `sp` as an operand in instructions. So we do not have to move the sp register to a temp register before we store `last_java_sp`. > > Testing: > > - [x] Tier1-3 (linux-riscv64, release) This pull request has now been integrated. Changeset: 281f9bde Author: Feilong Jiang URL: https://git.openjdk.org/jdk/commit/281f9bdeb9d6870346b12e6c62a58f7984b1b133 Stats: 12 lines in 3 files changed: 0 ins; 4 del; 8 mod 8330735: RISC-V: No need to move sp to tmp register in set_last_Java_frame Reviewed-by: fyang ------------- PR: https://git.openjdk.org/jdk/pull/18875 From dlong at openjdk.org Tue Apr 23 08:10:34 2024 From: dlong at openjdk.org (Dean Long) Date: Tue, 23 Apr 2024 08:10:34 GMT Subject: RFR: 8330253: Skip verify_consistent_lock_order when deoptimizing from monitorenter bytecode. [v6] In-Reply-To: References: <2QpelVQltaWXS_Yf-d0Uuu2j2mtiXoLhb8TJRliA3pk=.282244ed-1907-4362-a19d-4cdf6895af49@github.com> Message-ID: On Tue, 23 Apr 2024 05:54:58 GMT, Axel Boldt-Christmas wrote: >>> Deoptimization is already expensive, and this edge case is rare, so I think it would be better to compute the actual previous bytecode here, and not use bci - 1. 
>> >> Alright I will give that a go then, unless we think the second option is more appropriate. >> >>> Or how about not checking bytecodes and instead checking a flag on JavaThread that says we are in monitor enter native code? >> >> What state would that be? Any current state we setup is after transitioning to VM which seems to late. I guess it would be possible to instrument `SharedRuntime::monitor_enter_helper` with some thread local state we can check. But the interpreter has its own entry point. But I guess we would never reach this point for the interpreter, as there is no such thing as deoptimizing or unpacking interpreted frames. >> >> >> What is the feeling here? What would be more appropriate? Adding some thread local debug state that we set before transitioning to VM in `SharedRuntime::monitor_enter_helper` seems like the most precise solution, but we need to be sure that there are no earlier safepoint polls before this point within the execution of the monitorenter bytecode. > > I see there is `_pending_monitorenter` but this would only handle synchronized method entry. I like the idea of a flag better, because it is foolproof. Why can't we set it in ObjectSynchronizer::enter? I don't think it matters if there is a safepoint check before that, because the lock stack is still consistent at that point. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18782#discussion_r1575829407 From gli at openjdk.org Tue Apr 23 08:29:27 2024 From: gli at openjdk.org (Guoxiong Li) Date: Tue, 23 Apr 2024 08:29:27 GMT Subject: RFR: 8330822: Remove ModRefBarrierSet::write_ref_array_work In-Reply-To: References: Message-ID: On Mon, 22 Apr 2024 13:40:53 GMT, Albert Mingkun Yang wrote: > Simple merging a protected api into another method. Looks good. ------------- Marked as reviewed by gli (Committer). 
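For readers following the flag idea, here is a rough sketch of what a scoped per-thread marker could look like. It assumes a hypothetical `ThreadState` struct and `MonitorEnterMark` guard; these are not fields or classes of `JavaThread`, and where such a mark would actually be placed (for example around the slow path of monitor enter) is exactly the open question in this thread.

```cpp
#include <cassert>

// Hypothetical per-thread debug state; stands in for a field on the thread.
struct ThreadState {
  bool in_monitor_enter = false;
};

// Scoped marker: sets the flag on construction and restores the previous
// value on destruction, so nested marks behave correctly.
class MonitorEnterMark {
  ThreadState& _thread;
  bool         _previous;
 public:
  explicit MonitorEnterMark(ThreadState& t)
      : _thread(t), _previous(t.in_monitor_enter) {
    _thread.in_monitor_enter = true;
  }
  ~MonitorEnterMark() { _thread.in_monitor_enter = _previous; }

  MonitorEnterMark(const MonitorEnterMark&) = delete;
  MonitorEnterMark& operator=(const MonitorEnterMark&) = delete;
};

// The consistency check would be skipped only while the mark is active.
inline bool should_verify_lock_order(const ThreadState& t) {
  return !t.in_monitor_enter;
}
```

Compared with inspecting the previous bytecode, the check no longer depends on how the bci was computed; the cost is one extra field and the need to be sure every relevant entry point sets it.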
PR Review: https://git.openjdk.org/jdk/pull/18887#pullrequestreview-2016589454 From azafari at openjdk.org Tue Apr 23 08:43:34 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Tue, 23 Apr 2024 08:43:34 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v13] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: <7r1JsiGxPIa7-h0rpD2CbwK_qeqsWwRNEChBXVctsBw=.7ae4b1e5-4b97-4a04-af69-960ec2c47b6b@github.com> On Mon, 22 Apr 2024 12:43:43 GMT, Johan Sjölen wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> removed extra blank line. > > src/hotspot/os/windows/os_windows.cpp line 5108: > >> 5106: >> 5107: base = (char*) virtualAlloc(addr, bytes, MEM_COMMIT | MEM_RESERVE, >> 5108: PAGE_READWRITE); > > Why is this removed? We found this call duplicated, since the `MemTracker::record_..._and_commit` is called inside the `os::map_memory` after `pd_map_memory` is called. Here, the requested address is used, but there the result address is used.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1575876476 From azafari at openjdk.org Tue Apr 23 08:43:35 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Tue, 23 Apr 2024 08:43:35 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v13] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: <-lsJRmLenHFNlKUscKa9ho4ROYjZVW5MrKlTlov5h5k=.9e9972ed-299e-49b5-9bea-46e602fa6672@github.com> On Tue, 23 Apr 2024 07:16:48 GMT, Thomas Stuefe wrote: >> src/hotspot/share/memory/virtualspace.cpp line 45: >> >>> 43: // Dummy constructor >>> 44: ReservedSpace::ReservedSpace() : _base(nullptr), _size(0), _noaccess_prefix(0), >>> 45: _alignment(0), _fd_for_heap(-1), _special(false), _executable(false), _nmt_flag(mtNone) { >> >> Isn't just `_flag` or `_memflag` sufficient as a name for `ReservedSpace`? We don' use `nmt_flag` anywhere else in the codebase. > > Yes, I would keep consistency with existing code, and maybe later rename all in one followup change Done. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1575876690 From azafari at openjdk.org Tue Apr 23 08:49:37 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Tue, 23 Apr 2024 08:49:37 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v13] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: On Tue, 23 Apr 2024 06:12:59 GMT, David Holmes wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> removed extra blank line. 
> > src/hotspot/share/cds/metaspaceShared.cpp line 1332: > >> 1330: // NMT: fix up the space tags >> 1331: MemTracker::record_virtual_memory_type(archive_space_rs.base(), mtClassShared); >> 1332: MemTracker::record_virtual_memory_type(class_space_rs.base(), mtClass); > > I assumed these (and others) were removed because the `MemTracker` updates had been pushed down into `ReserveSpace` itself, but I can't find them there - what am I missing? `archive_space_rs` and `class_space_rs` pass the MEMFLAGS to the `ReservedSpace` ctors a few lines above at 1272, 1319 and 1321. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1575884704 From azafari at openjdk.org Tue Apr 23 08:59:34 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Tue, 23 Apr 2024 08:59:34 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v13] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: On Tue, 23 Apr 2024 06:47:00 GMT, Thomas Stuefe wrote: >> src/hotspot/share/gc/parallel/mutableSpace.cpp line 63: >> >>> 61: if (clear_space) { >>> 62: // Prefer page reallocation to migration. >>> 63: os::free_memory((char*)start, size, page_size, mtGC); >> >> Not at all obvious where the corresponding allocation sets the type as mtGC. ?? > > We don't, and I am not sure this is right. AFAICS this API is used on java heap, in ParallelGC. So, that should be mtJavaHeap, I think. > > Note that I would like and plan to simplify this API, if possible remove both the page size and the NMT flags. See https://bugs.openjdk.org/browse/JDK-8330144. (the tricky part is to make sure the proposed Linux alternative works with large pages and on old enough kernels) `os::free_memory` on Linux, re-commits the region to discard the existing committed memory. So MEMFLAGS is needed here to pass down to the `os::commit_memory` there. 
Since it is actually an uncommit, can we use `mtNone` instead in the `pd_free_memory`? Or define a new `mtDontCare`, for example? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1575894505 From azafari at openjdk.org Tue Apr 23 08:59:37 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Tue, 23 Apr 2024 08:59:37 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v13] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: On Tue, 23 Apr 2024 07:05:24 GMT, Thomas Stuefe wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> removed extra blank line. > > src/hotspot/share/memory/metaspace/testHelpers.cpp line 81: > >> 79: if (reserve_limit > 0) { >> 80: // have reserve limit -> non-expandable context >> 81: _rs = ReservedSpace(reserve_limit * BytesPerWord, Metaspace::reserve_alignment(), os::vm_page_size(), mtMetaspace); > > I would make this mtTest. This should not increase the metaspace counters in NMT Done. > src/hotspot/share/memory/metaspace/virtualSpaceNode.cpp line 112: > >> 110: >> 111: // Commit... >> 112: if (os::commit_memory((char*)p, word_size * BytesPerWord, !ExecMem, mtMetaspace) == false) { > > Not sure if I suggested something different in my first review, but thinking this over, this is wrong. Please don't hardwire mtMetaspace; take the flag from the ReservedSpace member of VirtualSpaceNode. > > The reason is that metaspace can be used for at least two different flags, and may later be expanded for more. Done. > src/hotspot/share/memory/metaspace/virtualSpaceNode.cpp line 191: > >> 189: >> 190: // Uncommit... >> 191: if (os::uncommit_memory((char*)p, word_size * BytesPerWord, !ExecMem, mtMetaspace) == false) { > > Same here. Done. 
> src/hotspot/share/memory/metaspace/virtualSpaceNode.cpp line 262: > >> 260: vm_exit_out_of_memory(word_size * BytesPerWord, OOM_MMAP_ERROR, "Failed to reserve memory for metaspace"); >> 261: } >> 262: MemTracker::record_virtual_memory_type(rs.base(), mtMetaspace); > > Looking at this, I don't particularly like it, but it is pre-existing. The fact that we hard-wire mtMetaspace works now relies on the fact that mtClass and mtMetaspace (as of now, the only two flags that are being used) are using different allocation paths. Long term we should change this. Should I create a RFE for it? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1575896068 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1575895625 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1575895817 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1575899343 From ddong at openjdk.org Tue Apr 23 09:03:40 2024 From: ddong at openjdk.org (Denghui Dong) Date: Tue, 23 Apr 2024 09:03:40 GMT Subject: RFR: 8326012: JFR: Event for time to safepoint [v11] In-Reply-To: References: <68hS0kQgtDIk4ioAJj_r0_GLT6h0lcif6Daj6WRwxlI=.40c2a6e7-70a8-4954-bcde-9318ee311028@github.com> Message-ID: On Fri, 12 Apr 2024 13:08:06 GMT, Denghui Dong wrote: >> There are now some JFR events related to safepoint. When time-to-safepoint (aka ttsp) is too long, these events could not be very helpful since based on them we cannot know which threads cause it and what those threads are doing. >> >> Users can use `-XX:+SafepointTimeout -XX:SafepointTimeoutDelay=100` to see the threads that don't reach safepoint in time but without stack traces. Using `-XX:+ AbortVMOnSafepointTimeout` can capture the stack traces but it crashes the process, hence it's not sensible to enable the flag in production. 
>> >> ~~This patch adds a new JFR event `EventSafepointTimeout` to record the threads that cause ttsp too long.~~ >> >> ~~This event includes two fields:~~ >> >> ~~- safepointId: the relevant safepoint id~~ >> ~~- timeExceeded: the amount of time exceeding `SafepointTimeoutDelay` used by the thread to reach safepoint~~ >> >> ~~In the current version, this event records the stack of those problematic threads when they finally reach safepoint. Hence, there is a bias, but it's still helpful to deduce the root place.~~ >> >> A better implementation is to record a more accurate stack, but this will increase complexity. At the same time, the native stack may also be important for this problem, but it is not currently supported by JFR. >> >> Any input would be greatly appreciated. >> >> Testing: jdk/jdk/jfr > > Denghui Dong has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 21 commits: > > - Merge branch 'master' into JDK-8326012 > - update > - delete _entries when disabled > - fix test failures > - update > - refactor > - update > - update > - update > - update > - ... and 11 more: https://git.openjdk.org/jdk/compare/0f78d017...df58b055 @stefank Hi, do you have more comments? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17888#issuecomment-2071784426 From azafari at openjdk.org Tue Apr 23 09:05:35 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Tue, 23 Apr 2024 09:05:35 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v7] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> <7TW9a7Vmnz0nIKq83rYx_VN13PXM9_9nD5iSMzGDfNw=.127fd0ff-ee60-40cf-9994-9a1e81bb5b27@github.com> <4p0uq_t37Fkj9fxqD1QC8TOkgAyyW1PVmTknURCquG4=.22b762b8-dea4-4fe3-a19f-d6a3f26c9f27@github.com> Message-ID: On Tue, 23 Apr 2024 07:46:16 GMT, Stefan Karlsson wrote: >> Was this comment from Stefan addressed? > > David is right, this comment wasn't addressed. The code here went back and forth and we settled on hiding `ReservedMemoryRegion(address base, size_t size)` in a separate RFE. This means we probably should revert the usage of `flag` here and all the places that passes down `flag` just to reach this function. We discussed that having flag here, we can use it for checking if the requested flag matches the actual memory flag or not. This check is missed now. What to do? reverting all the calls up to `os::uncommit_memory()`? and reverting the `ExecMem` param as optional? Or adding check of flags? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1575907604 From tschatzl at openjdk.org Tue Apr 23 09:44:29 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 23 Apr 2024 09:44:29 GMT Subject: RFR: 8330155: Serial: Remove TenuredSpace In-Reply-To: References: Message-ID: <1ZzvLCjrIROXngFcoJgvFqRUKp7ZIqi2r-WmaMSFt_A=.ba452467-a067-4177-9acf-4bee62658f26@github.com> On Mon, 22 Apr 2024 16:24:06 GMT, Guoxiong Li wrote: > Hi all, > > This patch removes the class `TenuredSpace` and adjusts its usages. 
After removing `TenuredSpace`, the file `space.inline.hpp` is empty, so I remove this file and change the included header file to `space.hpp`. > > The test `make test-tier1_gc` passed locally. Thanks for taking the time to review. > > Best Regards, > -- Guoxiong Marked as reviewed by tschatzl (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18894#pullrequestreview-2016759032 From tschatzl at openjdk.org Tue Apr 23 09:44:28 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 23 Apr 2024 09:44:28 GMT Subject: RFR: 8330822: Remove ModRefBarrierSet::write_ref_array_work In-Reply-To: References: Message-ID: On Mon, 22 Apr 2024 13:40:53 GMT, Albert Mingkun Yang wrote: > Simple merging a protected api into another method. Marked as reviewed by tschatzl (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18887#pullrequestreview-2016760560 From mli at openjdk.org Tue Apr 23 09:51:30 2024 From: mli at openjdk.org (Hamlin Li) Date: Tue, 23 Apr 2024 09:51:30 GMT Subject: RFR: 8324124: RISC-V: implement _vectorizedMismatch intrinsic In-Reply-To: References: Message-ID: On Wed, 7 Feb 2024 14:35:55 GMT, Yuri Gaevsky wrote: > Hello All, > > Please review these changes to enable the __vectorizedMismatch_ intrinsic on RISC-V platform with RVV instructions supported. > > Thank you, > -Yuri Gaevsky > > **Correctness checks:** > hotspot/jtreg/compiler/{intrinsic/c1/c2}/ under QEMU-8.1 with RVV v1.0.0 and -XX:TieredStopAtLevel=1/2/3/4. Hi, Do you have plan to implement instrinsic `VectorCmpMasked`? 
It's part of `vectorizedMismatch` ------------- PR Comment: https://git.openjdk.org/jdk/pull/17750#issuecomment-2071878508 From aboldtch at openjdk.org Tue Apr 23 09:58:41 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Tue, 23 Apr 2024 09:58:41 GMT Subject: RFR: 8326957: Implementation of JEP 474: ZGC: Generational Mode by Default [v3] In-Reply-To: References: Message-ID: > This is the implementation task for `JEP 474: ZGC: Generational Mode by Default`. See the JEP for details. [JDK-8326667](https://bugs.openjdk.org/browse/JDK-8326667) Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits: - Merge tag 'jdk-23+19' into JDK-8326957 Added tag jdk-23+19 for changeset 706b421c - Remove extra space - Use consistent terminology - Merge tag 'jdk-23+17' into JDK-8326957 Added tag jdk-23+17 for changeset 8efd7aa6 - Merge tag 'jdk-23+16' into JDK-8326957 Added tag jdk-23+16 for changeset d580bcf9 - Update VMDeprecatedOptions.java test - 8326957: Implementation of Deprecate Non-Generational ZGC ------------- Changes: https://git.openjdk.org/jdk/pull/18393/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18393&range=02 Stats: 107 lines in 7 files changed: 105 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/18393.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18393/head:pull/18393 PR: https://git.openjdk.org/jdk/pull/18393 From jsjolen at openjdk.org Tue Apr 23 10:23:30 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Tue, 23 Apr 2024 10:23:30 GMT Subject: RFR: 8330532: Improve line-oriented text parsing in HotSpot [v3] In-Reply-To: References: Message-ID: On Tue, 23 Apr 2024 01:08:05 GMT, Ioi Lam wrote: >> (This PR is an alternative to https://github.com/openjdk/jdk/pull/18669 with a better API for reading lines of text) >> >> HotSpot has a few cases where information is parsed from a file, or from a memory buffer, one 
line at a time. Example: >> >> - https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/cds/classListParser.cpp#L169 >> - https://github.com/openjdk/jdk/blob/064628471b83616b4463baa78618d1b7a66d0c7c/src/hotspot/share/compiler/compilerOracle.cpp#L1059-L1066 >> >> Common problems: >> - They use a fixed buffer for reading a line, so long (but valid) lines will cause errors. >> - There's ad-hoc code that deals with `FILE*` differently than from memory. >> >> This RFE implements a common utility, `inputStream`, for reading lines from different sources of input (see `FileInput` and `MemoryInput`). We fixed only `ClassListParser` and `CompilerOracle` in this RFE, but we can fix other readers in follow-up RFEs. >> >> The API allows other source of input to be implemented. For example, one could implement a `SocketInput` if there's a use case for it. >> >> In the future, `inputStream` can be extended (or encapsulated in a higher-level reader class) to read typed input tokens (for example, integers, strings, etc.) >> >> Credit: >> The `inputStream` class and friends are contributed by @rose00 . See https://mail.openjdk.org/pipermail/hotspot-dev/2024-April/087077.html . >> >> John's original version is in the draft PR https://github.com/openjdk/jdk/pull/18773. In order to minimize the size of this PR, I have kept only the functionalities for reading a line and a time. Other features, such as pushing back contents into the `inputStream`, could be added in follow-up PRs. (These removed features can be found in the commit history of this PR). > > Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains eight additional commits since the last revision: > > - Merge branch 'master' of https://github.com/openjdk/jdk into 8330532-improve-line-oriented-text-parsing-in-hotspot > - Comments fro @coleenp and @matias9927 > - removed more unused code from istream.hpp > - Merged ClassFileParser changes from https://github.com/openjdk/jdk/pull/18669 > - Removed gtest cases for features removed in the previous commit > - Reverted xmlstream.cpp/hpp and removed unused functions from inputStream > - fixed builds > - Imported @jrose00 changes https://github.com/openjdk/jdk/pull/18773 A couple more comments. I implemented the simplification I suggested like this: https://github.com/jdksjolen/jdk/commit/b5bc0e945cb96c0a73a91981b937968e5cbac33c Also, I use `override` instead of `virtual`; then we get compiler help if, for some reason (non-matching argument lists, for example), we don't actually override a superclass's virtual function. src/hotspot/share/utilities/istream.hpp line 98: > 96: > 97: Input* _input; // where the input comes from or else nullptr > 98: IState _input_state; // one of {NTR,EOF,ERR}_STATE This comment is not necessary, as the type describes its valid states. src/hotspot/share/utilities/istream.hpp line 106: > 104: size_t _end; // offset to end of known current line (else content_end) > 105: size_t _next; // offset to known start of next line (else =end) > 106: void* _must_free; // unless null, a malloc pointer which we must free Reading this code, why do we set `_must_free` instead of simply having a method: ```c++ bool must_free() { return _buffer != &_small_buffer; } ``` and just delete the `_must_free` field.
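To make the `must_free()` suggestion above concrete, here is a small self-contained sketch. It assumes a hypothetical `LineBuffer` class with an inline small buffer that spills to `malloc` when a line outgrows it; the class, its size constant, and `ensure_capacity` are illustrative names, not the code in `istream.hpp`. Ownership is derived from whether `_buffer` still points at the inline storage, so no separate `_must_free` field is needed. (The comparison is written against the decayed array, `_small_buffer`, so that both sides are `char*`.)

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical line buffer: starts in inline storage, grows onto the heap.
class LineBuffer {
  static constexpr size_t SMALL_SIZE = 16;
  char   _small_buffer[SMALL_SIZE];
  char*  _buffer;     // points at _small_buffer or at a malloc'ed block
  size_t _capacity;

 public:
  LineBuffer() : _small_buffer(), _buffer(_small_buffer), _capacity(SMALL_SIZE) {}
  ~LineBuffer() { if (must_free()) std::free(_buffer); }

  LineBuffer(const LineBuffer&) = delete;
  LineBuffer& operator=(const LineBuffer&) = delete;

  // Ownership is implied by where _buffer points; no extra field required.
  bool must_free() const { return _buffer != _small_buffer; }

  // Grow to at least `needed` bytes, preserving the current contents.
  bool ensure_capacity(size_t needed) {
    if (needed <= _capacity) return true;
    char* grown = static_cast<char*>(std::malloc(needed));
    if (grown == nullptr) return false;
    std::memcpy(grown, _buffer, _capacity);
    if (must_free()) std::free(_buffer);   // release the old heap block, if any
    _buffer   = grown;
    _capacity = needed;
    return true;
  }

  size_t capacity() const { return _capacity; }
  char*  data()           { return _buffer; }
};
```

The trade-off is the usual one for derived state: one field fewer to keep in sync, at the price of the invariant that `_buffer` never points at memory the class does not own other than its own inline array.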
------------- PR Review: https://git.openjdk.org/jdk/pull/18833#pullrequestreview-2016820941 PR Review Comment: https://git.openjdk.org/jdk/pull/18833#discussion_r1576001292 PR Review Comment: https://git.openjdk.org/jdk/pull/18833#discussion_r1576011247 From jsjolen at openjdk.org Tue Apr 23 10:34:29 2024 From: jsjolen at openjdk.org (Johan Sjölen) Date: Tue, 23 Apr 2024 10:34:29 GMT Subject: RFR: 8330532: Improve line-oriented text parsing in HotSpot [v3] In-Reply-To: References: <-khtXmDIMykUvQPloHfqXLN3myblFuJ0lhOlwf7zd5I=.8dbf1a69-daf8-4920-8d1c-2854f053bafc@github.com> <_H29a3k5H0j1CVZdy6kEc_GlfhVL_aCSrf9x8vxwzL8=.9ae0ed1f-4db3-4e7c-a95a-e7622be81a4c@github.com> Message-ID: On Tue, 23 Apr 2024 06:08:36 GMT, John R Rose wrote: >> On a separate note, `preload()` can cause data to be read from the input. For non-seekable input (such as sockets), this doesn't seem like a `const` operation to me. > > I could be wrong about this, but it seems like it's completely up to the C++ class to define what aspects of its implementation are fixed for its `const` methods. The `const` keyword does not mean "everything is immutable about this, at all levels and inside all boundaries". But then it's up to the class author to choose which boundary `const` applies to. The thing wrapped inside the i-stream is, I will claim, implementation which can change state even in `const` functions. > > If not, the alternative is to have almost no `const` functions at all in the i-stream class. That's less useful, because `const` means something very useful, in the context of the i-stream class. It means that the line will not shift. You can keep on reading, and even writing, the line buffer, as long as you call only `const` functions. > > The reuse of the line buffer, in this way, is a core part of the i-stream design, and part of its value proposition: You don't get a new allocation on every read-line op (as you would with Java).
But in order to draw bright lines around THAT very useful notion of invariance (during which line buffers are reusable), you NEED `preload` to be const, even if it does internal book keeping, even if it consults the input source at times. > > In short, a rigid idea of `const` is inconsistent with the carefully balanced performance characteristics of this design. I had to read up on this and I believe that John is right. To quote https://isocpp.org/wiki/faq/const-correctness#const-member-fns >The trailing const on inspect() member function should be used to mean the method won't change the object's abstract (client-visible) state. That is slightly different from saying the method won't change the "raw bits" of the object's struct. I'm not objecting to the usage of `const` for this method, considering this. Reviewing code is learning on the job! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18833#discussion_r1576028625 From snazarki at openjdk.org Tue Apr 23 10:51:28 2024 From: snazarki at openjdk.org (Sergey Nazarkin) Date: Tue, 23 Apr 2024 10:51:28 GMT Subject: RFR: 8330171: Lazy W^X switch implementation In-Reply-To: References: <9eymaXovxUNFdkAkzojFQP5trwl_yyY0jE2GzcMEjR4=.02ee2ef9-c476-4c7c-9e4a-e021425c38bc@github.com> Message-ID: On Fri, 19 Apr 2024 07:44:17 GMT, Richard Reingruber wrote: >> An alternative for preemptively switching the W^X thread mode on macOS with an AArch64 CPU. This implementation triggers the switch in response to the SIGBUS signal if the *si_addr* belongs to the CodeCache area. With this approach, it is now feasible to eliminate all WX guards and avoid potentially costly operations. However, no significant improvement or degradation in performance has been observed. Additionally, considering the issue with AsyncGetCallTrace, the patched JVM has been successfully operated with [asgct_bottom](https://github.com/parttimenerd/asgct_bottom) and [async-profiler](https://github.com/async-profiler/async-profiler).
>> >> Additional testing: >> - [x] MacOS AArch64 server fastdebug *gtest* >> - [ ] MacOS AArch64 server fastdebug *jtreg:hotspot:tier4* >> - [ ] Benchmarking >> >> @apangin and @parttimenerd could you please check the patch on your scenarios? > > What about granting `WXWrite` only if the current thread is in `_thread_in_vm`? > That would be more restrictive and roughly equivalent to how it currently works. Likely there are some places then that should be granted `WXWrite` eagerly because they need `WXWrite` without `_thread_in_vm`. E.g. the JIT compiler threads should have `WXWrite` and never `WXExec` (I assume) which should be checked in the signal handler. The patch doesn't protect against native agents, as this is obviously impossible. The current code doesn't do that either. For the bytecode, it doesn't prevent the attacker from abusing the Unsafe API to modify the code cache. However unsafe functions are already considered "safe" and we proactively enable WXWrite as well as move the thread to `_thread_in_vm` state (@reinrich). JITed code can't write to the cache either with or without the patch. I totally get the sense of loss of security. But is this really the case? ------------- PR Comment: https://git.openjdk.org/jdk/pull/18762#issuecomment-2071988941 From ayang at openjdk.org Tue Apr 23 11:09:40 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 23 Apr 2024 11:09:40 GMT Subject: RFR: 8330822: Remove ModRefBarrierSet::write_ref_array_work In-Reply-To: References: Message-ID: On Mon, 22 Apr 2024 13:40:53 GMT, Albert Mingkun Yang wrote: > Simple merging a protected api into another method. Thanks for review.
------------- PR Comment: https://git.openjdk.org/jdk/pull/18887#issuecomment-2072015268 From ayang at openjdk.org Tue Apr 23 11:09:41 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 23 Apr 2024 11:09:41 GMT Subject: Integrated: 8330822: Remove ModRefBarrierSet::write_ref_array_work In-Reply-To: References: Message-ID: <2dUBFdZOgJc8AmG3iqhXpAO-IE4zn5Mx2DpTl71sGW0=.7b1be617-fb7e-4670-b5f6-bc8b9d236b53@github.com> On Mon, 22 Apr 2024 13:40:53 GMT, Albert Mingkun Yang wrote: > Simple merging a protected api into another method. This pull request has now been integrated. Changeset: 1a6da3d5 Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/1a6da3d5f0ac57e173340a117a9368c190a34e8b Stats: 16 lines in 6 files changed: 0 ins; 15 del; 1 mod 8330822: Remove ModRefBarrierSet::write_ref_array_work Reviewed-by: gli, tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/18887 From ayang at openjdk.org Tue Apr 23 11:23:36 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 23 Apr 2024 11:23:36 GMT Subject: RFR: 8330961: Remove redundant public specifier in ModRefBarrierSet Message-ID: Trivial removing redundant code. 
------------- Commit messages: - trivial Changes: https://git.openjdk.org/jdk/pull/18911/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18911&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8330961 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18911.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18911/head:pull/18911 PR: https://git.openjdk.org/jdk/pull/18911 From coleenp at openjdk.org Tue Apr 23 11:33:34 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 23 Apr 2024 11:33:34 GMT Subject: RFR: 8330578: The VM creates instance of abstract class VirtualMachineError [v8] In-Reply-To: <_g7eC_1K6WWrGUNHxYyTCDKzSy3Wr06FtEMWWF2Dwrw=.93ab37a4-742d-4502-95de-565e6dad6c01@github.com> References: <_g7eC_1K6WWrGUNHxYyTCDKzSy3Wr06FtEMWWF2Dwrw=.93ab37a4-742d-4502-95de-565e6dad6c01@github.com> Message-ID: On Mon, 22 Apr 2024 22:43:41 GMT, Coleen Phillimore wrote: >> It's a bug that the VM creates an instance of the abstract class VirtualMachineError. In the cases where we throw VME, we should throw OOM or StackOverflowError instead. >> >> Tested with tier1-4. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > It can be InternalError Thanks for the reviews and comments Ioi, David, Dean, Julian and Doug. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18847#issuecomment-2072056217 From coleenp at openjdk.org Tue Apr 23 11:33:35 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 23 Apr 2024 11:33:35 GMT Subject: RFR: 8330578: The VM creates instance of abstract class VirtualMachineError [v7] In-Reply-To: References: Message-ID: <13kTGwU94iZKWhVWDHRbP1ZgvcnEIJcR-Yz2_oc06KQ=.055139e5-0415-4e3d-9eff-12b74b6994ea@github.com> On Mon, 22 Apr 2024 22:34:47 GMT, Coleen Phillimore wrote: >> src/hotspot/share/classfile/verifier.cpp line 258: >> >>> 256: // to infinitely recurse when we try to initialize the exception. >>> 257: // So bail out here by throwing the preallocated VM error. >>> 258: THROW_OOP_(Universe::class_init_stack_overflow_error(), false); >> >> Should this be InternalError now? That seems better than StackOverflow. > > Technically it's a stack overflow. I don't think it's a reachable code path so it doesn't really matter. changed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18847#discussion_r1576093286 From coleenp at openjdk.org Tue Apr 23 11:33:35 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 23 Apr 2024 11:33:35 GMT Subject: Integrated: 8330578: The VM creates instance of abstract class VirtualMachineError In-Reply-To: References: Message-ID: On Thu, 18 Apr 2024 21:46:36 GMT, Coleen Phillimore wrote: > It's a bug that the VM creates an instance of the abstract class VirtualMachineError. In the cases where we throw VME, we should throw OOM or StackOverflowError instead. > > Tested with tier1-4. This pull request has now been integrated. 
Changeset: fcb4a8ba Author: Coleen Phillimore URL: https://git.openjdk.org/jdk/commit/fcb4a8ba26fe1de596331b0a2f89c5c7c24e7f9e Stats: 17 lines in 7 files changed: 1 ins; 0 del; 16 mod 8330578: The VM creates instance of abstract class VirtualMachineError Reviewed-by: iklam, dlong, jwaters, dholmes ------------- PR: https://git.openjdk.org/jdk/pull/18847 From rcastanedalo at openjdk.org Tue Apr 23 11:34:32 2024 From: rcastanedalo at openjdk.org (Roberto =?UTF-8?B?Q2FzdGHDsWVkYQ==?= Lozano) Date: Tue, 23 Apr 2024 11:34:32 GMT Subject: RFR: 8326541: [AArch64] ZGC C2 load barrier stub should consider the length of live registers when spilling registers [v4] In-Reply-To: References: Message-ID: On Wed, 20 Mar 2024 03:55:33 GMT, Joshua Zhu wrote: >> Currently ZGC C2 load barrier stub saves the whole live register regardless of what size of register is live on aarch64. >> Considering the size of SVE register is an implementation-defined multiple of 128 bits, up to 2048 bits, >> even the use of a floating point may cause the maximum 2048 bits stack occupied. >> Hence I would like to introduce this change on aarch64: take the length of live registers into consideration in ZGC C2 load barrier stub. >> >> In a floating point case on 2048 bits SVE machine, the following ZLoadBarrierStubC2 >> >> >> ...... >> 0x0000ffff684cfad8: stp x15, x18, [sp, #80] >> 0x0000ffff684cfadc: sub sp, sp, #0x100 >> 0x0000ffff684cfae0: str z16, [sp] >> 0x0000ffff684cfae4: add x1, x13, #0x10 >> 0x0000ffff684cfae8: mov x0, x16 >> ;; 0xFFFF803F5414 >> 0x0000ffff684cfaec: mov x8, #0x5414 // #21524 >> 0x0000ffff684cfaf0: movk x8, #0x803f, lsl #16 >> 0x0000ffff684cfaf4: movk x8, #0xffff, lsl #32 >> 0x0000ffff684cfaf8: blr x8 >> 0x0000ffff684cfafc: mov x16, x0 >> 0x0000ffff684cfb00: ldr z16, [sp] >> 0x0000ffff684cfb04: add sp, sp, #0x100 >> 0x0000ffff684cfb08: ptrue p7.b >> 0x0000ffff684cfb0c: ldp x4, x5, [sp, #16] >> ...... >> >> >> could be optimized into: >> >> >> ...... 
>> 0x0000ffff684cfa50: stp x15, x18, [sp, #80] >> 0x0000ffff684cfa54: str d16, [sp, #-16]! // extra 8 bytes to align 16 bytes in push_fp() >> 0x0000ffff684cfa58: add x1, x13, #0x10 >> 0x0000ffff684cfa5c: mov x0, x16 >> ;; 0xFFFF7FA942A8 >> 0x0000ffff684cfa60: mov x8, #0x42a8 // #17064 >> 0x0000ffff684cfa64: movk x8, #0x7fa9, lsl #16 >> 0x0000ffff684cfa68: movk x8, #0xffff, lsl #32 >> 0x0000ffff684cfa6c: blr x8 >> 0x0000ffff684cfa70: mov x16, x0 >> 0x0000ffff684cfa74: ldr d16, [sp], #16 >> 0x0000ffff684cfa78: ptrue p7.b >> 0x0000ffff684cfa7c: ldp x4, x5, [sp, #16] >> ...... >> >> >> Besides the above benefit, when we know what size of register is live, >> we could remove the unnecessary caller save in ZGC C2 load barrier stub when we meet C-ABI SOE fp registers. >> >> Passed jtreg with option "-XX:+UseZGC -XX:+ZGenerational" with no failures introduced. > > Joshua Zhu has updated the pull request incrementally with one additional commit since the last revision: > > Add more output for easy debugging once the jtreg test case fails Looks good. I also tested the changeset on Oracle's internal CI (ZGC tests within tiers 1-7, on Neon machines) with an additional patch (https://github.com/openjdk/jdk/commit/963def0415830bc5979c5bb6064a566b1c8040dd) that forces ZGC read barriers to always take the slow path and clears all vector registers upon the slow path's runtime call. Testing succeeded. ------------- Marked as reviewed by rcastanedalo (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17977#pullrequestreview-2016978929 From jsjolen at openjdk.org Tue Apr 23 11:37:48 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Tue, 23 Apr 2024 11:37:48 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v47] In-Reply-To: References: Message-ID: > Hi, > > This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. 
This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix. > > ## `MemoryFileTracker` > > The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: > > ```c++ > static MemoryFile* make_device(const char* descriptive_name); > static void free_device(MemoryFile* device); > > static void allocate_memory(MemoryFile* device, size_t offset, size_t size, > MEMFLAGS flag, const NativeCallStack& stack); > static void free_memory(MemoryFile* device, size_t offset, size_t size); > > > It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: > > ```c++ > void ZNMT::reserve(zaddress_unsafe start, size_t size) { > MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); > } > void ZNMT::commit(zoffset offset, size_t size) { > MemTracker::allocate_memory_in(ZNMT::_device, static_cast(offset), size, mtJavaHeap, CALLER_PC); > } > void ZNMT::uncommit(zoffset offset, size_t size) { > MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); > } > > void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { > // NMT doesn't track mappings at the moment. > } > void ZNMT::unmap(zaddress_unsafe addr, size_t size) { > // NMT doesn't track mappings at the moment. > } > > > As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. 
> > This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: > > 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this... Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: Include growableArray.hpp as we use it. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18289/files - new: https://git.openjdk.org/jdk/pull/18289/files/e7f2af9e..76c3bcfa Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=46 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=45-46 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18289.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289 PR: https://git.openjdk.org/jdk/pull/18289 From sjohanss at openjdk.org Tue Apr 23 11:42:56 2024 From: sjohanss at openjdk.org (Stefan Johansson) Date: Tue, 23 Apr 2024 11:42:56 GMT Subject: RFR: 8330626: ZGC: Windows address space placeholders not managed correctly Message-ID: Please review this fix to correctly manage address space placeholders on Windows. **Summary** On Windows, when using small pages, we use address space placeholders to ensure consistency of the address space. When a portion of the address space is mapped these placeholders are replaced by the actual backing and when doing this the size of the placeholder(s) needs to exactly match the size to be backed. For this reason, whenever address space is in use, we split the covering placeholder into multiple `ZGranuleSize` sized placeholders.
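As a rough illustration of the granule splitting described above, here is a minimal sketch in portable C++. It is not the Windows code from the PR: a placeholder is modelled as a plain (start, size) range, the names `Range`, `kGranuleSize` and `split_into_granules` are hypothetical, and the 2 MB granule size is an assumption made only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a placeholder: a half-open range [start, start + size).
struct Range {
  uintptr_t start;
  size_t size;
};

// Assumed granule size, used for this sketch only.
constexpr size_t kGranuleSize = 2 * 1024 * 1024;

// Split a range whose size is a granule multiple into granule-sized pieces,
// so that each piece can later be replaced by a mapping of exactly that size.
std::vector<Range> split_into_granules(const Range& covering) {
  assert(covering.size % kGranuleSize == 0 && "size must be a granule multiple");
  std::vector<Range> pieces;
  for (size_t offset = 0; offset < covering.size; offset += kGranuleSize) {
    pieces.push_back(Range{covering.start + offset, kGranuleSize});
  }
  return pieces;
}
```

The point of the model is only the invariant: after splitting, every piece has exactly the size that a single granule mapping will replace.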
During recent investigations into fragmentation of the ZGC address space, I found that there was a code path (**currently not in use**) that did not properly manage these placeholders and we could end up in situations where no placeholder was split off when a new chunk of `ZGranuleSize` size was requested. The problem is basically an off-by-one problem in the splitting code and the fix is to avoid this by changing it to first split the covering placeholder into two parts before splitting the part to be used into granules. **Testing** * Manual testing using the included GTest as well as sample applications previously triggering the error case. * Tier 1-5 Generational ZGC testing (ongoing) ------------- Commit messages: - Testing improvements - StefanK review - Testing refactored - Test ZMapper_windows - 8330626: ZGC: Windows address space placeholders not managed correctly Changes: https://git.openjdk.org/jdk/pull/18912/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18912&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8330626 Stats: 236 lines in 3 files changed: 227 ins; 0 del; 9 mod Patch: https://git.openjdk.org/jdk/pull/18912.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18912/head:pull/18912 PR: https://git.openjdk.org/jdk/pull/18912 From jsjolen at openjdk.org Tue Apr 23 11:44:53 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Tue, 23 Apr 2024 11:44:53 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v48] In-Reply-To: References: Message-ID: <2h99aRHkAa-EaDvYa4O9gA6odOiUvhVJlhpBqqh-9Fk=.15cdc210-c60e-4d0c-bffd-6f7902f72980@github.com> > Hi, > > This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT.
This is the situation that ZGC is in, and that is what this patch attempts to fix. > > ## `MemoryFileTracker` > > The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: > > ```c++ > static MemoryFile* make_device(const char* descriptive_name); > static void free_device(MemoryFile* device); > > static void allocate_memory(MemoryFile* device, size_t offset, size_t size, > MEMFLAGS flag, const NativeCallStack& stack); > static void free_memory(MemoryFile* device, size_t offset, size_t size); > > > It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: > > ```c++ > void ZNMT::reserve(zaddress_unsafe start, size_t size) { > MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); > } > void ZNMT::commit(zoffset offset, size_t size) { > MemTracker::allocate_memory_in(ZNMT::_device, static_cast(offset), size, mtJavaHeap, CALLER_PC); > } > void ZNMT::uncommit(zoffset offset, size_t size) { > MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); > } > > void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { > // NMT doesn't track mappings at the moment. > } > void ZNMT::unmap(zaddress_unsafe addr, size_t size) { > // NMT doesn't track mappings at the moment. > } > > > As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. > > This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: > > 1. Implement VMA tracking using a balanced binary tree approach. 
Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this... Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision: - Chagne some comments - Delete old comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18289/files - new: https://git.openjdk.org/jdk/pull/18289/files/76c3bcfa..66d5c4f4 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=47 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=46-47 Stats: 7 lines in 1 file changed: 1 ins; 1 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/18289.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289 PR: https://git.openjdk.org/jdk/pull/18289 From aboldtch at openjdk.org Tue Apr 23 11:48:33 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Tue, 23 Apr 2024 11:48:33 GMT Subject: RFR: 8330253: Skip verify_consistent_lock_order when deoptimizing from monitorenter bytecode. [v6] In-Reply-To: References: <2QpelVQltaWXS_Yf-d0Uuu2j2mtiXoLhb8TJRliA3pk=.282244ed-1907-4362-a19d-4cdf6895af49@github.com> Message-ID: On Tue, 23 Apr 2024 08:07:40 GMT, Dean Long wrote: >> I see there is `_pending_monitorenter` but this would only handle synchronized method entry. > > I like the idea of a flag better, because it is foolproof. Why can't we set it in ObjectSynchronizer::enter? I don't think it matters if there is a safepoint check before that, because the lock stack is still consistent at that point. (Currently) the lock stack is always consistent at safepoints w.r.t. what is actually locked. However the lock stack may not be consistent with the most recent lock returned by the leaf `compiledVFrame::monitors()`.
But you are correct that it can probably be moved to `ObjectSynchronizer::enter`, as there are no safepoint polls between `SharedRuntime::monitor_enter_helper` and that point. Similarly there are no safepoint polls in the runtime until after `set_current_pending_monitor` is called. So with the following assumptions: 1. LockStack is consistent at safepoints w.r.t. locked monitors 2. No safepoint polls exist from the point that compiledVFrame::monitors() starts returning the monitorinfo for the currently executing monitorenter until either it calls into the runtime or finishes locking. I do not believe 1. is likely to ever change. But I have limited understanding of the validity of 2., and of whether it is something that can change. If both these assumptions are correct then simply skipping the verification when `deoptee->current_pending_monitor() != nullptr` would suffice. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18782#discussion_r1576112333 From azafari at openjdk.org Tue Apr 23 12:23:35 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Tue, 23 Apr 2024 12:23:35 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v7] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> <7TW9a7Vmnz0nIKq83rYx_VN13PXM9_9nD5iSMzGDfNw=.127fd0ff-ee60-40cf-9994-9a1e81bb5b27@github.com> <4p0uq_t37Fkj9fxqD1QC8TOkgAyyW1PVmTknURCquG4=.22b762b8-dea4-4fe3-a19f-d6a3f26c9f27@github.com> Message-ID: On Tue, 23 Apr 2024 09:02:49 GMT, Afshin Zafari wrote: >> David is right, this comment wasn't addressed. The code here went back and forth and we settled on hiding `ReservedMemoryRegion(address base, size_t size)` in a separate RFE. This means we probably should revert the usage of `flag` here and all the places that pass down `flag` just to reach this function.
> > We discussed that having flag here, we can use it for checking if the requested flag matches the actual memory flag or not. This check is missed now. > What to do? reverting all the calls up to `os::uncommit_memory()`? and reverting the `ExecMem` param as optional? > > Or adding check of flags? New info: I added the check and it always failed, meaning that no uncommit flag matches with commit flag. I will remove the mandatory flag from uncommit, but it hides the issue of non-matching commit-uncommit flags. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1576156475 From sjohanss at openjdk.org Tue Apr 23 12:37:46 2024 From: sjohanss at openjdk.org (Stefan Johansson) Date: Tue, 23 Apr 2024 12:37:46 GMT Subject: RFR: 8330626: ZGC: Windows address space placeholders not managed correctly [v2] In-Reply-To: References: Message-ID: > Please review this fix to correctly manage address space placeholders on Windows. > > **Summary** > On Windows, when using small pages, we use address space placeholders to ensure consistency of the address space. > > When a portion of the address space is mapped these placeholders are replaced by the actual backing and when doing this the size of the placeholder(s) needs to exactly match the size to be backed. For this reason, whenever address space is in use, we split the covering placeholder into multiple `ZGranuleSize` sized placeholders. > > During recent investigations into fragmentation of the ZGC address space, I found that there was a code path (**currently not in use**) that did not properly manage these placeholders and we could end up in situations where no placeholder was split off when a new chunk of `ZGranuleSize` size was requested. The problem is basically an off-by-one problem in the splitting code and the fix is to avoid this by changing it to first split the covering placeholder into two parts before splitting the part to be used into granules.
> > **Testing** > * Manual testing using the included GTest as well as sample applications previously triggering the error case. > * Tier 1-5 Generational ZGC testing (ongoing) Stefan Johansson has updated the pull request incrementally with two additional commits since the last revision: - Move GTEST_SKIP to setup function - StefanK review 2 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18912/files - new: https://git.openjdk.org/jdk/pull/18912/files/46d618f3..cae16fec Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18912&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18912&range=00-01 Stats: 21 lines in 1 file changed: 8 ins; 12 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18912.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18912/head:pull/18912 PR: https://git.openjdk.org/jdk/pull/18912 From stefank at openjdk.org Tue Apr 23 12:56:29 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 23 Apr 2024 12:56:29 GMT Subject: RFR: 8330626: ZGC: Windows address space placeholders not managed correctly [v2] In-Reply-To: References: Message-ID: On Tue, 23 Apr 2024 12:37:46 GMT, Stefan Johansson wrote: >> Please review this fix to correctly manage address space placeholders on Windows. >> >> **Summary** >> On Windows, when using small pages, we use address space placeholders to ensure consistency of the address space. >> >> When a portion of the address space is mapped these placeholders are replaced by the actual backing and when doing this the size of the placeholder(s) needs to exactly match the size to be backed. For this reason, whenever address space is in use, we split the covering placeholder into multiple `ZGranuleSize` sized placeholders. 
>> >> During recent investigations into fragmentation of the ZGC address space, I found that there was a code path (**currently not in use**) that did not properly manage these placeholders and we could end up in situations where no placeholder was split off when a new chunk of `ZGranuleSize` size was requested. The problem is basically an off-by-one problem in the splitting code and the fix is to avoid this by changing it to first split the covering placeholder into two parts before splitting the part to be used into granules. >> >> **Testing** >> * Manual testing using the included GTest as well as sample applications previously triggering the error case. >> * Tier 1-5 Generational ZGC testing (ongoing) > > Stefan Johansson has updated the pull request incrementally with two additional commits since the last revision: > > - Move GTEST_SKIP to setup function > - StefanK review 2 Looks good. Thanks for finding and fixing this issue! ------------- Marked as reviewed by stefank (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18912#pullrequestreview-2017165768 From aboldtch at openjdk.org Tue Apr 23 13:07:28 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Tue, 23 Apr 2024 13:07:28 GMT Subject: RFR: 8330626: ZGC: Windows address space placeholders not managed correctly [v2] In-Reply-To: References: Message-ID: On Tue, 23 Apr 2024 12:37:46 GMT, Stefan Johansson wrote: >> Please review this fix to correctly manage address space placeholders on Windows. >> >> **Summary** >> On Windows, when using small pages, we use address space placeholders to ensure consistency of the address space. >> >> When a portion of the address space is mapped these placeholders are replaced by the actual backing and when doing this the size of the placeholder(s) needs to exactly match the size to be backed. For this reason, whenever address space is in use, we split the covering placeholder into multiple `ZGranuleSize` sized placeholders.
>> >> During recent investigations into fragmentation of the ZGC address space, I found that there was a code path (**currently not in use**) that did not properly manage these placeholders and we could end up in situations where no placeholder was split off when a new chunk of `ZGranuleSize` size was requested. The problem is basically an off-by-one problem in the splitting code and the fix is to avoid this by changing it to first split the covering placeholder into two parts before splitting the part to be used into granules. >> >> **Testing** >> * Manual testing using the included GTest as well as sample applications previously triggering the error case. >> * Tier 1-5 Generational ZGC testing (ongoing) > > Stefan Johansson has updated the pull request incrementally with two additional commits since the last revision: > > - Move GTEST_SKIP to setup function > - StefanK review 2 The changes look good. There might be some value in adding a test which tests all 4 (or 6) ways we can add an area to the free list. When freeing area `A`: 1. `A` Not adjacent - Uses `create` 2. `A` In-between two areas (adjacent front and back of free areas) - Uses `grow_from_back` 3. `A` Only adjacent front of freed area - Uses `grow_from_front` 4. `A` Only adjacent back of freed area - Uses `grow_from_back` The rest are just special cases when we add at the end. 5. `A` Not adjacent, higher than all freed areas - Uses `create` 6. `A` Only adjacent back of last freed area - Uses `grow_from_back` ------------- Marked as reviewed by aboldtch (Reviewer).
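The cases listed in the review comment above can be exercised against a simplified model of a coalescing free list. The following is a minimal sketch, not ZGC's actual free-list code: `FreeList`, `Action` and `free_area` are hypothetical names, the list is modelled as an ordered map from start address to size, and only the case names `create`, `grow_from_front` and `grow_from_back` are borrowed from the message.

```cpp
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <map>

// Which kind of update a free operation resulted in (names mirror the review comment).
enum class Action { Create, GrowFromFront, GrowFromBack };

// Hypothetical coalescing free list over non-overlapping [start, start + size) areas.
class FreeList {
public:
  Action free_area(uintptr_t start, size_t size) {
    const uintptr_t end = start + size;
    auto next = _areas.lower_bound(start);  // first free area at or above start
    const bool has_prev = next != _areas.begin();
    auto prev = has_prev ? std::prev(next) : _areas.end();
    const bool touches_prev = has_prev && prev->first + prev->second == start;
    const bool touches_next = next != _areas.end() && next->first == end;

    if (touches_prev) {
      // The lower neighbour grows from its back; if the freed area also
      // touches the higher neighbour, that neighbour is absorbed as well.
      prev->second += size;
      if (touches_next) {
        prev->second += next->second;
        _areas.erase(next);
      }
      return Action::GrowFromBack;
    }
    if (touches_next) {
      // The higher neighbour grows from its front, so its start key changes.
      const size_t merged = size + next->second;
      _areas.erase(next);
      _areas.emplace(start, merged);
      return Action::GrowFromFront;
    }
    // No adjacent free area: record a new one.
    _areas.emplace(start, size);
    return Action::Create;
  }

  size_t count() const { return _areas.size(); }

private:
  std::map<uintptr_t, size_t> _areas;  // start -> size
};
```

In this model, cases 5 and 6 need no separate code path: they fall out of the same branches when there is no higher neighbour, which is the kind of property a test over all six cases could confirm.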
PR Review: https://git.openjdk.org/jdk/pull/18912#pullrequestreview-2017194931 From jsjolen at openjdk.org Tue Apr 23 13:11:43 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Tue, 23 Apr 2024 13:11:43 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v7] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> <7TW9a7Vmnz0nIKq83rYx_VN13PXM9_9nD5iSMzGDfNw=.127fd0ff-ee60-40cf-9994-9a1e81bb5b27@github.com> <4p0uq_t37Fkj9fxqD1QC8TOkgAyyW1PVmTknURCquG4=.22b762b8-dea4-4fe3-a19f-d6a3f26c9f27@github.com> Message-ID: On Tue, 23 Apr 2024 12:20:29 GMT, Afshin Zafari wrote: >> We discussed that having flag here, we can use it for checking if the requested flag matches the actual memory flag or not. This check is missed now. >> What to do? reverting all the calls up to `os::uncommit_memory()`? and reverting the `ExecMem` param as optional? >> >> Or adding check of flags? > > New info: > I added the check and it always failed, meaning that no uncommit flag matches with commit flag. > I will remove the mandatory flag from uncommit, but it hides the issue of non-matching commit-uncommit flags. Keeping the flag argument for `os::uncommit_memory` is important as it is equivalent to `reserve`:ing the memory. This makes future work easier, as we don't have to look at the region to figure out what flag it needs to be reserved as. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1576228846 From jsjolen at openjdk.org Tue Apr 23 13:44:59 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Tue, 23 Apr 2024 13:44:59 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v49] In-Reply-To: References: Message-ID: > Hi, > > This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. 
Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix. > > ## `MemoryFileTracker` > > The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: > > ```c++ > static MemoryFile* make_device(const char* descriptive_name); > static void free_device(MemoryFile* device); > > static void allocate_memory(MemoryFile* device, size_t offset, size_t size, > MEMFLAGS flag, const NativeCallStack& stack); > static void free_memory(MemoryFile* device, size_t offset, size_t size); > > > It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: > > ```c++ > void ZNMT::reserve(zaddress_unsafe start, size_t size) { > MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); > } > void ZNMT::commit(zoffset offset, size_t size) { > MemTracker::allocate_memory_in(ZNMT::_device, static_cast(offset), size, mtJavaHeap, CALLER_PC); > } > void ZNMT::uncommit(zoffset offset, size_t size) { > MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); > } > > void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { > // NMT doesn't track mappings at the moment. > } > void ZNMT::unmap(zaddress_unsafe addr, size_t size) { > // NMT doesn't track mappings at the moment. > } > > > As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. 
> > This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: > > 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this... Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: Move TreapNode into Treap ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18289/files - new: https://git.openjdk.org/jdk/pull/18289/files/66d5c4f4..c0ddb9ff Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=48 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=47-48 Stats: 92 lines in 2 files changed: 34 ins; 37 del; 21 mod Patch: https://git.openjdk.org/jdk/pull/18289.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289 PR: https://git.openjdk.org/jdk/pull/18289 From tschatzl at openjdk.org Tue Apr 23 14:59:27 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 23 Apr 2024 14:59:27 GMT Subject: RFR: 8330961: Remove redundant public specifier in ModRefBarrierSet In-Reply-To: References: Message-ID: On Tue, 23 Apr 2024 11:15:33 GMT, Albert Mingkun Yang wrote: > Trivial removing redundant code. Trivial. ------------- Marked as reviewed by tschatzl (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18911#pullrequestreview-2017521611 From ayang at openjdk.org Tue Apr 23 15:04:34 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 23 Apr 2024 15:04:34 GMT Subject: RFR: 8330961: Remove redundant public specifier in ModRefBarrierSet In-Reply-To: References: Message-ID: On Tue, 23 Apr 2024 11:15:33 GMT, Albert Mingkun Yang wrote: > Trivial removing redundant code. Thanks for review.
------------- PR Comment: https://git.openjdk.org/jdk/pull/18911#issuecomment-2072598729 From ayang at openjdk.org Tue Apr 23 15:04:35 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 23 Apr 2024 15:04:35 GMT Subject: Integrated: 8330961: Remove redundant public specifier in ModRefBarrierSet In-Reply-To: References: Message-ID: On Tue, 23 Apr 2024 11:15:33 GMT, Albert Mingkun Yang wrote: > Trivial removing redundant code. This pull request has now been integrated. Changeset: 2ea89268 Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/2ea89268a1af501fef4c1505a487e9ef5d5bda87 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod 8330961: Remove redundant public specifier in ModRefBarrierSet Reviewed-by: tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/18911 From matsaave at openjdk.org Tue Apr 23 15:05:33 2024 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Tue, 23 Apr 2024 15:05:33 GMT Subject: RFR: 8330388: Remove invokedynamic cache index encoding In-Reply-To: <-5i_BDguO1qWOP0GnYK4pTeMMW4IhlV3LkqLPFs4vAw=.060c849a-de1a-4888-943e-80b9ed4eecf2@github.com> References: <-5i_BDguO1qWOP0GnYK4pTeMMW4IhlV3LkqLPFs4vAw=.060c849a-de1a-4888-943e-80b9ed4eecf2@github.com> Message-ID: On Wed, 17 Apr 2024 22:48:16 GMT, Dean Long wrote: >> Before [JDK-8307190](https://bugs.openjdk.org/browse/JDK-8307190), [JDK-8309673](https://bugs.openjdk.org/browse/JDK-8309673), and [JDK-8301995](https://bugs.openjdk.org/browse/JDK-8301995), invokedynamic operands needed to be rewritten to encoded values to better distinguish indy entries from other cp cache entries. The above changes now distinguish between entries with `to_cp_index()` using the bytecode, which is now propagated by the callers. >> >> The encoding flips the bits of the index so the encoded index is always negative, leading to access errors if there is no matching decode call. These calls are removed with some methods adjusted to distinguish between indices with the bytecode. 
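The bit-flipping scheme described in the quoted text can be illustrated with a minimal sketch. The helper names below are hypothetical and chosen for illustration only; they are not taken from the patch. The point is that complementing a non-negative index always yields a negative value, so a caller that forgets the matching decode ends up indexing with a negative number.

```cpp
#include <cassert>

// Hypothetical helpers, for illustration only (names are assumptions).
// Flipping all bits of a non-negative int always produces a negative int.
static int encode_index_sketch(int i) { return ~i; }

// Decoding is the same operation: ~~i == i.
static int decode_index_sketch(int i) { return ~i; }

// An encoded index can be recognised by its sign alone.
static bool is_encoded_sketch(int i) { return i < 0; }
```

For example, index 5 encodes to -6 and decodes back to 5; using -6 directly as an array index is the kind of access error the quoted description refers to.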
Verified with tier 1-5 tests. The changes show no issues when tested against libgraal. > > Did you consider minimizing changes by leaving decode_invokedynamic_index/encode_invokedynamic_index calls in place, but having the implementations not change the value? Thanks for the reviews @dean-long @gilles-duboscq @coleenp and @plummercj! ------------- PR Comment: https://git.openjdk.org/jdk/pull/18819#issuecomment-2072603376 From matsaave at openjdk.org Tue Apr 23 15:05:34 2024 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Tue, 23 Apr 2024 15:05:34 GMT Subject: Integrated: 8330388: Remove invokedynamic cache index encoding In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 15:26:52 GMT, Matias Saavedra Silva wrote: > Before [JDK-8307190](https://bugs.openjdk.org/browse/JDK-8307190), [JDK-8309673](https://bugs.openjdk.org/browse/JDK-8309673), and [JDK-8301995](https://bugs.openjdk.org/browse/JDK-8301995), invokedynamic operands needed to be rewritten to encoded values to better distinguish indy entries from other cp cache entries. The above changes now distinguish between entries with `to_cp_index()` using the bytecode, which is now propagated by the callers. > > The encoding flips the bits of the index so the encoded index is always negative, leading to access errors if there is no matching decode call. These calls are removed with some methods adjusted to distinguish between indices with the bytecode. Verified with tier 1-5 tests. The changes show no issues when tested against libgraal. This pull request has now been integrated. 
Changeset: 383fe6ea Author: Matias Saavedra Silva URL: https://git.openjdk.org/jdk/commit/383fe6eaab423a1218c9915362f691472e3773e7 Stats: 225 lines in 37 files changed: 15 ins; 137 del; 73 mod 8330388: Remove invokedynamic cache index encoding Reviewed-by: cjplummer, dlong, coleenp ------------- PR: https://git.openjdk.org/jdk/pull/18819 From aph at openjdk.org Tue Apr 23 15:13:29 2024 From: aph at openjdk.org (Andrew Haley) Date: Tue, 23 Apr 2024 15:13:29 GMT Subject: RFR: 8330171: Lazy W^X switch implementation In-Reply-To: References: <9eymaXovxUNFdkAkzojFQP5trwl_yyY0jE2GzcMEjR4=.02ee2ef9-c476-4c7c-9e4a-e021425c38bc@github.com> Message-ID: On Fri, 19 Apr 2024 07:44:17 GMT, Richard Reingruber wrote: >> An alternative for preemptively switching the W^X thread mode on macOS with an AArch64 CPU. This implementation triggers the switch in response to the SIGBUS signal if the *si_addr* belongs to the CodeCache area. With this approach, it is now feasible to eliminate all WX guards and avoid potentially costly operations. However, no significant improvement or degradation in performance has been observed. Additionally, considering the issue with AsyncGetCallTrace, the patched JVM has been successfully operated with [asgct_bottom](https://github.com/parttimenerd/asgct_bottom) and [async-profiler](https://github.com/async-profiler/async-profiler). >> >> Additional testing: >> - [x] MacOS AArch64 server fastdebug *gtest* >> - [ ] MacOS AArch64 server fastdebug *jtreg:hotspot:tier4* >> - [ ] Benchmarking >> >> @apangin and @parttimenerd could you please check the patch on your scenarios? > > What about granting `WXWrite` only if the current thread is in `_thread_in_vm`? > That would be more restrictive and roughly equivalent to how it currently works. Likely there are some places then that should be granted `WXWrite` eagerly because they need `WXWrite` without `_thread_in_vm`. E.g.
the JIT compiler threads should have `WXWrite` and never `WXExec` (I assume) which should be checked in the signal handler. > The patch doesn't protect against native agents, as this is obviously impossible. The current code doesn't do that either. For the bytecode, it doesn't prevent the attacker from abusing unsafe api to modify code cache. However unsafe functions are already considered "safe" and we proactively enable WXWrite as well as move thread to `_thread_in_vm` state (@reinrich). JITed code can't write to the cache either with or without the patch. > > I totally get the sense of loss of security. But is this really the case? I think it is. W^X is intended (amongst other things) to protect against the use of gadgets, from buffer overflow exploits in non-java code to ROP programming. At present, in order to generate code and execute it, you first have to be able to make the JIT code writable, then write the code, then make it executable, then jump to the code. And the exploit writer might have to do some or all of this by finding gadgets. If we were to merge this patch then all the attacker would have to do is write code to memory and find a way to jump to it, and the automatic switch-on-segfault in this patch would do all the work the attacker needs. It makes far more sense to tag those places that actually need to change W^X access, and only switch there. You could argue that any switching of W^X on a write to code space, then switching it back on jumping (or returning) to Java code, even what we already do, is effectively the same thing. Kinda, but it's not on just any attempt to write to code space or any attempt to jump into code, it's at the places we choose, and we can be careful to limit those places. But surely the JDK is not the most vulnerable part of the stack anyway? I'd agree with that, of course, but I don't think that's sufficient reason to decide to bypass an OS security mechanism. We are trying to reduce the size of the attack surface.
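The explicit-tagging approach argued for above can be sketched as a scoped guard. This is a minimal, portable illustration under stated assumptions, not HotSpot's implementation: a thread-local flag stands in for the real per-thread W^X state (which on macOS/AArch64 is toggled with `pthread_jit_write_protect_np()`), and every name in the block is hypothetical.

```cpp
#include <cassert>

// Illustrative only: this flag simulates the per-thread W^X state.
// All names in this sketch are assumptions, not HotSpot identifiers.
enum class WXModeSketch { Exec, Write };

static thread_local WXModeSketch current_mode_sketch = WXModeSketch::Exec;

// RAII guard: grants write access for one explicitly tagged scope and
// restores the previous mode when the scope ends.
class ScopedWXWriteSketch {
  WXModeSketch _saved;
 public:
  ScopedWXWriteSketch() : _saved(current_mode_sketch) {
    current_mode_sketch = WXModeSketch::Write;
  }
  ~ScopedWXWriteSketch() { current_mode_sketch = _saved; }
  ScopedWXWriteSketch(const ScopedWXWriteSketch&) = delete;
  ScopedWXWriteSketch& operator=(const ScopedWXWriteSketch&) = delete;
};

// Stand-in for a code-patching routine: it succeeds only when the caller
// has explicitly entered write mode, rather than relying on a fault
// handler to flip the mode on demand.
static bool patch_code_sketch() {
  return current_mode_sketch == WXModeSketch::Write;
}
```

With this shape, write access exists only inside scopes that construct the guard, which is what keeps the set of switching points small and reviewable; a fault-driven switch has no such list of places to audit.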
------------- PR Comment: https://git.openjdk.org/jdk/pull/18762#issuecomment-2072639243 From azafari at openjdk.org Tue Apr 23 15:43:34 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Tue, 23 Apr 2024 15:43:34 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v49] In-Reply-To: References: Message-ID: On Tue, 23 Apr 2024 13:44:59 GMT, Johan Sjölen wrote: >> Hi, >> >> This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix. >> >> ## `MemoryFileTracker` >> >> The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: >> >> ```c++ >> static MemoryFile* make_device(const char* descriptive_name); >> static void free_device(MemoryFile* device); >> >> static void allocate_memory(MemoryFile* device, size_t offset, size_t size, >> MEMFLAGS flag, const NativeCallStack& stack); >> static void free_memory(MemoryFile* device, size_t offset, size_t size); >> >> >> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: >> >> ```c++ >> void ZNMT::reserve(zaddress_unsafe start, size_t size) { >> MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); >> } >> void ZNMT::commit(zoffset offset, size_t size) { >> MemTracker::allocate_memory_in(ZNMT::_device, static_cast(offset), size, mtJavaHeap, CALLER_PC); >> } >> void ZNMT::uncommit(zoffset offset, size_t size) { >> MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); >> } >> >> void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { >> // NMT doesn't track mappings at the moment.
>> } >> void ZNMT::unmap(zaddress_unsafe addr, size_t size) { >> // NMT doesn't track mappings at the moment. >> } >> >> >> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. >> >> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: >> >> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. Our Treap-based approach in this patch gives a performance bo... > > Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: > > Move TreapNode into Treap Thanks again for the great work you've done for this PR. Some small suggestions are given. src/hotspot/share/utilities/nativeCallStack.hpp line 1: > 1: /* Copyright. test/hotspot/jtreg/gc/z/TestZNMT.java line 1: > 1: /* Copyright. ------------- PR Review: https://git.openjdk.org/jdk/pull/18289#pullrequestreview-2017130955 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576474648 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576480678 From azafari at openjdk.org Tue Apr 23 15:43:47 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Tue, 23 Apr 2024 15:43:47 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v46] In-Reply-To: References: Message-ID: On Mon, 22 Apr 2024 16:00:51 GMT, Johan Sjölen wrote: >> Hi, >> >> This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space.
This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix. >> >> ## `MemoryFileTracker` >> >> The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: >> >> ```c++ >> static MemoryFile* make_device(const char* descriptive_name); >> static void free_device(MemoryFile* device); >> >> static void allocate_memory(MemoryFile* device, size_t offset, size_t size, >> MEMFLAGS flag, const NativeCallStack& stack); >> static void free_memory(MemoryFile* device, size_t offset, size_t size); >> >> >> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: >> >> ```c++ >> void ZNMT::reserve(zaddress_unsafe start, size_t size) { >> MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); >> } >> void ZNMT::commit(zoffset offset, size_t size) { >> MemTracker::allocate_memory_in(ZNMT::_device, static_cast(offset), size, mtJavaHeap, CALLER_PC); >> } >> void ZNMT::uncommit(zoffset offset, size_t size) { >> MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); >> } >> >> void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { >> // NMT doesn't track mappings at the moment. >> } >> void ZNMT::unmap(zaddress_unsafe addr, size_t size) { >> // NMT doesn't track mappings at the moment. >> } >> >> >> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. 
>> >> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: >> >> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. Our Treap-based approach in this patch gives a performance bo... > > Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision: > > - Remove faulty condition after removing merging > - Add failing test case src/hotspot/share/gc/z/zInitialize.cpp line 1: > 1: /* Copyright is to be updated. src/hotspot/share/gc/z/zNMT.cpp line 45: > 43: > 44: void ZNMT::commit(zoffset offset, size_t size) { > 45: MemTracker::allocate_memory_in(ZNMT::_device, untype(offset), size, mtJavaHeap, CALLER_PC); `NativeCallStack` param should be before the `MEMFLAGS` param, same as other functions. src/hotspot/share/nmt/memBaseline.cpp line 1: > 1: /* Copyright to be updated. src/hotspot/share/nmt/memReporter.cpp line 1: > 1: /* Copyright. src/hotspot/share/nmt/memReporter.hpp line 1: > 1: /* Copyright. src/hotspot/share/nmt/memTracker.cpp line 1: > 1: /* Copyright. src/hotspot/share/nmt/memTracker.hpp line 1: > 1: /* Copyright. src/hotspot/share/nmt/memTracker.hpp line 177: > 175: } > 176: > 177: static inline void remove_device(MemoryFileTracker::MemoryFile* device) { check device != nullptr. src/hotspot/share/nmt/memTracker.hpp line 184: > 182: } > 183: > 184: static inline void allocate_memory_in(MemoryFileTracker::MemoryFile* device, size_t offset, size_t size, invalid args: `nullptr` and `size == 0`. src/hotspot/share/nmt/memTracker.hpp line 192: > 190: } > 191: > 192: static inline void free_memory_in(MemoryFileTracker::MemoryFile* device, same here. src/hotspot/share/nmt/nmtMemoryFileTracker.cpp line 43: > 41: } > 42: > 43: void MemoryFileTracker::allocate_memory(MemoryFile* device, size_t offset, check/assert `device == nullptr`.
src/hotspot/share/nmt/nmtMemoryFileTracker.cpp line 45: > 43: void MemoryFileTracker::allocate_memory(MemoryFile* device, size_t offset, > 44: size_t size, MEMFLAGS flag, > 45: const NativeCallStack& stack) { indentation does not match with the line above. src/hotspot/share/nmt/nmtMemoryFileTracker.cpp line 47: > 45: const NativeCallStack& stack) { > 46: NativeCallStackStorage::StackIndex sidx = _stack_storage.push(stack); > 47: DeviceSpace::Metadata metadata(sidx, flag); Can `Metadata` ctor get a `NativeCallStack` instead of an index? StackIndex is not used for the rest of the code. src/hotspot/share/nmt/nmtMemoryFileTracker.cpp line 48: > 46: NativeCallStackStorage::StackIndex sidx = _stack_storage.push(stack); > 47: DeviceSpace::Metadata metadata(sidx, flag); > 48: DeviceSpace::SummaryDiff diff = device->_tree.reserve_mapping(offset, size, metadata); What if `size == 0`? src/hotspot/share/nmt/nmtMemoryFileTracker.cpp line 52: > 50: const VMATree::SingleDiff& rescom = diff.flag[i]; > 51: VirtualMemory* summary = device->_summary.by_type(NMTUtil::index_to_flag(i)); > 52: summary->reserve_memory(rescom.reserve); `diff.flag[i]` can be used instead of `rescom` and the corresponding line can be removed. src/hotspot/share/nmt/nmtMemoryFileTracker.cpp line 56: > 54: } > 55: > 56: void MemoryFileTracker::free_memory(MemoryFile* device, size_t offset, size_t size) { same comments here. src/hotspot/share/nmt/nmtMemoryFileTracker.cpp line 65: > 63: } > 64: > 65: void MemoryFileTracker::print_report_on(const MemoryFile* device, outputStream* stream, size_t scale) { check for invalid arguments: `nullptr` and `0`. src/hotspot/share/nmt/nmtMemoryFileTracker.cpp line 85: > 83: NMTUtil::scale_name(scale), > 84: NMTUtil::flag_to_name(pval.out.metadata().flag)); > 85: pval.out.metadata().stack_idx.stack().print_on(stream, 4); Why hard coded `4`? Is it the depth of stack?
src/hotspot/share/nmt/nmtMemoryFileTracker.cpp line 85: > 83: NMTUtil::scale_name(scale), > 84: NMTUtil::flag_to_name(pval.out.metadata().flag)); > 85: pval.out.metadata().stack_idx.stack().print_on(stream, 4); Also, if `IntervalChange` has some wrappers we can write: `pval.out_stack().print_on()` or `pval.out_type()`. src/hotspot/share/nmt/nmtMemoryFileTracker.cpp line 121: > 119: } > 120: > 121: void MemoryFileTracker::Instance::allocate_memory(MemoryFile* device, size_t offset, check invalid args. src/hotspot/share/nmt/nmtMemoryFileTracker.cpp line 123: > 121: void MemoryFileTracker::Instance::allocate_memory(MemoryFile* device, size_t offset, > 122: size_t size, MEMFLAGS flag, > 123: const NativeCallStack& stack) { indentation ... src/hotspot/share/nmt/nmtMemoryFileTracker.cpp line 124: > 122: size_t size, MEMFLAGS flag, > 123: const NativeCallStack& stack) { > 124: _tracker->allocate_memory(device, offset, size, flag, stack); `_tracker` can be `nullptr` if `initialize` is not called or if it failed to allocate. Maybe `!enabled()` is to be used here to check it. This also applies for any further use of `_tracker` in the subsequent functions. src/hotspot/share/nmt/nmtMemoryFileTracker.cpp line 127: > 125: } > 126: > 127: void MemoryFileTracker::Instance::free_memory(MemoryFile* device, size_t offset, check of invalid args. src/hotspot/share/nmt/nmtMemoryFileTracker.cpp line 128: > 126: > 127: void MemoryFileTracker::Instance::free_memory(MemoryFile* device, size_t offset, > 128: size_t size) { indentation. src/hotspot/share/nmt/nmtMemoryFileTracker.cpp line 138: > 136: > 137: void MemoryFileTracker::Instance::print_report_on(const MemoryFile* device, > 138: outputStream* stream, size_t scale) { indentation. 
src/hotspot/share/nmt/nmtMemoryFileTracker.cpp line 152: > 150: auto snap = snapshot->by_type(NMTUtil::index_to_flag(i)); > 151: auto current = device->_summary.by_type(NMTUtil::index_to_flag(i)); > 152: // PDT stores the memory as reserved but it's accounted as committed. What does PDT stand for? src/hotspot/share/nmt/nmtMemoryFileTracker.cpp line 153: > 151: auto current = device->_summary.by_type(NMTUtil::index_to_flag(i)); > 152: // PDT stores the memory as reserved but it's accounted as committed. > 153: snap->commit_memory(current->reserved()); `VirtualMemorySnapshot` contains both `reserved` and `committed` amounts. If `current->reserved()` is used for `committed` amount, what is the `reserved` amount then? src/hotspot/share/nmt/nmtMemoryFileTracker.hpp line 49: > 47: > 48: // Each device has its own memory space. > 49: using DeviceSpace = VMATree; `DeviceSpace` is used only 3 times in 2 methods in cpp file. `VMATree` is also used all around the code. `VMATree` is preferable then. src/hotspot/share/nmt/nmtMemoryFileTracker.hpp line 62: > 60: MemoryFile(const char* descriptive_name) > 61: : _descriptive_name(descriptive_name) { > 62: } can fit into 1 line. src/hotspot/share/nmt/nmtTreap.hpp line 26: > 24: > 25: #ifndef SHARE_NMT_TREAP_HPP > 26: #define SHARE_NMT_TREAP_HPP SHARE_NMT_NMTTREAP_HPP src/hotspot/share/nmt/nmtTreap.hpp line 285: > 283: using TreapCHeap = Treap; > 284: > 285: #endif //SHARE_NMT_TREAP_HPP SHARE_NMT_NMTTREAP_HPP src/hotspot/share/nmt/virtualMemoryTracker.hpp line 1: > 1: /* Copyright. src/hotspot/share/nmt/vmatree.cpp line 36: > 34: MEMFLAGS flag_out() const { > 35: return state.out.metadata().flag; > 36: } can fit in 1 line. src/hotspot/share/nmt/vmatree.cpp line 42: > 40: // Motivating example: reserve(0,100, mtNMT); reserve(50,75, mtTest); > 41: // This will require the 2nd call to know which region the second reserve 'smashes' a hole into for proper summary accounting. 
> 42: // LEQ_A is figured out a bit later on, as we need to find it for other purposes anyway. Let's have a right margin for this part, at column 85 for example. src/hotspot/share/nmt/vmatree.cpp line 188: > 186: // LEQ_A - A - B - GEQ_B > 187: auto& rescom = diff.flag[NMTUtil::flag_to_index(LEQ_A.flag_out())]; > 188: if (LEQ_A.state.out.type() == StateType::Reserved) { Would be nice to have wrappers that allow us write these as: `LEQ_A.is_out_reserved()` or `LEQ_A.is_out_committed()` src/hotspot/share/nmt/vmatree.cpp line 201: > 199: }); > 200: > 201: AddressState prev = {A, stA}; // stA is just filler `AddressState prev{A, stA};` would be like other instances of ctor. I.e., remove the `=` sign. src/hotspot/share/nmt/vmatree.cpp line 242: > 240: } > 241: return diff; > 242: } Would be nice if we can break this function into some smaller sub-functions. It is 200+ line now and little hard to track the logic. Thanks! src/hotspot/share/nmt/vmatree.hpp line 44: > 42: class AddressComparator { > 43: public: > 44: static int cmp(size_t a, size_t b) { Why `size_t` and not `address`? src/hotspot/share/nmt/vmatree.hpp line 56: > 54: > 55: // Each point has some stack and a flag associated with it. > 56: struct Metadata { `State` and `Metadata` are attributes of a Node and not to be in VMATree. src/hotspot/share/nmt/vmatree.hpp line 63: > 61: : stack_idx(), > 62: flag(mtNone) { > 63: } can fit in 1 line. src/hotspot/share/nmt/vmatree.hpp line 70: > 68: static bool equals(const Metadata& a, const Metadata& b) { > 69: return NativeCallStackStorage::StackIndex::equals(a.stack_idx, b.stack_idx) && > 70: a.flag == b.flag; `a.flag == b.flag` can be left-hand of `&&` to be more efficient. src/hotspot/share/nmt/vmatree.hpp line 113: > 111: }; > 112: > 113: using VTreap = TreapNode; Why `VTreap` and not `TreapNode`? What does the `V` alone say? src/hotspot/share/nmt/vmatree.hpp line 118: > 116: VMATree() > 117: : tree() { > 118: } fit into 1 line. 
src/hotspot/share/nmt/vmatree.hpp line 135: > 133: SummaryDiff register_mapping(size_t A, size_t B, StateType state, Metadata& metadata); > 134: > 135: SummaryDiff reserve_mapping(size_t from, size_t sz, Metadata& metadata) { If we use `reserve_mapping` for `uncommit_memory`, we need to set a `StackIndex` and a `MEMFLAGS` to pass as a `Metadata`. If we use `mtNone` for example, all the uncommitted amount would be accounted for `mtNone`. Would you please provide a `uncommit_mapping(address, size)` to handle these issues properly? src/hotspot/share/nmt/vmatree.hpp line 139: > 137: } > 138: > 139: SummaryDiff commit_mapping(size_t from, size_t sz, Metadata& metadata) { `size_t` or `address` for `from`? src/hotspot/share/nmt/vmatree.hpp line 145: > 143: SummaryDiff release_mapping(size_t from, size_t sz) { > 144: Metadata empty; > 145: return register_mapping(from, from + sz, StateType::Released, empty); `return register_mapping(from, from + sz, StateType::Released, Metadata{});` ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576183035 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576230158 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576236166 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576237931 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576240387 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576242164 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576244061 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576246868 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576247885 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576248209 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576191957 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576254696 PR Review Comment: 
https://git.openjdk.org/jdk/pull/18289#discussion_r1576196658 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576201734 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576203893 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576205419 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576207394 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576213967 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576219519 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576256738 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576255953 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576278534 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576265787 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576257264 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576281076 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576285582 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576287500 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576297271 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576300493 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576310421 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576324616 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576325346 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576328824 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576328036 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576352240 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576349129 PR Review Comment: 
https://git.openjdk.org/jdk/pull/18289#discussion_r1576360411 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576362112 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576399517 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576363348 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576371244 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576408517 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576404851 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576387513 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576403278 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576388843 From gli at openjdk.org Tue Apr 23 16:04:37 2024 From: gli at openjdk.org (Guoxiong Li) Date: Tue, 23 Apr 2024 16:04:37 GMT Subject: RFR: 8330155: Serial: Remove TenuredSpace [v2] In-Reply-To: References: Message-ID: > Hi all, > > This patch removes the class `TenuredSpace` and adjusts its usages. After removing `TenuredSpace`, the file `space.inline.hpp` is empty, so I remove this file and change the included header file to `space.hpp`. > > The test `make test-tier1_gc` passed locally. Thanks for taking the time to review. > > Best Regards, > -- Guoxiong Guoxiong Li has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains three commits: - Remove file. 
- Merge branch 'master' into REMOVE_TENURED_SPACE # Conflicts: # src/hotspot/share/gc/serial/defNewGeneration.inline.hpp - JDK-8330155 ------------- Changes: https://git.openjdk.org/jdk/pull/18894/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18894&range=01 Stats: 161 lines in 20 files changed: 10 ins; 127 del; 24 mod Patch: https://git.openjdk.org/jdk/pull/18894.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18894/head:pull/18894 PR: https://git.openjdk.org/jdk/pull/18894 From epeter at openjdk.org Tue Apr 23 16:11:31 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 23 Apr 2024 16:11:31 GMT Subject: RFR: 8330844: Add aliases for conditional jumps and additional instruction forms for x86 [v3] In-Reply-To: References: <-wAKj3RvMqUO3iphA6bA34ilTcM9LkZACKco20ppkE0=.a5d31aa7-9423-477e-9a90-749018d2a12d@github.com> Message-ID: On Mon, 22 Apr 2024 22:10:56 GMT, Scott Gibbons wrote: >> Adding infrastructure for JDK-8320448. Aliasing conditional jump instructions; adding some x86 instructions. > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Revert changes to arrays_equals Looks reasonable. Can you fix the indentation issue, please?
src/hotspot/cpu/x86/macroAssembler_x86.hpp line 992: > 990: // * No condition for this * void ALWAYSINLINE jecxz(Label& L, bool maybe_short = true) { jcc(Assembler::cxz, L, maybe_short); } > 991: > 992: // Short versions of the above Suggestion: // * No condition for this * void ALWAYSINLINE jcxz(Label& L, bool maybe_short = true) { jcc(Assembler::cxz, L, maybe_short); } // * No condition for this * void ALWAYSINLINE jecxz(Label& L, bool maybe_short = true) { jcc(Assembler::cxz, L, maybe_short); } // Short versions of the above src/hotspot/cpu/x86/macroAssembler_x86.hpp line 1024: > 1022: void ALWAYSINLINE jpo_b(Label& L) { jccb(Assembler::noParity, L); } > 1023: // * No condition for this * void ALWAYSINLINE jcxz_b(Label& L) { jccb(Assembler::cxz, L); } > 1024: // * No condition for this * void ALWAYSINLINE jecxz_b(Label& L) { jccb(Assembler::cxz, L); } Suggestion: // * No condition for this * void ALWAYSINLINE jcxz_b(Label& L) { jccb(Assembler::cxz, L); } // * No condition for this * void ALWAYSINLINE jecxz_b(Label& L) { jccb(Assembler::cxz, L); } ------------- Marked as reviewed by epeter (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18893#pullrequestreview-2017688168 PR Review Comment: https://git.openjdk.org/jdk/pull/18893#discussion_r1576516938 PR Review Comment: https://git.openjdk.org/jdk/pull/18893#discussion_r1576517298 From epeter at openjdk.org Tue Apr 23 16:11:31 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 23 Apr 2024 16:11:31 GMT Subject: RFR: 8330844: Add aliases for conditional jumps and additional instruction forms for x86 [v3] In-Reply-To: References: <-wAKj3RvMqUO3iphA6bA34ilTcM9LkZACKco20ppkE0=.a5d31aa7-9423-477e-9a90-749018d2a12d@github.com> Message-ID: <296gr70D3-VHUuwQSXcoRpK9jeNArpSYmJEN6u5rc8Y=.06167515-96dd-45dc-8bb2-ae26c7e967e9@github.com> On Tue, 23 Apr 2024 16:03:42 GMT, Emanuel Peter wrote: >> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: >> >> Revert changes to arrays_equals > > src/hotspot/cpu/x86/macroAssembler_x86.hpp line 992: > >> 990: // * No condition for this * void ALWAYSINLINE jecxz(Label& L, bool maybe_short = true) { jcc(Assembler::cxz, L, maybe_short); } >> 991: >> 992: // Short versions of the above > > Suggestion: > > // * No condition for this * void ALWAYSINLINE jcxz(Label& L, bool maybe_short = true) { jcc(Assembler::cxz, L, maybe_short); } > // * No condition for this * void ALWAYSINLINE jecxz(Label& L, bool maybe_short = true) { jcc(Assembler::cxz, L, maybe_short); } > > // Short versions of the above Everywhere else it is indented, so it would be nice if this kept the style ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18893#discussion_r1576523367 From sgibbons at openjdk.org Tue Apr 23 16:18:42 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Tue, 23 Apr 2024 16:18:42 GMT Subject: RFR: 8330844: Add aliases for conditional jumps and additional instruction forms for x86 [v4] In-Reply-To: <-wAKj3RvMqUO3iphA6bA34ilTcM9LkZACKco20ppkE0=.a5d31aa7-9423-477e-9a90-749018d2a12d@github.com> References: 
<-wAKj3RvMqUO3iphA6bA34ilTcM9LkZACKco20ppkE0=.a5d31aa7-9423-477e-9a90-749018d2a12d@github.com> Message-ID: > Adding infrastructure for JDK-8320448. Aliasing conditional jump instructions; adding some x86 instructions. Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Comment indentation ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18893/files - new: https://git.openjdk.org/jdk/pull/18893/files/f7d7f7de..fe5f3060 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18893&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18893&range=02-03 Stats: 5 lines in 1 file changed: 0 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/18893.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18893/head:pull/18893 PR: https://git.openjdk.org/jdk/pull/18893 From epeter at openjdk.org Tue Apr 23 16:18:42 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 23 Apr 2024 16:18:42 GMT Subject: RFR: 8330844: Add aliases for conditional jumps and additional instruction forms for x86 [v4] In-Reply-To: References: <-wAKj3RvMqUO3iphA6bA34ilTcM9LkZACKco20ppkE0=.a5d31aa7-9423-477e-9a90-749018d2a12d@github.com> Message-ID: On Tue, 23 Apr 2024 16:15:48 GMT, Scott Gibbons wrote: >> Adding infrastructure for JDK-8320448. Aliasing conditional jump instructions; adding some x86 instructions. > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Comment indentation Marked as reviewed by epeter (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/18893#pullrequestreview-2017712371 From epeter at openjdk.org Tue Apr 23 16:18:42 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 23 Apr 2024 16:18:42 GMT Subject: RFR: 8330844: Add aliases for conditional jumps and additional instruction forms for x86 [v3] In-Reply-To: References: <-wAKj3RvMqUO3iphA6bA34ilTcM9LkZACKco20ppkE0=.a5d31aa7-9423-477e-9a90-749018d2a12d@github.com> Message-ID: On Mon, 22 Apr 2024 22:10:56 GMT, Scott Gibbons wrote: >> Adding infrastructure for JDK-8320448. Aliasing conditional jump instructions; adding some x86 instructions. > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Revert changes to arrays_equals Thanks for the update. I can sponsor as soon as you attempt integration again. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18893#issuecomment-2072847673 From sgibbons at openjdk.org Tue Apr 23 16:18:42 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Tue, 23 Apr 2024 16:18:42 GMT Subject: RFR: 8330844: Add aliases for conditional jumps and additional instruction forms for x86 [v3] In-Reply-To: <296gr70D3-VHUuwQSXcoRpK9jeNArpSYmJEN6u5rc8Y=.06167515-96dd-45dc-8bb2-ae26c7e967e9@github.com> References: <-wAKj3RvMqUO3iphA6bA34ilTcM9LkZACKco20ppkE0=.a5d31aa7-9423-477e-9a90-749018d2a12d@github.com> <296gr70D3-VHUuwQSXcoRpK9jeNArpSYmJEN6u5rc8Y=.06167515-96dd-45dc-8bb2-ae26c7e967e9@github.com> Message-ID: <4o4Twie-xnr15IYktMLGyAv8CC6Gp-A0hrwQO8Xozy0=.e58aafe0-bc28-456e-a8de-5973b3a10388@github.com> On Tue, 23 Apr 2024 16:08:25 GMT, Emanuel Peter wrote: >> src/hotspot/cpu/x86/macroAssembler_x86.hpp line 992: >> >>> 990: // * No condition for this * void ALWAYSINLINE jecxz(Label& L, bool maybe_short = true) { jcc(Assembler::cxz, L, maybe_short); } >>> 991: >>> 992: // Short versions of the above >> >> Suggestion: >> >> // * No condition for this * void ALWAYSINLINE jcxz(Label& L, bool 
maybe_short = true) { jcc(Assembler::cxz, L, maybe_short); } >> // * No condition for this * void ALWAYSINLINE jecxz(Label& L, bool maybe_short = true) { jcc(Assembler::cxz, L, maybe_short); } >> >> // Short versions of the above > > Everywhere else it is indented, so it would be nice if this kept the style Done ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18893#discussion_r1576529956 From sgibbons at openjdk.org Tue Apr 23 16:18:42 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Tue, 23 Apr 2024 16:18:42 GMT Subject: RFR: 8330844: Add aliases for conditional jumps and additional instruction forms for x86 [v3] In-Reply-To: References: <-wAKj3RvMqUO3iphA6bA34ilTcM9LkZACKco20ppkE0=.a5d31aa7-9423-477e-9a90-749018d2a12d@github.com> Message-ID: On Tue, 23 Apr 2024 16:03:55 GMT, Emanuel Peter wrote: >> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: >> >> Revert changes to arrays_equals > > src/hotspot/cpu/x86/macroAssembler_x86.hpp line 1024: > >> 1022: void ALWAYSINLINE jpo_b(Label& L) { jccb(Assembler::noParity, L); } >> 1023: // * No condition for this * void ALWAYSINLINE jcxz_b(Label& L) { jccb(Assembler::cxz, L); } >> 1024: // * No condition for this * void ALWAYSINLINE jecxz_b(Label& L) { jccb(Assembler::cxz, L); } > > Suggestion: > > // * No condition for this * void ALWAYSINLINE jcxz_b(Label& L) { jccb(Assembler::cxz, L); } > // * No condition for this * void ALWAYSINLINE jecxz_b(Label& L) { jccb(Assembler::cxz, L); } Done ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18893#discussion_r1576529780 From stuefe at openjdk.org Tue Apr 23 16:49:34 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 23 Apr 2024 16:49:34 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v13] In-Reply-To: References: 
<5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: On Tue, 23 Apr 2024 08:56:55 GMT, Afshin Zafari wrote: >> src/hotspot/share/memory/metaspace/virtualSpaceNode.cpp line 262: >> >>> 260: vm_exit_out_of_memory(word_size * BytesPerWord, OOM_MMAP_ERROR, "Failed to reserve memory for metaspace"); >>> 261: } >>> 262: MemTracker::record_virtual_memory_type(rs.base(), mtMetaspace); >> >> Looking at this, I don't particularly like it, but it is pre-existing. The fact that we hard-wire mtMetaspace works now relies on the fact that mtClass and mtMetaspace (as of now, the only two flags that are being used) are using different allocation paths. Long term we should change this. > > Should I create a RFE for it? Sure, go ahead. You can assign this to me, if you want. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1576572430 From sgibbons at openjdk.org Tue Apr 23 16:56:31 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Tue, 23 Apr 2024 16:56:31 GMT Subject: RFR: 8330844: Add aliases for conditional jumps and additional instruction forms for x86 [v4] In-Reply-To: References: <-wAKj3RvMqUO3iphA6bA34ilTcM9LkZACKco20ppkE0=.a5d31aa7-9423-477e-9a90-749018d2a12d@github.com> Message-ID: On Tue, 23 Apr 2024 16:18:42 GMT, Scott Gibbons wrote: >> Adding infrastructure for JDK-8320448. Aliasing conditional jump instructions; adding some x86 instructions. > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Comment indentation Thank you. 
I'm waiting on @sviswa7 review before integrating again ------------- PR Comment: https://git.openjdk.org/jdk/pull/18893#issuecomment-2072920449 From gli at openjdk.org Tue Apr 23 17:22:41 2024 From: gli at openjdk.org (Guoxiong Li) Date: Tue, 23 Apr 2024 17:22:41 GMT Subject: RFR: 8330155: Serial: Remove TenuredSpace [v3] In-Reply-To: References: Message-ID: > Hi all, > > This patch removes the class `TenuredSpace` and adjusts its usages. After removing `TenuredSpace`, the file `space.inline.hpp` is empty, so I remove this file and change the included header file to `space.hpp`. > > The test `make test-tier1_gc` passed locally. Thanks for taking the time to review. > > Best Regards, > -- Guoxiong Guoxiong Li has updated the pull request incrementally with one additional commit since the last revision: Fix included header file error after merging master. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18894/files - new: https://git.openjdk.org/jdk/pull/18894/files/0796e0b4..5478742c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18894&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18894&range=01-02 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18894.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18894/head:pull/18894 PR: https://git.openjdk.org/jdk/pull/18894 From gli at openjdk.org Tue Apr 23 17:33:31 2024 From: gli at openjdk.org (Guoxiong Li) Date: Tue, 23 Apr 2024 17:33:31 GMT Subject: RFR: 8330155: Serial: Remove TenuredSpace [v3] In-Reply-To: References: Message-ID: <13XcdYfz-Qdc4pOKtZK0IIYJM7xrCFuiM4LSpgQ2JhE=.26e5fa50-0d25-4e3e-b603-4299cf837614@github.com> On Tue, 23 Apr 2024 17:22:41 GMT, Guoxiong Li wrote: >> Hi all, >> >> This patch removes the class `TenuredSpace` and adjusts its usages. After removing `TenuredSpace`, the file `space.inline.hpp` is empty, so I remove this file and change the included header file to `space.hpp`. 
>> >> The test `make test-tier1_gc` passed locally. Thanks for taking the time to review. >> >> Best Regards, >> -- Guoxiong > > Guoxiong Li has updated the pull request incrementally with one additional commit since the last revision: > > Fix included header file error after merging master. I merged the master branch in order to solve the file conflict and added the missed header file after merging. Please take a look at the newest code. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18894#issuecomment-2072993947 From sspitsyn at openjdk.org Tue Apr 23 18:14:28 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 23 Apr 2024 18:14:28 GMT Subject: RFR: 8330303: Crash: assert(_target_jt == nullptr || _target_jt->vthread() == target_h()) failed In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 00:29:52 GMT, Serguei Spitsyn wrote: > This is a simple fix of three similar asserts. > The `_target_jt->jvmti_vthread()` has to be used instead of `_target_jt->vthread()`. > The `_target_jt->vthread()` can be outdated in some specific contexts as shown in the `hs_err` stack trace. > > I've seen similar issue and already fixed it in this fragment of code: > > class GetCurrentLocationClosure : public JvmtiUnitedHandshakeClosure { > . . . > void do_vthread(Handle target_h) { > assert(_target_jt == nullptr || !_target_jt->is_exiting(), "sanity check"); > // use jvmti_vthread() as vthread() can be outdated > assert(_target_jt == nullptr || _target_jt->jvmti_vthread() == target_h(), "sanity check"); > . . . > > The issue above was fixed by replacing `_target_jt->vthread()` with `_target_jt->jvmti_vthread()`. > > There are three places which need to be fixed the same way: > - `GetSingleStackTraceClosure::do_vthread(Handle target_h)` > - `SetForceEarlyReturn::do_vthread(Handle target_h)` > - `UpdateForPopTopFrameClosure::do_vthread(Handle target_h)` > > Testing: > - Run mach5 tiers 1-6 Ping!! Still need this reviewed. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18806#issuecomment-2073075364 From sviswanathan at openjdk.org Tue Apr 23 18:31:31 2024 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Tue, 23 Apr 2024 18:31:31 GMT Subject: RFR: 8330844: Add aliases for conditional jumps and additional instruction forms for x86 [v4] In-Reply-To: References: <-wAKj3RvMqUO3iphA6bA34ilTcM9LkZACKco20ppkE0=.a5d31aa7-9423-477e-9a90-749018d2a12d@github.com> Message-ID: On Tue, 23 Apr 2024 16:18:42 GMT, Scott Gibbons wrote: >> Adding infrastructure for JDK-8320448. Aliasing conditional jump instructions; adding some x86 instructions. > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Comment indentation src/hotspot/cpu/x86/assembler_x86.cpp line 1835: > 1833: prefix(dst, reg); > 1834: emit_int8((unsigned char)0x39); > 1835: emit_operand(reg, dst, 1); This should be emit_operand(reg, dst, 0); src/hotspot/cpu/x86/assembler_x86.cpp line 4459: > 4457: } > 4458: > 4459: void Assembler::vpcmpeqb(XMMRegister dst, XMMRegister src1, Address src2, int vector_len) { InstructionMark missing in this instruction as well. src/hotspot/cpu/x86/assembler_x86.cpp line 4576: > 4574: // In this context, the dst vector contains the components that are equal, non equal components are zeroed in dst > 4575: void Assembler::vpcmpeqw(XMMRegister dst, XMMRegister nds, Address src, int vector_len) { > 4576: assert(vector_len == AVX_128bit ? VM_Version::supports_avx() : VM_Version::supports_avx2(), ""); InstructionMark missing in this instruction which takes Address as operand? 
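The two review comments above ask for an `InstructionMark` in the instruction forms that take an `Address` operand. As a rough illustration only: my understanding is that `InstructionMark` is a scoped (RAII) helper that records where the current instruction starts while it is being emitted, so that operand encoding can refer back to that start. The sketch below models that idea with toy classes. `ToyAssembler`, `ToyInstructionMark`, `emit_with_mark`, and the byte values are all hypothetical and are not HotSpot code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy stand-in for an assembler's code buffer (hypothetical, not HotSpot).
class ToyAssembler {
  std::vector<uint8_t> _code;
  long _inst_mark = -1;   // offset of the instruction being emitted, or -1
 public:
  void set_inst_mark() {
    assert(_inst_mark == -1);   // marks must not overlap
    _inst_mark = static_cast<long>(_code.size());
  }
  void clear_inst_mark()      { _inst_mark = -1; }
  long inst_mark() const      { return _inst_mark; }
  void emit_int8(uint8_t b)   { _code.push_back(b); }
  size_t offset() const       { return _code.size(); }
};

// Scoped mark: set on construction, cleared on destruction.
class ToyInstructionMark {
  ToyAssembler* _masm;
 public:
  explicit ToyInstructionMark(ToyAssembler* masm) : _masm(masm) {
    _masm->set_inst_mark();
  }
  ~ToyInstructionMark() { _masm->clear_inst_mark(); }
};

// Emits a made-up three-byte instruction and returns the mark that was
// visible at the point where a memory operand would be encoded.
inline long emit_with_mark(ToyAssembler* masm) {
  ToyInstructionMark im(masm);
  masm->emit_int8(0xC5);            // placeholder prefix byte
  masm->emit_int8(0x74);            // placeholder opcode byte
  long seen = masm->inst_mark();    // operand encoding can see the start
  masm->emit_int8(0x00);            // placeholder operand byte
  return seen;
}
```

Without the scoped object, `inst_mark()` would still be -1 at the point where the operand is encoded, which is the situation the review comments flag.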
src/hotspot/cpu/x86/macroAssembler_x86.cpp line 3573: > 3571: } > 3572: > 3573: void MacroAssembler::vpcmpeqb(XMMRegister dst, XMMRegister src1, Address src2, int vector_len) { The assert is missing here: assert(((dst->encoding() < 16 && src1->encoding() < 16) || VM_Version::supports_avx512vlbw()),"XMM register should be 0-15"); src/hotspot/cpu/x86/macroAssembler_x86.hpp line 961: > 959: void ALWAYSINLINE jo(Label& L, bool maybe_short = true) { jcc(Assembler::overflow, L, maybe_short); } > 960: void ALWAYSINLINE jno(Label& L, bool maybe_short = true) { jcc(Assembler::noOverflow, L, maybe_short); } > 961: void ALWAYSINLINE js(Label& L, bool maybe_short = true) { jcc(Assembler::positive, L, maybe_short); } Isn't js -> jump if sign flag is set -> Assembler::negative? Correspondingly jns, js_b, jns_b should also be corrected. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18893#discussion_r1576695214 PR Review Comment: https://git.openjdk.org/jdk/pull/18893#discussion_r1576714792 PR Review Comment: https://git.openjdk.org/jdk/pull/18893#discussion_r1576711681 PR Review Comment: https://git.openjdk.org/jdk/pull/18893#discussion_r1576666734 PR Review Comment: https://git.openjdk.org/jdk/pull/18893#discussion_r1576673354 From jrose at openjdk.org Tue Apr 23 18:37:32 2024 From: jrose at openjdk.org (John R Rose) Date: Tue, 23 Apr 2024 18:37:32 GMT Subject: RFR: 8330532: Improve line-oriented text parsing in HotSpot [v3] In-Reply-To: References: Message-ID: <4__55RnizjcZwBGgP4QlfXXX6HBzn5jbRn_xrRPE4uM=.994bc41d-4bb3-4b63-b6dc-b533b598d0a6@github.com> On Tue, 23 Apr 2024 10:08:37 GMT, Johan Sjölen wrote: >> Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains eight additional commits since the last revision: >> >> - Merge branch 'master' of https://github.com/openjdk/jdk into 8330532-improve-line-oriented-text-parsing-in-hotspot >> - Comments fro @coleenp and @matias9927 >> - removed more unused code from istream.hpp >> - Merged ClassFileParser changes from https://github.com/openjdk/jdk/pull/18669 >> - Removed gtest cases for features removed in the previous commit >> - Reverted xmlstream.cpp/hpp and removed unused functions from inputStream >> - fixed builds >> - Imported @jrose00 changes https://github.com/openjdk/jdk/pull/18773 > > src/hotspot/share/utilities/istream.hpp line 98: > >> 96: >> 97: Input* _input; // where the input comes from or else nullptr >> 98: IState _input_state; // one of {NTR,EOF,ERR}_STATE > > This comment not necessary as the type describes its valid state. Sometimes redundant comments are helpful. I think this one is. YMMV. > src/hotspot/share/utilities/istream.hpp line 106: > >> 104: size_t _end; // offset to end of known current line (else content_end) >> 105: size_t _next; // offset to known start of next line (else =end) >> 106: void* _must_free; // unless null, a malloc pointer which we must free > > Reading this code, why do we set `_must_free` instead of simply having a method: > > ```c++ > bool must_free() { > return _buffer != &_small_buffer; > } > > > and just delete the `_must_free` field. Good question. There was a version of the code that accepted a user-supplied buffer, optionally. In that case, `_must_free` was set false (or to a user-requested value), since it was up to the user whether the user-supplied buffer should be freed. It could be a static buffer. If there is no longer such an option in the existing constructors, then this field can be GC-ed. 
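To make the `_must_free` point above concrete, here is a minimal sketch, not the `istream.hpp` code. It assumes a class with an inline small buffer that can also adopt a caller-supplied buffer, which is the older design described in the reply. `LineBuffer`, `SMALL_SIZE`, `grow`, `adopt_user_buffer`, and `derived_must_free` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical buffer holder; only the ownership logic is modeled.
class LineBuffer {
  static const size_t SMALL_SIZE = 16;
  char   _small_buffer[SMALL_SIZE];
  char*  _buffer;
  size_t _capacity;
  bool   _must_free;   // explicit ownership flag, as in the quoted field

 public:
  LineBuffer()
    : _buffer(_small_buffer), _capacity(SMALL_SIZE), _must_free(false) {}
  ~LineBuffer() { if (_must_free) free(_buffer); }
  LineBuffer(const LineBuffer&) = delete;
  LineBuffer& operator=(const LineBuffer&) = delete;

  // Switch to a malloc'ed buffer (contents are not preserved in this sketch).
  void grow(size_t n) {
    if (n <= _capacity) return;
    char* p = static_cast<char*>(malloc(n));
    assert(p != nullptr);
    if (_must_free) free(_buffer);
    _buffer = p;
    _capacity = n;
    _must_free = true;
  }

  // Caller keeps ownership of buf; it may be static or stack storage.
  void adopt_user_buffer(char* buf, size_t n) {
    if (_must_free) free(_buffer);
    _buffer = buf;
    _capacity = n;
    _must_free = false;
  }

  // The alternative proposed in the review: derive ownership from the pointer.
  bool derived_must_free() const { return _buffer != _small_buffer; }
  bool must_free() const         { return _must_free; }
};
```

With only the small buffer and `grow`, the two predicates always agree, so the field is redundant. They diverge only after `adopt_user_buffer`, where the derived check would wrongly claim ownership of the caller's storage. If no constructor offers that option any more, the derived method is enough.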
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18833#discussion_r1576718468 PR Review Comment: https://git.openjdk.org/jdk/pull/18833#discussion_r1576716061 From jrose at openjdk.org Tue Apr 23 18:37:32 2024 From: jrose at openjdk.org (John R Rose) Date: Tue, 23 Apr 2024 18:37:32 GMT Subject: RFR: 8330532: Improve line-oriented text parsing in HotSpot [v3] In-Reply-To: <4__55RnizjcZwBGgP4QlfXXX6HBzn5jbRn_xrRPE4uM=.994bc41d-4bb3-4b63-b6dc-b533b598d0a6@github.com> References: <4__55RnizjcZwBGgP4QlfXXX6HBzn5jbRn_xrRPE4uM=.994bc41d-4bb3-4b63-b6dc-b533b598d0a6@github.com> Message-ID: <2K-VA9DRH9DAgDL9HB__STvlnE0gSBRjPNU3NLOrZT0=.7ee74867-cf57-4c13-bd54-751425d2793a@github.com> On Tue, 23 Apr 2024 18:31:27 GMT, John R Rose wrote: >> src/hotspot/share/utilities/istream.hpp line 98: >> >>> 96: >>> 97: Input* _input; // where the input comes from or else nullptr >>> 98: IState _input_state; // one of {NTR,EOF,ERR}_STATE >> >> This comment is not necessary as the type describes its valid state. > > Sometimes redundant comments are helpful. I think this one is. YMMV. The `override` keyword is nice; thank you. I have already argued against the removal of `set_input`. And `set_input` needs `close`. I think `set_input` is not YAGNI but YIWNI = Yes I will need it. The reply that "you can just wrap another i-stream around the new i-source" is fallacious because of the performance model of i-stream. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18833#discussion_r1576724065 From sgibbons at openjdk.org Tue Apr 23 19:03:59 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Tue, 23 Apr 2024 19:03:59 GMT Subject: RFR: 8330844: Add aliases for conditional jumps and additional instruction forms for x86 [v5] In-Reply-To: <-wAKj3RvMqUO3iphA6bA34ilTcM9LkZACKco20ppkE0=.a5d31aa7-9423-477e-9a90-749018d2a12d@github.com> References: <-wAKj3RvMqUO3iphA6bA34ilTcM9LkZACKco20ppkE0=.a5d31aa7-9423-477e-9a90-749018d2a12d@github.com> Message-ID: > Adding infrastructure for JDK-8320448. Aliasing conditional jump instructions; adding some x86 instructions. Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Review comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18893/files - new: https://git.openjdk.org/jdk/pull/18893/files/fe5f3060..f4db9a1b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18893&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18893&range=03-04 Stats: 8 lines in 3 files changed: 3 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/18893.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18893/head:pull/18893 PR: https://git.openjdk.org/jdk/pull/18893 From sgibbons at openjdk.org Tue Apr 23 19:03:59 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Tue, 23 Apr 2024 19:03:59 GMT Subject: RFR: 8330844: Add aliases for conditional jumps and additional instruction forms for x86 [v4] In-Reply-To: References: <-wAKj3RvMqUO3iphA6bA34ilTcM9LkZACKco20ppkE0=.a5d31aa7-9423-477e-9a90-749018d2a12d@github.com> Message-ID: On Tue, 23 Apr 2024 16:18:42 GMT, Scott Gibbons wrote: >> Adding infrastructure for JDK-8320448. Aliasing conditional jump instructions; adding some x86 instructions. 
> > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Comment indentation @sviswa7 Thanks for the good catches. Fixed. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18893#issuecomment-2073207547 From pchilanomate at openjdk.org Tue Apr 23 19:28:30 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 23 Apr 2024 19:28:30 GMT Subject: RFR: 8330303: Crash: assert(_target_jt == nullptr || _target_jt->vthread() == target_h()) failed In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 00:29:52 GMT, Serguei Spitsyn wrote: > This is a simple fix of three similar asserts. > The `_target_jt->jvmti_vthread()` has to be used instead of `_target_jt->vthread()`. > The `_target_jt->vthread()` can be outdated in some specific contexts as shown in the `hs_err` stack trace. > > I've seen similar issue and already fixed it in this fragment of code: > > class GetCurrentLocationClosure : public JvmtiUnitedHandshakeClosure { > . . . > void do_vthread(Handle target_h) { > assert(_target_jt == nullptr || !_target_jt->is_exiting(), "sanity check"); > // use jvmti_vthread() as vthread() can be outdated > assert(_target_jt == nullptr || _target_jt->jvmti_vthread() == target_h(), "sanity check"); > . . . > > The issue above was fixed by replacing `_target_jt->vthread()` with `_target_jt->jvmti_vthread()`. > > There are three places which need to be fixed the same way: > - `GetSingleStackTraceClosure::do_vthread(Handle target_h)` > - `SetForceEarlyReturn::do_vthread(Handle target_h)` > - `UpdateForPopTopFrameClosure::do_vthread(Handle target_h)` > > Testing: > - Run mach5 tiers 1-6 Looks good. Just small suggestion. 
Thanks, Patricio src/hotspot/share/prims/jvmtiEnvBase.cpp line 2079: > 2077: void > 2078: GetSingleStackTraceClosure::do_vthread(Handle target_h) { > 2079: // use jvmti_vthread() as vthread() can be outdated The only reason I can see why just using vthread() doesn't work is the case where we are in a temporary switch to the carrier thread. So maybe change the comment to: "use jvmti_vthread() instead of vthread() as target could have temporarily changed identity to carrier thread (see VirtualThread.switchToCarrierThread)". Same in the other places. ------------- Marked as reviewed by pchilanomate (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18806#pullrequestreview-2018092388 PR Review Comment: https://git.openjdk.org/jdk/pull/18806#discussion_r1576768011 From smonteith at openjdk.org Tue Apr 23 19:51:31 2024 From: smonteith at openjdk.org (Stuart Monteith) Date: Tue, 23 Apr 2024 19:51:31 GMT Subject: RFR: 8326541: [AArch64] ZGC C2 load barrier stub should consider the length of live registers when spilling registers [v4] In-Reply-To: References: Message-ID: On Wed, 20 Mar 2024 03:55:33 GMT, Joshua Zhu wrote: >> Currently ZGC C2 load barrier stub saves the whole live register regardless of what size of register is live on aarch64. >> Considering the size of SVE register is an implementation-defined multiple of 128 bits, up to 2048 bits, >> even the use of a floating point may cause the maximum 2048 bits stack occupied. >> Hence I would like to introduce this change on aarch64: take the length of live registers into consideration in ZGC C2 load barrier stub. >> >> In a floating point case on 2048 bits SVE machine, the following ZLoadBarrierStubC2 >> >> >> ...... 
>> 0x0000ffff684cfad8: stp x15, x18, [sp, #80] >> 0x0000ffff684cfadc: sub sp, sp, #0x100 >> 0x0000ffff684cfae0: str z16, [sp] >> 0x0000ffff684cfae4: add x1, x13, #0x10 >> 0x0000ffff684cfae8: mov x0, x16 >> ;; 0xFFFF803F5414 >> 0x0000ffff684cfaec: mov x8, #0x5414 // #21524 >> 0x0000ffff684cfaf0: movk x8, #0x803f, lsl #16 >> 0x0000ffff684cfaf4: movk x8, #0xffff, lsl #32 >> 0x0000ffff684cfaf8: blr x8 >> 0x0000ffff684cfafc: mov x16, x0 >> 0x0000ffff684cfb00: ldr z16, [sp] >> 0x0000ffff684cfb04: add sp, sp, #0x100 >> 0x0000ffff684cfb08: ptrue p7.b >> 0x0000ffff684cfb0c: ldp x4, x5, [sp, #16] >> ...... >> >> >> could be optimized into: >> >> >> ...... >> 0x0000ffff684cfa50: stp x15, x18, [sp, #80] >> 0x0000ffff684cfa54: str d16, [sp, #-16]! // extra 8 bytes to align 16 bytes in push_fp() >> 0x0000ffff684cfa58: add x1, x13, #0x10 >> 0x0000ffff684cfa5c: mov x0, x16 >> ;; 0xFFFF7FA942A8 >> 0x0000ffff684cfa60: mov x8, #0x42a8 // #17064 >> 0x0000ffff684cfa64: movk x8, #0x7fa9, lsl #16 >> 0x0000ffff684cfa68: movk x8, #0xffff, lsl #32 >> 0x0000ffff684cfa6c: blr x8 >> 0x0000ffff684cfa70: mov x16, x0 >> 0x0000ffff684cfa74: ldr d16, [sp], #16 >> 0x0000ffff684cfa78: ptrue p7.b >> 0x0000ffff684cfa7c: ldp x4, x5, [sp, #16] >> ...... >> >> >> Besides the above benefit, when we know what size of register is live, >> we could remove the unnecessary caller save in ZGC C2 load barrier stub when we meet C-ABI SOE fp registers. >> >> Passed jtreg with option "-XX:+UseZGC -XX:+ZGenerational" with no failures introduced. > > Joshua Zhu has updated the pull request incrementally with one additional commit since the last revision: > > Add more output for easy debugging once the jtreg test case fails Hello - I have no other comments - looks good. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17977#issuecomment-2073301839 From ascarpino at openjdk.org Tue Apr 23 20:09:30 2024 From: ascarpino at openjdk.org (Anthony Scarpino) Date: Tue, 23 Apr 2024 20:09:30 GMT Subject: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v3] In-Reply-To: <-64Xlhk6ln43-xTmlv_cvloS-gzDrKMyiPUdPbMNlIM=.2b524654-ca5b-4a7a-a7da-316e99cfea35@github.com> References: <-64Xlhk6ln43-xTmlv_cvloS-gzDrKMyiPUdPbMNlIM=.2b524654-ca5b-4a7a-a7da-316e99cfea35@github.com> Message-ID: On Mon, 15 Apr 2024 22:12:30 GMT, Volodymyr Paprotski wrote: >> Performance. Before: >> >> Benchmark (algorithm) (dataSize) (keyLength) (provider) Mode Cnt Score Error Units >> SignatureBench.ECDSA.sign SHA256withECDSA 1024 256 thrpt 3 6443.934 ? 6.491 ops/s >> SignatureBench.ECDSA.sign SHA256withECDSA 16384 256 thrpt 3 6152.979 ? 4.954 ops/s >> SignatureBench.ECDSA.verify SHA256withECDSA 1024 256 thrpt 3 1895.410 ? 36.979 ops/s >> SignatureBench.ECDSA.verify SHA256withECDSA 16384 256 thrpt 3 1878.955 ? 45.487 ops/s >> Benchmark (algorithm) (keyLength) (kpgAlgorithm) (provider) Mode Cnt Score Error Units >> o.o.b.j.c.full.KeyAgreementBench.EC.generateSecret ECDH 256 EC thrpt 3 1357.810 ? 26.584 ops/s >> o.o.b.j.c.small.KeyAgreementBench.EC.generateSecret ECDH 256 EC thrpt 3 1352.119 ? 23.547 ops/s >> Benchmark (isMontBench) Mode Cnt Score Error Units >> PolynomialP256Bench.benchMultiply false thrpt 3 1746.126 ? 10.970 ops/s >> >> Performance, no intrinsic: >> >> Benchmark (algorithm) (dataSize) (keyLength) (provider) Mode Cnt Score Error Units >> SignatureBench.ECDSA.sign SHA256withECDSA 1024 256 thrpt 3 6529.839 ? 42.420 ops/s >> SignatureBench.ECDSA.sign SHA256withECDSA 16384 256 thrpt 3 6199.747 ? 133.566 ops/s >> SignatureBench.ECDSA.verify SHA256withECDSA 1024 256 thrpt 3 1973.676 ? 54.071 ops/s >> SignatureBench.ECDSA.verify SHA256withECDSA 16384 256 thrpt 3 1932.127 ? 
35.920 ops/s >> Benchmark (algorithm) (keyLength) (kpgAlgorithm) (provider) Mode Cnt Score Error Units >> o.o.b.j.c.full.KeyAgreementBench.EC.generateSecret ECDH 256 EC thrpt 3 1355.788 ? 29.858 ops/s >> o.o.b.j.c.small.KeyAgreementBench.EC.generateSecret ECDH 256 EC thrpt 3 1346.523 ? 28.722 ops/s >> Benchmark (isMontBench) Mode Cnt Score Error Units >> PolynomialP256Bench.benchMultiply true thrpt 3 1919.57... > > Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision: > > Comments from Jatin and Tony src/java.base/share/classes/sun/security/ec/ECOperations.java line 204: > 202: * @return the product > 203: */ > 204: public MutablePoint multiply(AffinePoint affineP, byte[] s) { It seems like there could be some combining of both `multiply()`. If `multiply(AffinePoint, ...)` is called, it can call `DefaultMultiplier` with the `affineP`, but internally call the other `multiply(ECPoint, ...)` for the other situations. I'd rather not have two methods doing most of the same code, but different methods. src/java.base/share/classes/sun/security/ec/ECOperations.java line 467: > 465: sealed static abstract class SmallWindowMultiplier implements PointMultiplier > 466: permits DefaultMultiplier, DefaultMontgomeryMultiplier { > 467: private final AffinePoint affineP; I don't think `affineP` needs to be a class variable anymore. It's only used in the constructor src/java.base/share/classes/sun/security/ec/ECOperations.java line 592: > 590: } > 591: > 592: private final ProjectivePoint.Immutable[][] points; Can you define this at the top please. src/java.base/share/classes/sun/security/ec/ECOperations.java line 668: > 666: } > 667: > 668: private final BigInteger[] base; Can you define this at the top. You use it in the constructor but it's defined later on. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18583#discussion_r1576821201 PR Review Comment: https://git.openjdk.org/jdk/pull/18583#discussion_r1575499019 PR Review Comment: https://git.openjdk.org/jdk/pull/18583#discussion_r1575495263 PR Review Comment: https://git.openjdk.org/jdk/pull/18583#discussion_r1575491814 From ascarpino at openjdk.org Tue Apr 23 20:09:32 2024 From: ascarpino at openjdk.org (Anthony Scarpino) Date: Tue, 23 Apr 2024 20:09:32 GMT Subject: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v2] In-Reply-To: References: Message-ID: On Tue, 2 Apr 2024 19:19:59 GMT, Volodymyr Paprotski wrote: >> Performance. Before: >> >> Benchmark (algorithm) (dataSize) (keyLength) (provider) Mode Cnt Score Error Units >> SignatureBench.ECDSA.sign SHA256withECDSA 1024 256 thrpt 3 6443.934 ? 6.491 ops/s >> SignatureBench.ECDSA.sign SHA256withECDSA 16384 256 thrpt 3 6152.979 ? 4.954 ops/s >> SignatureBench.ECDSA.verify SHA256withECDSA 1024 256 thrpt 3 1895.410 ? 36.979 ops/s >> SignatureBench.ECDSA.verify SHA256withECDSA 16384 256 thrpt 3 1878.955 ? 45.487 ops/s >> Benchmark (algorithm) (keyLength) (kpgAlgorithm) (provider) Mode Cnt Score Error Units >> o.o.b.j.c.full.KeyAgreementBench.EC.generateSecret ECDH 256 EC thrpt 3 1357.810 ? 26.584 ops/s >> o.o.b.j.c.small.KeyAgreementBench.EC.generateSecret ECDH 256 EC thrpt 3 1352.119 ? 23.547 ops/s >> Benchmark (isMontBench) Mode Cnt Score Error Units >> PolynomialP256Bench.benchMultiply false thrpt 3 1746.126 ? 10.970 ops/s >> >> Performance, no intrinsic: >> >> Benchmark (algorithm) (dataSize) (keyLength) (provider) Mode Cnt Score Error Units >> SignatureBench.ECDSA.sign SHA256withECDSA 1024 256 thrpt 3 6529.839 ? 42.420 ops/s >> SignatureBench.ECDSA.sign SHA256withECDSA 16384 256 thrpt 3 6199.747 ? 133.566 ops/s >> SignatureBench.ECDSA.verify SHA256withECDSA 1024 256 thrpt 3 1973.676 ? 54.071 ops/s >> SignatureBench.ECDSA.verify SHA256withECDSA 16384 256 thrpt 3 1932.127 ? 
35.920 ops/s >> Benchmark (algorithm) (keyLength) (kpgAlgorithm) (provider) Mode Cnt Score Error Units >> o.o.b.j.c.full.KeyAgreementBench.EC.generateSecret ECDH 256 EC thrpt 3 1355.788 ? 29.858 ops/s >> o.o.b.j.c.small.KeyAgreementBench.EC.generateSecret ECDH 256 EC thrpt 3 1346.523 ? 28.722 ops/s >> Benchmark (isMontBench) Mode Cnt Score Error Units >> PolynomialP256Bench.benchMultiply true thrpt 3 1919.57... > > Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision: > > remove use of jdk.crypto.ec src/java.base/share/classes/sun/security/ec/ECOperations.java line 308: > 306: > 307: /* > 308: * public Point addition. Used by ECDSAOperations Was the old description not applicable anymore? It would be nice to improve on the existing description that shortening it. src/java.base/share/classes/sun/security/ec/ECOperations.java line 321: > 319: ECOperations ops = this; > 320: if (this.montgomeryOps != null) { > 321: assert p.getField() instanceof IntegerMontgomeryFieldModuloP; This should throw a ProviderException, I believe this would throw an AssertionException ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18583#discussion_r1556740469 PR Review Comment: https://git.openjdk.org/jdk/pull/18583#discussion_r1558824543 From jrose at openjdk.org Tue Apr 23 20:13:30 2024 From: jrose at openjdk.org (John R Rose) Date: Tue, 23 Apr 2024 20:13:30 GMT Subject: RFR: 8330532: Improve line-oriented text parsing in HotSpot [v3] In-Reply-To: References: <7_vqERb2aC7gU-sSV3vmxJEgrYd3KuhncMLga4rP6rE=.ff63cadf-6764-459b-8764-4ceb66611ec4@github.com> Message-ID: <_tP6iDbeMlww4do6XB0wQChqRuwxdNeP7j8MeOJqs3I=.d0874645-2fff-4223-99b7-f2fde4b6f02e@github.com> On Tue, 23 Apr 2024 00:25:34 GMT, Ioi Lam wrote: >> This is a nit, but if enum class is used can we then also change to PascalCase for the enum cases? > > Is there an adopted style for the enumerators? I couldn't find it in the hotspot style guide. 
There's quite a variation today. E.g.: > > > enum class vmSymbolID : int { NO_SID = 0, ....}; > enum class DefaultsLookupMode { find, skip }; According to my interpretation, the relevant style guide passage is about constants: > * Constant names may be upper-case or mixed-case, according to historical necessity. (Note: There are many examples of constants with lowercase names.) Basically, "styles vary, follow precedent". I followed the precedent of `JavaThreadStatus` and others, as noted elsewhere in this PR. OS-level constants have old-looking names like `EOF` and `FILENAME_MAX`. That's also the standard for Java constant names. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18833#discussion_r1576839392 From sviswanathan at openjdk.org Tue Apr 23 20:22:30 2024 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Tue, 23 Apr 2024 20:22:30 GMT Subject: RFR: 8330844: Add aliases for conditional jumps and additional instruction forms for x86 [v5] In-Reply-To: References: <-wAKj3RvMqUO3iphA6bA34ilTcM9LkZACKco20ppkE0=.a5d31aa7-9423-477e-9a90-749018d2a12d@github.com> Message-ID: <3Fe5my9OAHDHhdUaKfM-0jSO6UbNvxu9p7hVCyLJLtc=.e246103e-769e-46fb-a20c-937338f6017f@github.com> On Tue, 23 Apr 2024 19:03:59 GMT, Scott Gibbons wrote: >> Adding infrastructure for JDK-8320448. Aliasing conditional jump instructions; adding some x86 instructions. > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Review comments Looks good to me. ------------- Marked as reviewed by sviswanathan (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/18893#pullrequestreview-2018229472 From sgibbons at openjdk.org Tue Apr 23 20:22:30 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Tue, 23 Apr 2024 20:22:30 GMT Subject: RFR: 8330844: Add aliases for conditional jumps and additional instruction forms for x86 [v5] In-Reply-To: References: <-wAKj3RvMqUO3iphA6bA34ilTcM9LkZACKco20ppkE0=.a5d31aa7-9423-477e-9a90-749018d2a12d@github.com> Message-ID: On Tue, 23 Apr 2024 19:03:59 GMT, Scott Gibbons wrote: >> Adding infrastructure for JDK-8320448. Aliasing conditional jump instructions; adding some x86 instructions. > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Review comments Thanks! ------------- PR Comment: https://git.openjdk.org/jdk/pull/18893#issuecomment-2073375198 From gziemski at openjdk.org Tue Apr 23 20:39:33 2024 From: gziemski at openjdk.org (Gerard Ziemski) Date: Tue, 23 Apr 2024 20:39:33 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v49] In-Reply-To: References: Message-ID: <1cKD_eCdTb8AmNQwA9T4GFK0xu_CjJeABePgatn8xSY=.ec58f99d-bcd6-4e92-87a4-d1e49d33f4af@github.com> On Tue, 23 Apr 2024 13:44:59 GMT, Johan Sjölen wrote: >> Hi, >> >> This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix.
>> >> ## `MemoryFileTracker` >> >> The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: >> >> ```c++ >> static MemoryFile* make_device(const char* descriptive_name); >> static void free_device(MemoryFile* device); >> >> static void allocate_memory(MemoryFile* device, size_t offset, size_t size, >> MEMFLAGS flag, const NativeCallStack& stack); >> static void free_memory(MemoryFile* device, size_t offset, size_t size); >> ``` >> >> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: >> >> ```c++ >> void ZNMT::reserve(zaddress_unsafe start, size_t size) { >> MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); >> } >> void ZNMT::commit(zoffset offset, size_t size) { >> MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC); >> } >> void ZNMT::uncommit(zoffset offset, size_t size) { >> MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); >> } >> >> void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { >> // NMT doesn't track mappings at the moment. >> } >> void ZNMT::unmap(zaddress_unsafe addr, size_t size) { >> // NMT doesn't track mappings at the moment. >> } >> ``` >> >> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. >> >> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: >> >> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it.
Our Treap-based approach in this patch gives a performance bo... > > Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: > > Move TreapNode into Treap This is a partial review. I will work more on it tomorrow. I have a few comments, but at this point what stands out to me the most is having to use: `MemoryFileTracker::Instance` Can we remove the `Instance`? We don't need to use explicit instances in other NMT components. src/hotspot/share/nmt/memReporter.cpp line 914: > 912: MemoryFileTracker::Instance::print_report_on(dev, this->output(), scale()); > 913: } > 914: } Do `devices.length()` and `devices.at(i)` really need to be exposed? Can we consider pushing all this inside `MemoryFileTracker`? We make 4 different API calls to MemoryFileTracker here in such a small function. src/hotspot/share/nmt/memTracker.cpp line 71: > 69: if (!MallocTracker::initialize(level) || > 70: !VirtualMemoryTracker::initialize(level) || > 71: !MemoryFileTracker::Instance::initialize(level) || Is there a way to hide the `Instance` so that we could do: `!MemoryFileTracker::initialize(level)` not `!MemoryFileTracker::Instance::initialize(level)` just like the other calls here? The instance is not needed here and is just an implementation detail. src/hotspot/share/nmt/memTracker.hpp line 172: > 170: static inline MemoryFileTracker::MemoryFile* register_device(const char* descriptive_name) { > 171: assert_post_init(); > 172: if (!enabled()) return nullptr; Could we push `assert_post_init()` into `enabled()`? ------------- Changes requested by gziemski (Committer).
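One possible reading of the request to hide `Instance`, sketched under assumptions: the singleton stays a private nested type and the outer class exposes static forwarding functions, so callers can write `Tracker::initialize(level)` in the same shape as the other NMT trackers. The names `ToyFileTracker`, `initialized_flag`, `is_initialized`, and the integer `level` parameter are hypothetical stand-ins for illustration and are not the actual `MemoryFileTracker` code.

```cpp
#include <cassert>

// Hypothetical stand-in for a tracker that keeps one process-wide instance.
class ToyFileTracker {
  // Implementation detail: callers never name this type.
  class Instance {
    static bool& initialized_flag() {
      static bool initialized = false;
      return initialized;
    }

   public:
    static bool initialize(int level) {
      assert(level >= 0);
      // In this toy model, level 0 means "tracking off": nothing to set up.
      if (level > 0) initialized_flag() = true;
      return true;
    }
    static bool is_initialized() { return initialized_flag(); }
  };

 public:
  // Static facade: forwards to the hidden instance.
  static bool initialize(int level) { return Instance::initialize(level); }
  static bool is_initialized() { return Instance::is_initialized(); }
};
```

With a facade like this, a call site would read `ToyFileTracker::initialize(level)` and the nested type would stay an implementation detail. Whether that fits the real class is a design question for the PR author.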
PR Review: https://git.openjdk.org/jdk/pull/18289#pullrequestreview-2018212344 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576843510 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576837099 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1576859789 From cjplummer at openjdk.org Tue Apr 23 20:53:28 2024 From: cjplummer at openjdk.org (Chris Plummer) Date: Tue, 23 Apr 2024 20:53:28 GMT Subject: RFR: 8330303: Crash: assert(_target_jt == nullptr || _target_jt->vthread() == target_h()) failed In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 00:29:52 GMT, Serguei Spitsyn wrote: > This is a simple fix of three similar asserts. > The `_target_jt->jvmti_vthread()` has to be used instead of `_target_jt->vthread()`. > The `_target_jt->vthread()` can be outdated in some specific contexts as shown in the `hs_err` stack trace. > > I've seen similar issue and already fixed it in this fragment of code: > > class GetCurrentLocationClosure : public JvmtiUnitedHandshakeClosure { > . . . > void do_vthread(Handle target_h) { > assert(_target_jt == nullptr || !_target_jt->is_exiting(), "sanity check"); > // use jvmti_vthread() as vthread() can be outdated > assert(_target_jt == nullptr || _target_jt->jvmti_vthread() == target_h(), "sanity check"); > . . . > > The issue above was fixed by replacing `_target_jt->vthread()` with `_target_jt->jvmti_vthread()`. > > There are three places which need to be fixed the same way: > - `GetSingleStackTraceClosure::do_vthread(Handle target_h)` > - `SetForceEarlyReturn::do_vthread(Handle target_h)` > - `UpdateForPopTopFrameClosure::do_vthread(Handle target_h)` > > Testing: > - Run mach5 tiers 1-6 Looks good. ------------- Marked as reviewed by cjplummer (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18806#pullrequestreview-2018277846 From dlong at openjdk.org Tue Apr 23 21:50:29 2024 From: dlong at openjdk.org (Dean Long) Date: Tue, 23 Apr 2024 21:50:29 GMT Subject: RFR: 8330253: Skip verify_consistent_lock_order when deoptimizing from monitorenter bytecode. [v6] In-Reply-To: References: <2QpelVQltaWXS_Yf-d0Uuu2j2mtiXoLhb8TJRliA3pk=.282244ed-1907-4362-a19d-4cdf6895af49@github.com> Message-ID: On Tue, 23 Apr 2024 11:46:08 GMT, Axel Boldt-Christmas wrote: >> I like the idea of a flag better, because it is foolproof. Why can't we set it in ObjectSynchronizer::enter? I don't think it matters if there is a safepoint check before that, because the lock stack is still consistent at that point. > > (Currently) the lock stack is always consistent at safepoints w.r.t. what is actually locked. However the lock stack may not be consistent with the most recent lock returned by the leaf `compiledVFrame::monitors()`. > > But you are correct that it can probably be moved to `ObjectSynchronizer::enter`, since there are no safepoint polls between `SharedRuntime::monitor_enter_helper` and that point. Similarly there are no safepoint polls in the runtime until after `set_current_pending_monitor` is called. > > So with the following assumptions: > 1. LockStack is consistent at safepoints w.r.t. locked monitors > 2. No safepoint polls exist from the point that compiledVFrame::monitors() starts returning the monitorinfo for the currently executing monitorenter until either it calls into the runtime or finishes locking. > > I do not believe 1. is likely to ever change. But I have limited understanding of the validity of 2. or of whether it is something that can change. > > If both these assumptions are correct then simply skipping the verification when `deoptee_thread->current_pending_monitor() != nullptr` would suffice. I believe those assumptions will always hold, but per separate discussions, let's go with what you have.
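The alternative described in the quoted text (skip the check whenever the deoptee has a pending monitor) can be sketched with a toy model. This only illustrates the condition under the two stated assumptions. `ToyThread`, `ToyMonitor`, and `should_verify_lock_order` are hypothetical names, and the reply above keeps the PR's existing approach rather than adopting this one.

```cpp
#include <cassert>

// Hypothetical stand-in for an object monitor.
struct ToyMonitor {};

// Hypothetical stand-in for the deoptee thread's state.
struct ToyThread {
  // Non-null while a monitorenter is in flight in this toy model.
  const ToyMonitor* pending_monitor = nullptr;

  const ToyMonitor* current_pending_monitor() const { return pending_monitor; }
};

// Verify the lock order only when no monitorenter is in progress; otherwise
// the newest monitor reported for the frame may not be on the lock stack yet.
inline bool should_verify_lock_order(const ToyThread& deoptee) {
  return deoptee.current_pending_monitor() == nullptr;
}
```

The sketch depends entirely on assumption 2 from the quoted text: if a safepoint poll could occur before the pending monitor is recorded, the condition would miss that window, which is the argument made for an explicit flag.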
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18782#discussion_r1576944739 From sspitsyn at openjdk.org Tue Apr 23 22:07:28 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 23 Apr 2024 22:07:28 GMT Subject: RFR: 8330303: Crash: assert(_target_jt == nullptr || _target_jt->vthread() == target_h()) failed In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 00:29:52 GMT, Serguei Spitsyn wrote: > This is a simple fix of three similar asserts. > The `_target_jt->jvmti_vthread()` has to be used instead of `_target_jt->vthread()`. > The `_target_jt->vthread()` can be outdated in some specific contexts as shown in the `hs_err` stack trace. > > I've seen similar issue and already fixed it in this fragment of code: > > class GetCurrentLocationClosure : public JvmtiUnitedHandshakeClosure { > . . . > void do_vthread(Handle target_h) { > assert(_target_jt == nullptr || !_target_jt->is_exiting(), "sanity check"); > // use jvmti_vthread() as vthread() can be outdated > assert(_target_jt == nullptr || _target_jt->jvmti_vthread() == target_h(), "sanity check"); > . . . > > The issue above was fixed by replacing `_target_jt->vthread()` with `_target_jt->jvmti_vthread()`. > > There are three places which need to be fixed the same way: > - `GetSingleStackTraceClosure::do_vthread(Handle target_h)` > - `SetForceEarlyReturn::do_vthread(Handle target_h)` > - `UpdateForPopTopFrameClosure::do_vthread(Handle target_h)` > > Testing: > - Run mach5 tiers 1-6 Thank you for review, Patricio and Chris! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18806#issuecomment-2073550087 From sspitsyn at openjdk.org Tue Apr 23 22:07:29 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 23 Apr 2024 22:07:29 GMT Subject: RFR: 8330303: Crash: assert(_target_jt == nullptr || _target_jt->vthread() == target_h()) failed In-Reply-To: References: Message-ID: On Tue, 23 Apr 2024 19:05:38 GMT, Patricio Chilano Mateo wrote: >> This is a simple fix of three similar asserts. >> The `_target_jt->jvmti_vthread()` has to be used instead of `_target_jt->vthread()`. >> The `_target_jt->vthread()` can be outdated in some specific contexts as shown in the `hs_err` stack trace. >> >> I've seen similar issue and already fixed it in this fragment of code: >> >> class GetCurrentLocationClosure : public JvmtiUnitedHandshakeClosure { >> . . . >> void do_vthread(Handle target_h) { >> assert(_target_jt == nullptr || !_target_jt->is_exiting(), "sanity check"); >> // use jvmti_vthread() as vthread() can be outdated >> assert(_target_jt == nullptr || _target_jt->jvmti_vthread() == target_h(), "sanity check"); >> . . . >> >> The issue above was fixed by replacing `_target_jt->vthread()` with `_target_jt->jvmti_vthread()`. >> >> There are three places which need to be fixed the same way: >> - `GetSingleStackTraceClosure::do_vthread(Handle target_h)` >> - `SetForceEarlyReturn::do_vthread(Handle target_h)` >> - `UpdateForPopTopFrameClosure::do_vthread(Handle target_h)` >> >> Testing: >> - Run mach5 tiers 1-6 > > src/hotspot/share/prims/jvmtiEnvBase.cpp line 2079: > >> 2077: void >> 2078: GetSingleStackTraceClosure::do_vthread(Handle target_h) { >> 2079: // use jvmti_vthread() as vthread() can be outdated > > The only reason I can see of why just using vthread() doesn't work is because of the case where we are in a temporary switch to carrier thread. 
So maybe change comment to be: "use jvmti_vthread() instead of vthread() as target could have temporary changed identity to carrier thread (see VirtualThread.switchToCarrierThread)". Same in the other places. Thank you for the suggestion. Will fix. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18806#discussion_r1576959637 From sgibbons at openjdk.org Tue Apr 23 23:38:33 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Tue, 23 Apr 2024 23:38:33 GMT Subject: Integrated: 8330844: Add aliases for conditional jumps and additional instruction forms for x86 In-Reply-To: <-wAKj3RvMqUO3iphA6bA34ilTcM9LkZACKco20ppkE0=.a5d31aa7-9423-477e-9a90-749018d2a12d@github.com> References: <-wAKj3RvMqUO3iphA6bA34ilTcM9LkZACKco20ppkE0=.a5d31aa7-9423-477e-9a90-749018d2a12d@github.com> Message-ID: On Mon, 22 Apr 2024 16:20:39 GMT, Scott Gibbons wrote: > Adding infrastructure for JDK-8320448. Aliasing conditional jump instructions; adding some x86 instructions. This pull request has now been integrated. Changeset: 7a895552 Author: Scott Gibbons Committer: Sandhya Viswanathan URL: https://git.openjdk.org/jdk/commit/7a895552c8eb9ae19f8d6eb8c35a0393445305fa Stats: 160 lines in 4 files changed: 160 ins; 0 del; 0 mod 8330844: Add aliases for conditional jumps and additional instruction forms for x86 Reviewed-by: kvn, epeter, sviswanathan ------------- PR: https://git.openjdk.org/jdk/pull/18893 From lmesnik at openjdk.org Tue Apr 23 23:43:29 2024 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Tue, 23 Apr 2024 23:43:29 GMT Subject: RFR: 8330303: Crash: assert(_target_jt == nullptr || _target_jt->vthread() == target_h()) failed In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 00:29:52 GMT, Serguei Spitsyn wrote: > This is a simple fix of three similar asserts. > The `_target_jt->jvmti_vthread()` has to be used instead of `_target_jt->vthread()`. > The `_target_jt->vthread()` can be outdated in some specific contexts as shown in the `hs_err` stack trace.
> > I've seen similar issue and already fixed it in this fragment of code: > > class GetCurrentLocationClosure : public JvmtiUnitedHandshakeClosure { > . . . > void do_vthread(Handle target_h) { > assert(_target_jt == nullptr || !_target_jt->is_exiting(), "sanity check"); > // use jvmti_vthread() as vthread() can be outdated > assert(_target_jt == nullptr || _target_jt->jvmti_vthread() == target_h(), "sanity check"); > . . . > > The issue above was fixed by replacing `_target_jt->vthread()` with `_target_jt->jvmti_vthread()`. > > There are three places which need to be fixed the same way: > - `GetSingleStackTraceClosure::do_vthread(Handle target_h)` > - `SetForceEarlyReturn::do_vthread(Handle target_h)` > - `UpdateForPopTopFrameClosure::do_vthread(Handle target_h)` > > Testing: > - Run mach5 tiers 1-6 Marked as reviewed by lmesnik (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18806#pullrequestreview-2018506993 From sspitsyn at openjdk.org Wed Apr 24 00:19:27 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 24 Apr 2024 00:19:27 GMT Subject: RFR: 8330303: Crash: assert(_target_jt == nullptr || _target_jt->vthread() == target_h()) failed In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 00:29:52 GMT, Serguei Spitsyn wrote: > This is a simple fix of three similar asserts. > The `_target_jt->jvmti_vthread()` has to be used instead of `_target_jt->vthread()`. > The `_target_jt->vthread()` can be outdated in some specific contexts as shown in the `hs_err` stack trace. > > I've seen similar issue and already fixed it in this fragment of code: > > class GetCurrentLocationClosure : public JvmtiUnitedHandshakeClosure { > . . . > void do_vthread(Handle target_h) { > assert(_target_jt == nullptr || !_target_jt->is_exiting(), "sanity check"); > // use jvmti_vthread() as vthread() can be outdated > assert(_target_jt == nullptr || _target_jt->jvmti_vthread() == target_h(), "sanity check"); > . . . 
> > The issue above was fixed by replacing `_target_jt->vthread()` with `_target_jt->jvmti_vthread()`. > > There are three places which need to be fixed the same way: > - `GetSingleStackTraceClosure::do_vthread(Handle target_h)` > - `SetForceEarlyReturn::do_vthread(Handle target_h)` > - `UpdateForPopTopFrameClosure::do_vthread(Handle target_h)` > > Testing: > - Run mach5 tiers 1-6 Thank you for review, Leonid! ------------- PR Comment: https://git.openjdk.org/jdk/pull/18806#issuecomment-2073695339 From sspitsyn at openjdk.org Wed Apr 24 00:29:39 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 24 Apr 2024 00:29:39 GMT Subject: RFR: 8330303: Crash: assert(_target_jt == nullptr || _target_jt->vthread() == target_h()) failed [v2] In-Reply-To: References: Message-ID: <_ThMMAId660RlwembjyrJLnCTa0WPoGt2M7gaFFnGMA=.d6e02633-f859-470a-8748-be77c055750e@github.com> > This is a simple fix of three similar asserts. > The `_target_jt->jvmti_vthread()` has to be used instead of `_target_jt->vthread()`. > The `_target_jt->vthread()` can be outdated in some specific contexts as shown in the `hs_err` stack trace. > > I've seen similar issue and already fixed it in this fragment of code: > > class GetCurrentLocationClosure : public JvmtiUnitedHandshakeClosure { > . . . > void do_vthread(Handle target_h) { > assert(_target_jt == nullptr || !_target_jt->is_exiting(), "sanity check"); > // use jvmti_vthread() as vthread() can be outdated > assert(_target_jt == nullptr || _target_jt->jvmti_vthread() == target_h(), "sanity check"); > . . . > > The issue above was fixed by replacing `_target_jt->vthread()` with `_target_jt->jvmti_vthread()`. 
> > There are three places which need to be fixed the same way: > - `GetSingleStackTraceClosure::do_vthread(Handle target_h)` > - `SetForceEarlyReturn::do_vthread(Handle target_h)` > - `UpdateForPopTopFrameClosure::do_vthread(Handle target_h)` > > Testing: > - Run mach5 tiers 1-6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: updated same clarifying comment in several spots ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18806/files - new: https://git.openjdk.org/jdk/pull/18806/files/55a8ed10..6ac3b54f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18806&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18806&range=00-01 Stats: 8 lines in 3 files changed: 4 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/18806.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18806/head:pull/18806 PR: https://git.openjdk.org/jdk/pull/18806 From sspitsyn at openjdk.org Wed Apr 24 02:49:56 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 24 Apr 2024 02:49:56 GMT Subject: RFR: 8330303: Crash: assert(_target_jt == nullptr || _target_jt->vthread() == target_h()) failed [v3] In-Reply-To: References: Message-ID: <0hQI5q8pztB_S9cIYZf70NkBgV_eFBSrFT5jEdMvtFo=.06cbbdd8-c84f-4d60-819b-a8364af32bb6@github.com> > This is a simple fix of three similar asserts. > The `_target_jt->jvmti_vthread()` has to be used instead of `_target_jt->vthread()`. > The `_target_jt->vthread()` can be outdated in some specific contexts as shown in the `hs_err` stack trace. > > I've seen similar issue and already fixed it in this fragment of code: > > class GetCurrentLocationClosure : public JvmtiUnitedHandshakeClosure { > . . . 
> void do_vthread(Handle target_h) { > assert(_target_jt == nullptr || !_target_jt->is_exiting(), "sanity check"); > // use jvmti_vthread() as vthread() can be outdated > assert(_target_jt == nullptr || _target_jt->jvmti_vthread() == target_h(), "sanity check"); > . . . > > The issue above was fixed by replacing `_target_jt->vthread()` with `_target_jt->jvmti_vthread()`. > > There are three places which need to be fixed the same way: > - `GetSingleStackTraceClosure::do_vthread(Handle target_h)` > - `SetForceEarlyReturn::do_vthread(Handle target_h)` > - `UpdateForPopTopFrameClosure::do_vthread(Handle target_h)` > > Testing: > - Run mach5 tiers 1-6 Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: - ?Merge? - review: updated same clarifying comment in several spots - add comments explaining that the vthread() can return outdated oop - 8330303: Crash: assert(_target_jt == nullptr || _target_jt->vthread() == target_h()) failed ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18806/files - new: https://git.openjdk.org/jdk/pull/18806/files/6ac3b54f..c5ad80d0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18806&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18806&range=01-02 Stats: 52905 lines in 671 files changed: 26069 ins; 23980 del; 2856 mod Patch: https://git.openjdk.org/jdk/pull/18806.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18806/head:pull/18806 PR: https://git.openjdk.org/jdk/pull/18806 From jzhu at openjdk.org Wed Apr 24 03:44:30 2024 From: jzhu at openjdk.org (Joshua Zhu) Date: Wed, 24 Apr 2024 03:44:30 GMT Subject: RFR: 8326541: [AArch64] ZGC C2 load barrier stub should consider the length of live registers when spilling registers [v4] In-Reply-To: References: Message-ID: On Tue, 23 Apr 2024 19:49:12 GMT, 
Stuart Monteith wrote: >> Joshua Zhu has updated the pull request incrementally with one additional commit since the last revision: >> >> Add more output for easy debugging once the jtreg test case fails > > Hello - I have no other comments - looks good. Thank you a lot for the reviews! @stooart-mon @fisk @robcasloz ------------- PR Comment: https://git.openjdk.org/jdk/pull/17977#issuecomment-2073964078 From stefank at openjdk.org Wed Apr 24 05:39:35 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 24 Apr 2024 05:39:35 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v13] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: On Fri, 19 Apr 2024 09:49:33 GMT, Afshin Zafari wrote: >> `MEMFLAGS flag` is used to hold/show the type of the memory regions in NMT. Each call of NMT API requires a search through the list of memory regions. >> The Hotspot code reserves/commits/uncommits memory regions and later calls explicitly NMT API with a specific memory type (e.g., `mtGC`, `mtJavaHeap`) for that region. Therefore, there are two search in the list of regions per reserve/commit/uncommit operations, one for the operation and another for setting the type of the region. >> When the memory type is passed in during reserve/commit/uncommit operations, NMT can use it and avoid the extra search for setting the memory type. >> >> Tests: tiers1-5 passed on linux-x64, macosx-aarch64 and windows-x64 for debug and non-debug builds. > > Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: > > removed extra blank line. I'm approving this. I think it will be OK to handle the last few comments as separate RFEs. ------------- Marked as reviewed by stefank (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18745#pullrequestreview-2018871645 From stefank at openjdk.org Wed Apr 24 05:39:35 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 24 Apr 2024 05:39:35 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v7] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> <7TW9a7Vmnz0nIKq83rYx_VN13PXM9_9nD5iSMzGDfNw=.127fd0ff-ee60-40cf-9994-9a1e81bb5b27@github.com> <4p0uq_t37Fkj9fxqD1QC8TOkgAyyW1PVmTknURCquG4=.22b762b8-dea4-4fe3-a19f-d6a3f26c9f27@github.com> Message-ID: On Tue, 23 Apr 2024 13:08:27 GMT, Johan Sjölen wrote: >> New info: >> I added the check and it always failed, meaning that no uncommit flag matches with commit flag. >> I will remove the mandatory flag from uncommit, but it hides the issue of non-matching commit-uncommit flags. > > Keeping the flag argument for `os::uncommit_memory` is important as it is equivalent to `reserve`:ing the memory. This makes future work easier, as we don't have to look at the region to figure out what flag it needs to be reserved as. Is there a problem with looking at the region to figure out the flag of the currently committed memory region? On the surface that seems like a reasonable thing to do. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1577284106 From jzhu at openjdk.org Wed Apr 24 05:47:36 2024 From: jzhu at openjdk.org (Joshua Zhu) Date: Wed, 24 Apr 2024 05:47:36 GMT Subject: Integrated: 8326541: [AArch64] ZGC C2 load barrier stub should consider the length of live registers when spilling registers In-Reply-To: References: Message-ID: On Fri, 23 Feb 2024 08:11:24 GMT, Joshua Zhu wrote: > Currently ZGC C2 load barrier stub saves the whole live register regardless of what size of register is live on aarch64.
> Considering the size of SVE register is an implementation-defined multiple of 128 bits, up to 2048 bits, > even the use of a floating point may cause the maximum 2048 bits stack occupied. > Hence I would like to introduce this change on aarch64: take the length of live registers into consideration in ZGC C2 load barrier stub. > > In a floating point case on 2048 bits SVE machine, the following ZLoadBarrierStubC2 > > > ...... > 0x0000ffff684cfad8: stp x15, x18, [sp, #80] > 0x0000ffff684cfadc: sub sp, sp, #0x100 > 0x0000ffff684cfae0: str z16, [sp] > 0x0000ffff684cfae4: add x1, x13, #0x10 > 0x0000ffff684cfae8: mov x0, x16 > ;; 0xFFFF803F5414 > 0x0000ffff684cfaec: mov x8, #0x5414 // #21524 > 0x0000ffff684cfaf0: movk x8, #0x803f, lsl #16 > 0x0000ffff684cfaf4: movk x8, #0xffff, lsl #32 > 0x0000ffff684cfaf8: blr x8 > 0x0000ffff684cfafc: mov x16, x0 > 0x0000ffff684cfb00: ldr z16, [sp] > 0x0000ffff684cfb04: add sp, sp, #0x100 > 0x0000ffff684cfb08: ptrue p7.b > 0x0000ffff684cfb0c: ldp x4, x5, [sp, #16] > ...... > > > could be optimized into: > > > ...... > 0x0000ffff684cfa50: stp x15, x18, [sp, #80] > 0x0000ffff684cfa54: str d16, [sp, #-16]! // extra 8 bytes to align 16 bytes in push_fp() > 0x0000ffff684cfa58: add x1, x13, #0x10 > 0x0000ffff684cfa5c: mov x0, x16 > ;; 0xFFFF7FA942A8 > 0x0000ffff684cfa60: mov x8, #0x42a8 // #17064 > 0x0000ffff684cfa64: movk x8, #0x7fa9, lsl #16 > 0x0000ffff684cfa68: movk x8, #0xffff, lsl #32 > 0x0000ffff684cfa6c: blr x8 > 0x0000ffff684cfa70: mov x16, x0 > 0x0000ffff684cfa74: ldr d16, [sp], #16 > 0x0000ffff684cfa78: ptrue p7.b > 0x0000ffff684cfa7c: ldp x4, x5, [sp, #16] > ...... > > > Besides the above benefit, when we know what size of register is live, > we could remove the unnecessary caller save in ZGC C2 load barrier stub when we meet C-ABI SOE fp registers. > > Passed jtreg with option "-XX:+UseZGC -XX:+ZGenerational" with no failures introduced. This pull request has now been integrated. 
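As a rough worked check of the numbers in the quoted description (a sketch, not code from the patch): a full 2048-bit SVE register occupies 256 bytes, which matches `sub sp, sp, #0x100` in the first listing, while a live double needs 8 bytes, padded to 16 for stack alignment, which matches `str d16, [sp, #-16]!` in the second. The names `LiveWidth`, `align_up`, and `spill_bytes` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical classification of how much of an FP/vector register is live.
enum class LiveWidth { Double64, Neon128, SveFull };

// AArch64 requires sp to be 16-byte aligned when used as a base address.
constexpr std::size_t kStackAlign = 16;

constexpr std::size_t align_up(std::size_t n, std::size_t a) {
  return (n + a - 1) / a * a;
}

// Stack bytes needed to spill one register of the given live width.
// sve_vector_bits is the implementation-defined SVE length (128..2048).
constexpr std::size_t spill_bytes(LiveWidth w, std::size_t sve_vector_bits) {
  return align_up(w == LiveWidth::Double64 ? 8
                  : w == LiveWidth::Neon128 ? 16
                                            : sve_vector_bits / 8,
                  kStackAlign);
}

// Whole-register save on a 2048-bit SVE machine: 0x100 bytes.
static_assert(spill_bytes(LiveWidth::SveFull, 2048) == 0x100, "256 bytes");
// Only the low 64 bits live: 8 bytes of data, padded to 16.
static_assert(spill_bytes(LiveWidth::Double64, 2048) == 16, "16 bytes");
```

Under these assumptions, the stack space for this one register drops from 256 bytes to 16 in the floating-point case shown in the listings.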
Changeset: 5c383860 Author: Joshua Zhu Committer: Tobias Hartmann URL: https://git.openjdk.org/jdk/commit/5c3838605d48d7f2db981c5e821c08d84856c53c Stats: 710 lines in 7 files changed: 645 ins; 8 del; 57 mod 8326541: [AArch64] ZGC C2 load barrier stub should consider the length of live registers when spilling registers Reviewed-by: eosterlund, rcastanedalo ------------- PR: https://git.openjdk.org/jdk/pull/17977 From cjplummer at openjdk.org Wed Apr 24 05:48:31 2024 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 24 Apr 2024 05:48:31 GMT Subject: RFR: 8330303: Crash: assert(_target_jt == nullptr || _target_jt->vthread() == target_h()) failed [v3] In-Reply-To: <0hQI5q8pztB_S9cIYZf70NkBgV_eFBSrFT5jEdMvtFo=.06cbbdd8-c84f-4d60-819b-a8364af32bb6@github.com> References: <0hQI5q8pztB_S9cIYZf70NkBgV_eFBSrFT5jEdMvtFo=.06cbbdd8-c84f-4d60-819b-a8364af32bb6@github.com> Message-ID: On Wed, 24 Apr 2024 02:49:56 GMT, Serguei Spitsyn wrote: >> This is a simple fix of three similar asserts. >> The `_target_jt->jvmti_vthread()` has to be used instead of `_target_jt->vthread()`. >> The `_target_jt->vthread()` can be outdated in some specific contexts as shown in the `hs_err` stack trace. >> >> I've seen similar issue and already fixed it in this fragment of code: >> >> class GetCurrentLocationClosure : public JvmtiUnitedHandshakeClosure { >> . . . >> void do_vthread(Handle target_h) { >> assert(_target_jt == nullptr || !_target_jt->is_exiting(), "sanity check"); >> // use jvmti_vthread() as vthread() can be outdated >> assert(_target_jt == nullptr || _target_jt->jvmti_vthread() == target_h(), "sanity check"); >> . . . >> >> The issue above was fixed by replacing `_target_jt->vthread()` with `_target_jt->jvmti_vthread()`. 
>> >> There are three places which need to be fixed the same way: >> - `GetSingleStackTraceClosure::do_vthread(Handle target_h)` >> - `SetForceEarlyReturn::do_vthread(Handle target_h)` >> - `UpdateForPopTopFrameClosure::do_vthread(Handle target_h)` >> >> Testing: >> - Run mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: > > - ?Merge? > - review: updated same clarifying comment in several spots > - add comments explaining that the vthread() can return outdated oop > - 8330303: Crash: assert(_target_jt == nullptr || _target_jt->vthread() == target_h()) failed src/hotspot/share/prims/jvmtiEnvBase.cpp line 2079: > 2077: void > 2078: GetSingleStackTraceClosure::do_vthread(Handle target_h) { > 2079: // Use jvmti_vthread() instead of vthread() as target could have temporary changed Suggestion: // Use jvmti_vthread() instead of vthread() as target could have temporarily changed src/hotspot/share/prims/jvmtiEnvBase.hpp line 509: > 507: void do_vthread(Handle target_h) { > 508: assert(_target_jt != nullptr, "sanity check"); > 509: // Use jvmti_vthread() instead of vthread() as target could have temporary changed Suggestion: // Use jvmti_vthread() instead of vthread() as target could have temporarily changed src/hotspot/share/prims/jvmtiEnvBase.hpp line 531: > 529: void do_vthread(Handle target_h) { > 530: assert(_target_jt != nullptr, "sanity check"); > 531: // Use jvmti_vthread() instead of vthread() as target could have temporary changed Suggestion: // Use jvmti_vthread() instead of vthread() as target could have temporarily changed ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18806#discussion_r1577299642 PR Review Comment: https://git.openjdk.org/jdk/pull/18806#discussion_r1577299918 PR Review Comment: 
https://git.openjdk.org/jdk/pull/18806#discussion_r1577300305 From smonteith at openjdk.org Wed Apr 24 07:20:29 2024 From: smonteith at openjdk.org (Stuart Monteith) Date: Wed, 24 Apr 2024 07:20:29 GMT Subject: RFR: 8330171: Lazy W^X switch implementation In-Reply-To: References: <9eymaXovxUNFdkAkzojFQP5trwl_yyY0jE2GzcMEjR4=.02ee2ef9-c476-4c7c-9e4a-e021425c38bc@github.com> Message-ID: <_o-DnTvlCXTP8lho_6sJOEwgOBgn5lzYEJno-uCVRqQ=.ffb3f85e-89dd-44b4-b650-6a4ba79ba20d@github.com> On Tue, 23 Apr 2024 15:11:10 GMT, Andrew Haley wrote: >> What about granting `WXWrite` only if the current thread is in `_thread_in_vm`? >> That would be more restrictive and roughly equivalent how it currently works. Likely there are some places then that should be granted `WXWrite` eagerly because they need `WXWrite` without `_thread_in_vm`. E.g. the JIT compiler threads should have `WXWrite` and never `WXExec` (I assume) which should be checked in the signal handler. > >> The patch doesn't protect against native agents, as this is obviously impossible. The current code doesn't do that either. For the bytecode, it doesn't prevent the attacker from abusing unsafe api to modify code cache. However unsafe functions are already considered "safe" and we proactively enable WXWrite as well as move thread to `_thread_in_vm` state (@reinrich). JITed code can't write to the cache either with or without the patch. >> >> I totally get the sense of loss of security. But is this really the case? > > I think it is. W^X is intended (amongst other things) to protect against the use of gadgets, from buffer overflow exploits in non-java code to ROP programming. At present, in order to generate code and execute it, you first have to be able to make the JIT code writable, then write the code, then make it executable. then jump to the code. And the exploit writer might have to do some or all of this by finding gadgets. 
If we were to merge this patch then all the attacker would have to do is write code to memory and find a way to jump to it, and the automatic switch-on-segfault in this patch would do the all the work the attacker needs. > > It makes far more sense to tag those places that actually need to change W^X access, and only switch there. > > You could argue that any switching of W^X on a write to code space, then switching it back on jumping (or returning) to Java code, even what we already do, is effectively the same thing. Kinda, but it's not on just any attempt to write to code space or any attempt to jump into code, it's at the places we choose, and we can be careful to limit those places. > > But surely the JDK is not the most vulnerable part of the stack anyway? I'd agree with that, of course, but I don't think that's sufficient reason to decide to bypass an OS security mechanism. > > We are trying to reduce the size of the attack surface. To add a little to @theRealAph 's point, we should avoid painting ourselves into a corner. I don't know how the platform is going to evolve, but I'd be nervous about fighting against the intentions of the protections. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18762#issuecomment-2074244082 From ayang at openjdk.org Wed Apr 24 07:29:29 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 24 Apr 2024 07:29:29 GMT Subject: RFR: 8330155: Serial: Remove TenuredSpace [v3] In-Reply-To: References: Message-ID: On Tue, 23 Apr 2024 17:22:41 GMT, Guoxiong Li wrote: >> Hi all, >> >> This patch removes the class `TenuredSpace` and adjusts its usages. After removing `TenuredSpace`, the file `space.inline.hpp` is empty, so I remove this file and change the included header file to `space.hpp`. >> >> The test `make test-tier1_gc` passed locally. Thanks for taking the time to review. 
>> >> Best Regards, >> -- Guoxiong > > Guoxiong Li has updated the pull request incrementally with one additional commit since the last revision: > > Fix included header file error after merging master. Marked as reviewed by ayang (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18894#pullrequestreview-2019079434 From jsjolen at openjdk.org Wed Apr 24 08:32:42 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Wed, 24 Apr 2024 08:32:42 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v46] In-Reply-To: References: Message-ID: <4Wb-5p1RHZCGgGdmEyeOXwSVXyMSYU1QUcUwxT0RO8Q=.63612ef5-d48a-419d-9fb3-1fd35139dc7e@github.com> On Tue, 23 Apr 2024 13:21:49 GMT, Afshin Zafari wrote: >> Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision: >> >> - Remove faulty condition after removing merging >> - Add failing test case > > src/hotspot/share/nmt/nmtMemoryFileTracker.cpp line 45: > >> 43: void MemoryFileTracker::allocate_memory(MemoryFile* device, size_t offset, >> 44: size_t size, MEMFLAGS flag, >> 45: const NativeCallStack& stack) { > > indentation does not match with the line above. Fixed > src/hotspot/share/nmt/nmtMemoryFileTracker.cpp line 123: > >> 121: void MemoryFileTracker::Instance::allocate_memory(MemoryFile* device, size_t offset, >> 122: size_t size, MEMFLAGS flag, >> 123: const NativeCallStack& stack) { > > indentation ... Fixed > src/hotspot/share/nmt/nmtMemoryFileTracker.cpp line 128: > >> 126: >> 127: void MemoryFileTracker::Instance::free_memory(MemoryFile* device, size_t offset, >> 128: size_t size) { > > indentation. Fixed > src/hotspot/share/nmt/nmtMemoryFileTracker.cpp line 138: > >> 136: >> 137: void MemoryFileTracker::Instance::print_report_on(const MemoryFile* device, >> 138: outputStream* stream, size_t scale) { > > indentation.
Fixed > src/hotspot/share/nmt/nmtMemoryFileTracker.cpp line 152: > >> 150: auto snap = snapshot->by_type(NMTUtil::index_to_flag(i)); >> 151: auto current = device->_summary.by_type(NMTUtil::index_to_flag(i)); >> 152: // PDT stores the memory as reserved but it's accounted as committed. > > What does PDT stand for? Fixed > src/hotspot/share/nmt/nmtMemoryFileTracker.hpp line 49: > >> 47: >> 48: // Each device has its own memory space. >> 49: using DeviceSpace = VMATree; > > `DeviceSpace` is used only 3 times in 2 methods in cpp file. `VMATree` is also used all around the code. > `VMATree` is preferable then. OK > src/hotspot/share/nmt/nmtMemoryFileTracker.hpp line 62: > >> 60: MemoryFile(const char* descriptive_name) >> 61: : _descriptive_name(descriptive_name) { >> 62: } > > can fit into 1 line. Fixed > src/hotspot/share/nmt/nmtTreap.hpp line 26: > >> 24: >> 25: #ifndef SHARE_NMT_TREAP_HPP >> 26: #define SHARE_NMT_TREAP_HPP > > SHARE_NMT_NMTTREAP_HPP Fixed > src/hotspot/share/nmt/vmatree.cpp line 201: > >> 199: }); >> 200: >> 201: AddressState prev = {A, stA}; // stA is just filler > > `AddressState prev{A, stA};` would be like other instances of ctor. I.e., remove the `=` sign. 
Fixed ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1577496651 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1577496492 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1577496379 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1577495303 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1577495232 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1577495058 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1577498708 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1577498619 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1577501137 From thartmann at openjdk.org Wed Apr 24 09:08:43 2024 From: thartmann at openjdk.org (Tobias Hartmann) Date: Wed, 24 Apr 2024 09:08:43 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory [v26] In-Reply-To: References: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com> Message-ID: On Sat, 20 Apr 2024 22:31:48 GMT, Scott Gibbons wrote: >> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. >> >> Overall, making this an intrinsic improves overall performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). >> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 37 commits: > > - Merge branch 'openjdk:master' into setMemory > - Fix UnsafeCopyMemoryMark scope issue > - Long to short jmp; other cleanup > - Review comments > - Address review comments; update copyright years > - Add enter() and leave(); remove Windows-specific register stuff > - Fix memory mark after sync to upstream > - Merge branch 'openjdk:master' into setMemory > - Set memory test (#23) > > * Even more review comments > > * Re-write of atomic copy loops > > * Change name of UnsafeCopyMemory{,Mark} to UnsafeMemory{Access,Mark} > > * Only add a memory mark for byte unaligned fill > > * Remove MUSL_LIBC ifdef > > * Remove MUSL_LIBC ifdef > - Set memory test (#22) > > * Even more review comments > > * Re-write of atomic copy loops > > * Change name of UnsafeCopyMemory{,Mark} to UnsafeMemory{Access,Mark} > > * Only add a memory mark for byte unaligned fill > - ... and 27 more: https://git.openjdk.org/jdk/compare/6d569961...1122b500 This introduced a regression, see [JDK-8331033](https://bugs.openjdk.org/browse/JDK-8331033). ------------- PR Comment: https://git.openjdk.org/jdk/pull/18555#issuecomment-2074459781 From jsjolen at openjdk.org Wed Apr 24 09:13:37 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Wed, 24 Apr 2024 09:13:37 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v13] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: <-qHwqRoCF7ddx1FHf5SVmh4fPB6Yx5ZNIufv8a1kREs=.549e1fed-3dad-4094-a3b4-006a10becd7e@github.com> On Fri, 19 Apr 2024 09:49:33 GMT, Afshin Zafari wrote: >> `MEMFLAGS flag` is used to hold/show the type of the memory regions in NMT. Each call of NMT API requires a search through the list of memory regions. 
>> The Hotspot code reserves/commits/uncommits memory regions and later calls explicitly NMT API with a specific memory type (e.g., `mtGC`, `mtJavaHeap`) for that region. Therefore, there are two search in the list of regions per reserve/commit/uncommit operations, one for the operation and another for setting the type of the region. >> When the memory type is passed in during reserve/commit/uncommit operations, NMT can use it and avoid the extra search for setting the memory type. >> >> Tests: tiers1-5 passed on linux-x64, macosx-aarch64 and windows-x64 for debug and non-debug builds. > > Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: > > removed extra blank line. I am also fine with approving this, but I'd like to wait for Thomas's approval before this gets integrated since we're touching so much of metaspace. ------------- Marked as reviewed by jsjolen (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18745#pullrequestreview-2019329390 From jsjolen at openjdk.org Wed Apr 24 09:13:38 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Wed, 24 Apr 2024 09:13:38 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v7] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> <7TW9a7Vmnz0nIKq83rYx_VN13PXM9_9nD5iSMzGDfNw=.127fd0ff-ee60-40cf-9994-9a1e81bb5b27@github.com> <4p0uq_t37Fkj9fxqD1QC8TOkgAyyW1PVmTknURCquG4=.22b762b8-dea4-4fe3-a19f-d6a3f26c9f27@github.com> Message-ID: On Wed, 24 Apr 2024 05:35:37 GMT, Stefan Karlsson wrote: >> Keeping the flag argument for `os::uncommit_memory` is important as it is equivalent to `reserve`:ing the memory. This makes future work easier, as we don't have to look at the region to figure out what flag it needs to be reserved as. 
> > Is there a problem with looking at the region to figure out the flag of the currently committed memory region? On the surface that seems like a reasonable thing to do. It's not a problem per se, it's just nice not to have to :). Not a blocker, to be clear. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1577558517 From ihse at openjdk.org Wed Apr 24 09:20:34 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Wed, 24 Apr 2024 09:20:34 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v6] In-Reply-To: References: Message-ID: On Fri, 19 Jan 2024 12:08:21 GMT, Julian Waters wrote: >> Julian Waters has updated the pull request incrementally with one additional commit since the last revision: >> >> Require clang 13 in toolchain.m4 > > Should I split the compiler upgrades into a different change and integrate that first? Going off the conversation in this thread it would seem like the compiler upgrade would benefit us a lot more than just having C++17 (The noreturn attribute is one big motivating factor for instance) and it might help if the compiler upgrades were not delayed by the discussion of when to jump to C++17 @TheShermanTanker I suggest you close this PR. If we are going to switch to C++17, it should start by a discussion in the mailing list, not with a PR (the change itself is trivial). 
------------- PR Comment: https://git.openjdk.org/jdk/pull/14988#issuecomment-2074487573 From jsjolen at openjdk.org Wed Apr 24 09:45:32 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Wed, 24 Apr 2024 09:45:32 GMT Subject: RFR: 8330532: Improve line-oriented text parsing in HotSpot [v3] In-Reply-To: <2K-VA9DRH9DAgDL9HB__STvlnE0gSBRjPNU3NLOrZT0=.7ee74867-cf57-4c13-bd54-751425d2793a@github.com> References: <4__55RnizjcZwBGgP4QlfXXX6HBzn5jbRn_xrRPE4uM=.994bc41d-4bb3-4b63-b6dc-b533b598d0a6@github.com> <2K-VA9DRH9DAgDL9HB__STvlnE0gSBRjPNU3NLOrZT0=.7ee74867-cf57-4c13-bd54-751425d2793a@github.com> Message-ID: On Tue, 23 Apr 2024 18:34:23 GMT, John R Rose wrote: >> Sometimes redundant comments are helpful. I think this one is. YMMV. > > The `override` keyword is nice; thank you. > > I have already argued against the removal of `set_input`. And `set_input` needs `close`. > > I think `set_input` is not YAGNI but YIWNI = Yes I will need it. The reply that "you can just wrap another i-stream around the new i-source" is fallacious because of the performance model of i-stream. Sorry, I'm still not on board with the `close` operation and I'm against `set_input` calling `close()` :-). Why is it necessary for the `inputStream` to require a file to be re-opened if the `inputStream` switches from one file to another? To be clear: OK, we want `set_input` because we don't want to allocate two small buffers, that's fine by me. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18833#discussion_r1577605487 From dholmes at openjdk.org Wed Apr 24 09:56:29 2024 From: dholmes at openjdk.org (David Holmes) Date: Wed, 24 Apr 2024 09:56:29 GMT Subject: RFR: 8327885: runtime/Unsafe/InternalErrorTest.java enters endless loop on Alpine aarch64 In-Reply-To: References: Message-ID: On Wed, 13 Mar 2024 07:34:11 GMT, Dmitry Cherepanov wrote: > [JDK-8322163](https://bugs.openjdk.org/browse/JDK-8322163) replaced memset with a for loop on Alpine.
This fixed the test on Alpine x86_64 but it enters endless loop on Alpine aarch64. > > The loop causes SIGBUS to be generated and the signal handler continues to the next instruction. As gcc generates strb with auto-increment on aarch64, the increment will be skipped. > > The patch makes the counter volatile to prevent compilers from generating strb with auto-increment. With the patch, the test passes on Alpine aarch64. As a fix for MUSL_LIBC only this is fine. Thanks ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18262#pullrequestreview-2019425221 From sspitsyn at openjdk.org Wed Apr 24 09:59:56 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 24 Apr 2024 09:59:56 GMT Subject: RFR: 8330303: Crash: assert(_target_jt == nullptr || _target_jt->vthread() == target_h()) failed [v3] In-Reply-To: References: <0hQI5q8pztB_S9cIYZf70NkBgV_eFBSrFT5jEdMvtFo=.06cbbdd8-c84f-4d60-819b-a8364af32bb6@github.com> Message-ID: On Wed, 24 Apr 2024 05:44:42 GMT, Chris Plummer wrote: >> Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: >> >> - ?Merge? >> - review: updated same clarifying comment in several spots >> - add comments explaining that the vthread() can return outdated oop >> - 8330303: Crash: assert(_target_jt == nullptr || _target_jt->vthread() == target_h()) failed > > src/hotspot/share/prims/jvmtiEnvBase.cpp line 2079: > >> 2077: void >> 2078: GetSingleStackTraceClosure::do_vthread(Handle target_h) { >> 2079: // Use jvmti_vthread() instead of vthread() as target could have temporary changed > > Suggestion: > > // Use jvmti_vthread() instead of vthread() as target could have temporarily changed Good catch, fixed now. 
> src/hotspot/share/prims/jvmtiEnvBase.hpp line 509: > >> 507: void do_vthread(Handle target_h) { >> 508: assert(_target_jt != nullptr, "sanity check"); >> 509: // Use jvmti_vthread() instead of vthread() as target could have temporary changed > > Suggestion: > > // Use jvmti_vthread() instead of vthread() as target could have temporarily changed Good catch, thanks. Fixed now. > src/hotspot/share/prims/jvmtiEnvBase.hpp line 531: > >> 529: void do_vthread(Handle target_h) { >> 530: assert(_target_jt != nullptr, "sanity check"); >> 531: // Use jvmti_vthread() instead of vthread() as target could have temporary changed > > Suggestion: > > // Use jvmti_vthread() instead of vthread() as target could have temporarily changed Good catch, thanks. Fixed now. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18806#discussion_r1577621820 PR Review Comment: https://git.openjdk.org/jdk/pull/18806#discussion_r1577623055 PR Review Comment: https://git.openjdk.org/jdk/pull/18806#discussion_r1577622556 From sspitsyn at openjdk.org Wed Apr 24 09:59:54 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 24 Apr 2024 09:59:54 GMT Subject: RFR: 8330303: Crash: assert(_target_jt == nullptr || _target_jt->vthread() == target_h()) failed [v4] In-Reply-To: References: Message-ID: > This is a simple fix of three similar asserts. > The `_target_jt->jvmti_vthread()` has to be used instead of `_target_jt->vthread()`. > The `_target_jt->vthread()` can be outdated in some specific contexts as shown in the `hs_err` stack trace. > > I've seen similar issue and already fixed it in this fragment of code: > > class GetCurrentLocationClosure : public JvmtiUnitedHandshakeClosure { > . . . > void do_vthread(Handle target_h) { > assert(_target_jt == nullptr || !_target_jt->is_exiting(), "sanity check"); > // use jvmti_vthread() as vthread() can be outdated > assert(_target_jt == nullptr || _target_jt->jvmti_vthread() == target_h(), "sanity check"); > . . . 
> > The issue above was fixed by replacing `_target_jt->vthread()` with `_target_jt->jvmti_vthread()`. > > There are three places which need to be fixed the same way: > - `GetSingleStackTraceClosure::do_vthread(Handle target_h)` > - `SetForceEarlyReturn::do_vthread(Handle target_h)` > - `UpdateForPopTopFrameClosure::do_vthread(Handle target_h)` > > Testing: > - Run mach5 tiers 1-6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: fixed typo in same comment in several spots ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18806/files - new: https://git.openjdk.org/jdk/pull/18806/files/c5ad80d0..643c3046 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18806&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18806&range=02-03 Stats: 4 lines in 3 files changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/18806.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18806/head:pull/18806 PR: https://git.openjdk.org/jdk/pull/18806 From azafari at openjdk.org Wed Apr 24 10:00:37 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Wed, 24 Apr 2024 10:00:37 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v13] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: On Tue, 23 Apr 2024 16:46:39 GMT, Thomas Stuefe wrote: >> Should I create a RFE for it? > > Sure, go ahead. You can assign this to me, if you want. 
This is created for this issue: https://bugs.openjdk.org/browse/JDK-8331039 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1577625047 From dholmes at openjdk.org Wed Apr 24 10:25:41 2024 From: dholmes at openjdk.org (David Holmes) Date: Wed, 24 Apr 2024 10:25:41 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v13] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: On Tue, 23 Apr 2024 08:46:42 GMT, Afshin Zafari wrote: >> src/hotspot/share/cds/metaspaceShared.cpp line 1332: >> >>> 1330: // NMT: fix up the space tags >>> 1331: MemTracker::record_virtual_memory_type(archive_space_rs.base(), mtClassShared); >>> 1332: MemTracker::record_virtual_memory_type(class_space_rs.base(), mtClass); >> >> I assumed these (and others) were removed because the `MemTracker` updates had been pushed down into `ReserveSpace` itself, but I can't find them there - what am I missing? > > `archive_space_rs` and `class_space_rs` pass the MEMFLAGS to the `ReservedSpace` ctors a few lines above at 1272, 1319 and 1321. Yes I see the flags being passed to `ReservedSpace` but I don't see where `ReservedSpace` then calls `record_virtual_memory_type`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1577656637 From dnsimon at openjdk.org Wed Apr 24 10:50:44 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Wed, 24 Apr 2024 10:50:44 GMT Subject: RFR: 8330755: ProblemList files have entries referring to non-existent tests [v2] In-Reply-To: <5vZvc83Zn4IhI5s_IdYqRqw4zjWF93TcQUzl2cD5JLU=.12464c13-9ccc-47d8-851e-883f3fea4a04@github.com> References: <5vZvc83Zn4IhI5s_IdYqRqw4zjWF93TcQUzl2cD5JLU=.12464c13-9ccc-47d8-851e-883f3fea4a04@github.com> Message-ID: > This PR adds a check for the format of ProblemList files and ensures they only have entries referring to existing tests. 
> > The cleanups in the second commit of this PR were done based on the output of `CheckProblemLists`: > >> make test TEST=build/problemLists/CheckProblemLists.java > ... > STDOUT: > Checking /Users/dnsimon/dev/jdk-jdk/open/test/hotspot/jtreg/ProblemList-Virtual.txt > Checking /Users/dnsimon/dev/jdk-jdk/open/test/hotspot/jtreg/ProblemList-Xcomp.txt > Checking /Users/dnsimon/dev/jdk-jdk/open/test/hotspot/jtreg/ProblemList-generational-zgc.txt > Checking /Users/dnsimon/dev/jdk-jdk/open/test/hotspot/jtreg/ProblemList-zgc.txt > Checking /Users/dnsimon/dev/jdk-jdk/open/test/hotspot/jtreg/ProblemList.txt > Checking /Users/dnsimon/dev/jdk-jdk/open/test/jaxp/ProblemList.txt > Checking /Users/dnsimon/dev/jdk-jdk/open/test/jdk/ProblemList-Virtual.txt > Checking /Users/dnsimon/dev/jdk-jdk/open/test/jdk/ProblemList-Xcomp.txt > Checking /Users/dnsimon/dev/jdk-jdk/open/test/jdk/ProblemList-generational-zgc.txt > Checking /Users/dnsimon/dev/jdk-jdk/open/test/jdk/ProblemList-zgc.txt > Checking /Users/dnsimon/dev/jdk-jdk/open/test/jdk/ProblemList.txt > Checking /Users/dnsimon/dev/jdk-jdk/open/test/langtools/ProblemList.txt > Checking /Users/dnsimon/dev/jdk-jdk/open/test/lib-test/ProblemList.txt > Checked 13 problem list files > Test roots: > /Users/dnsimon/dev/jdk-jdk/open/test/jdk > /Users/dnsimon/dev/jdk-jdk/open/test/lib-test > /Users/dnsimon/dev/jdk-jdk/open/test/failure_handler/test > /Users/dnsimon/dev/jdk-jdk/open/test/jaxp > /Users/dnsimon/dev/jdk-jdk/open/test/langtools > /Users/dnsimon/dev/jdk-jdk/open/test/hotspot/jtreg > Following errors found: > /Users/dnsimon/dev/jdk-jdk/open/test/hotspot/jtreg/ProblemList.txt:174: vmTestbase/gc/lock/jni/jnilock002/TestDescription.java does not exist under any test root > vmTestbase/gc/lock/jni/jnilock002/TestDescription.java 8192647 generic-all > > /Users/dnsimon/dev/jdk-jdk/open/test/jdk/ProblemList-Virtual.txt:77: TestAndIssue[test=java/util/Properties/StoreReproducibilityTest.java, issueId=0000000] duplicates 
/Users/dnsimon/dev/jdk-jdk/open/test/jdk/ProblemList-Virtual.txt:76 > java/util/Properties/StoreReproducibilityTest.java 0000000 generic-all > > /Users/dnsimon/dev/jdk-jdk/open/test/jdk/ProblemList.txt:516: java/lang/management/MemoryMXBean/PendingAllGC.sh does not exist under any test root > java/lang/management/MemoryMXBean/PendingAllGC.sh 8158837 generic-all > > /Users/dnsimon/dev/jdk-jdk/open/test/jdk/ProblemList.txt:667: javax/swing/JFi... Doug Simon has updated the pull request incrementally with one additional commit since the last revision: removed CheckProblemLists.java ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18879/files - new: https://git.openjdk.org/jdk/pull/18879/files/49a1a58e..22ffae05 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18879&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18879&range=00-01 Stats: 211 lines in 1 file changed: 0 ins; 211 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18879.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18879/head:pull/18879 PR: https://git.openjdk.org/jdk/pull/18879 From dnsimon at openjdk.org Wed Apr 24 10:50:44 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Wed, 24 Apr 2024 10:50:44 GMT Subject: RFR: 8330755: ProblemList files have entries referring to non-existent tests In-Reply-To: <5vZvc83Zn4IhI5s_IdYqRqw4zjWF93TcQUzl2cD5JLU=.12464c13-9ccc-47d8-851e-883f3fea4a04@github.com> References: <5vZvc83Zn4IhI5s_IdYqRqw4zjWF93TcQUzl2cD5JLU=.12464c13-9ccc-47d8-851e-883f3fea4a04@github.com> Message-ID: On Sun, 21 Apr 2024 22:00:52 GMT, Doug Simon wrote: > This PR adds a check for the format of ProblemList files and ensures they only have entries referring to existing tests. > > The cleanups in the second commit of this PR were done based on the output of `CheckProblemLists`: > >> make test TEST=build/problemLists/CheckProblemLists.java > ... 
> STDOUT: > Checking /Users/dnsimon/dev/jdk-jdk/open/test/hotspot/jtreg/ProblemList-Virtual.txt > Checking /Users/dnsimon/dev/jdk-jdk/open/test/hotspot/jtreg/ProblemList-Xcomp.txt > Checking /Users/dnsimon/dev/jdk-jdk/open/test/hotspot/jtreg/ProblemList-generational-zgc.txt > Checking /Users/dnsimon/dev/jdk-jdk/open/test/hotspot/jtreg/ProblemList-zgc.txt > Checking /Users/dnsimon/dev/jdk-jdk/open/test/hotspot/jtreg/ProblemList.txt > Checking /Users/dnsimon/dev/jdk-jdk/open/test/jaxp/ProblemList.txt > Checking /Users/dnsimon/dev/jdk-jdk/open/test/jdk/ProblemList-Virtual.txt > Checking /Users/dnsimon/dev/jdk-jdk/open/test/jdk/ProblemList-Xcomp.txt > Checking /Users/dnsimon/dev/jdk-jdk/open/test/jdk/ProblemList-generational-zgc.txt > Checking /Users/dnsimon/dev/jdk-jdk/open/test/jdk/ProblemList-zgc.txt > Checking /Users/dnsimon/dev/jdk-jdk/open/test/jdk/ProblemList.txt > Checking /Users/dnsimon/dev/jdk-jdk/open/test/langtools/ProblemList.txt > Checking /Users/dnsimon/dev/jdk-jdk/open/test/lib-test/ProblemList.txt > Checked 13 problem list files > Test roots: > /Users/dnsimon/dev/jdk-jdk/open/test/jdk > /Users/dnsimon/dev/jdk-jdk/open/test/lib-test > /Users/dnsimon/dev/jdk-jdk/open/test/failure_handler/test > /Users/dnsimon/dev/jdk-jdk/open/test/jaxp > /Users/dnsimon/dev/jdk-jdk/open/test/langtools > /Users/dnsimon/dev/jdk-jdk/open/test/hotspot/jtreg > Following errors found: > /Users/dnsimon/dev/jdk-jdk/open/test/hotspot/jtreg/ProblemList.txt:174: vmTestbase/gc/lock/jni/jnilock002/TestDescription.java does not exist under any test root > vmTestbase/gc/lock/jni/jnilock002/TestDescription.java 8192647 generic-all > > /Users/dnsimon/dev/jdk-jdk/open/test/jdk/ProblemList-Virtual.txt:77: TestAndIssue[test=java/util/Properties/StoreReproducibilityTest.java, issueId=0000000] duplicates /Users/dnsimon/dev/jdk-jdk/open/test/jdk/ProblemList-Virtual.txt:76 > java/util/Properties/StoreReproducibilityTest.java 0000000 generic-all > > 
/Users/dnsimon/dev/jdk-jdk/open/test/jdk/ProblemList.txt:516: java/lang/management/MemoryMXBean/PendingAllGC.sh does not exist under any test root > java/lang/management/MemoryMXBean/PendingAllGC.sh 8158837 generic-all > > /Users/dnsimon/dev/jdk-jdk/open/test/jdk/ProblemList.txt:667: javax/swing/JFi... I've removed `CheckProblemLists.java` as it overlaps with https://bugs.openjdk.org/browse/CODETOOLS-7903659. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18879#issuecomment-2074660269 From azafari at openjdk.org Wed Apr 24 11:38:37 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Wed, 24 Apr 2024 11:38:37 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v13] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: On Wed, 24 Apr 2024 10:23:21 GMT, David Holmes wrote: >> `archive_space_rs` and `class_space_rs` pass the MEMFLAGS to the `ReservedSpace` ctors a few lines above at 1272, 1319 and 1321. > > Yes I see the flags being passed to `ReservedSpace` but I don't see where `ReservedSpace` then calls `record_virtual_memory_type`. The `flag` is passed down in the call chain of `ReservedSpace ctor` -> `initialize()` -> `reserve()` -> `reserve_memory()` -> `os::xxx_reserve_memory_yyy()` where is passed to `MemTracker`. There will be no need to specifically call `MemTracker::record_virtual_memory_type(..., flag)` since the flag is already sent to `MemTracker` when reserving the region. 
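[Editorial illustration, not part of the original message and not HotSpot code.] The point about the flag travelling with the reservation can be sketched with a toy tracker. Every name below (`ToyTracker`, `ToyReservedSpace`, `MemFlag`, `record_reserve`, `record_type`) is hypothetical and only loosely modelled on the roles of `MemTracker`, `ReservedSpace` and `MEMFLAGS` in this thread. The sketch counts only the extra search needed to tag a region after the fact; it does not model the search that the reserve operation itself may perform.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for MEMFLAGS values; this is not the real HotSpot enum.
enum class MemFlag { None, Class, ClassShared };

struct Region {
  std::uintptr_t base;
  std::size_t size;
  MemFlag flag;
};

// Toy tracker (assumption: regions are kept in a simple list).
class ToyTracker {
 public:
  // Old style, step 1: the region is recorded untagged.
  void record_reserve(std::uintptr_t base, std::size_t size) {
    _regions.push_back({base, size, MemFlag::None});
  }

  // Old style, step 2: tagging later costs a search through the region list.
  bool record_type(std::uintptr_t base, MemFlag flag) {
    ++_lookups;
    for (Region& r : _regions) {
      if (r.base == base) {
        r.flag = flag;
        return true;
      }
    }
    return false;
  }

  // New style: the flag arrives with the reservation, so no later search is needed.
  void record_reserve(std::uintptr_t base, std::size_t size, MemFlag flag) {
    _regions.push_back({base, size, flag});
  }

  // Query helper for the example; it is not counted as a tagging lookup.
  MemFlag flag_of(std::uintptr_t base) const {
    for (const Region& r : _regions) {
      if (r.base == base) return r.flag;
    }
    return MemFlag::None;
  }

  int lookups() const { return _lookups; }

 private:
  std::vector<Region> _regions;
  int _lookups = 0;
};

// Hypothetical wrapper: the constructor forwards the flag down to the tracker,
// standing in for the ctor -> initialize -> reserve -> os-level chain.
struct ToyReservedSpace {
  ToyReservedSpace(ToyTracker& tracker, std::uintptr_t base, std::size_t size,
                   MemFlag flag) {
    tracker.record_reserve(base, size, flag);
  }
};
```

With this toy model, reserving a region untagged and then calling `record_type` performs one tagging search, whereas constructing a `ToyReservedSpace` with a flag performs none and the region is already tagged when it is recorded.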
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1577737436 From gli at openjdk.org Wed Apr 24 11:44:35 2024 From: gli at openjdk.org (Guoxiong Li) Date: Wed, 24 Apr 2024 11:44:35 GMT Subject: RFR: 8330155: Serial: Remove TenuredSpace [v3] In-Reply-To: References: Message-ID: On Tue, 23 Apr 2024 17:22:41 GMT, Guoxiong Li wrote: >> Hi all, >> >> This patch removes the class `TenuredSpace` and adjusts its usages. After removing `TenuredSpace`, the file `space.inline.hpp` is empty, so I remove this file and change the included header file to `space.hpp`. >> >> The test `make test-tier1_gc` passed locally. Thanks for taking the time to review. >> >> Best Regards, >> -- Guoxiong > > Guoxiong Li has updated the pull request incrementally with one additional commit since the last revision: > > Fix included header file error after merging master. Thanks for the reviews. Integrating. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18894#issuecomment-2074744514 From gli at openjdk.org Wed Apr 24 11:44:36 2024 From: gli at openjdk.org (Guoxiong Li) Date: Wed, 24 Apr 2024 11:44:36 GMT Subject: Integrated: 8330155: Serial: Remove TenuredSpace In-Reply-To: References: Message-ID: <2YnKivyjPw42Mzul57zi3XRl8gNWh9SMZFwiLfWEfM8=.c6759245-2395-4794-9313-f65f2b4f6d3d@github.com> On Mon, 22 Apr 2024 16:24:06 GMT, Guoxiong Li wrote: > Hi all, > > This patch removes the class `TenuredSpace` and adjusts its usages. After removing `TenuredSpace`, the file `space.inline.hpp` is empty, so I remove this file and change the included header file to `space.hpp`. > > The test `make test-tier1_gc` passed locally. Thanks for taking the time to review. > > Best Regards, > -- Guoxiong This pull request has now been integrated. 
Changeset: 2bb5cf5f Author: Guoxiong Li URL: https://git.openjdk.org/jdk/commit/2bb5cf5f33337b2cc40aca3bdd36400dc4af5723 Stats: 162 lines in 21 files changed: 11 ins; 127 del; 24 mod 8330155: Serial: Remove TenuredSpace Reviewed-by: ayang, cjplummer, tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/18894 From sspitsyn at openjdk.org Wed Apr 24 11:46:34 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 24 Apr 2024 11:46:34 GMT Subject: Integrated: 8330303: Crash: assert(_target_jt == nullptr || _target_jt->vthread() == target_h()) failed In-Reply-To: References: Message-ID: On Wed, 17 Apr 2024 00:29:52 GMT, Serguei Spitsyn wrote: > This is a simple fix of three similar asserts. > The `_target_jt->jvmti_vthread()` has to be used instead of `_target_jt->vthread()`. > The `_target_jt->vthread()` can be outdated in some specific contexts as shown in the `hs_err` stack trace. > > I've seen similar issue and already fixed it in this fragment of code: > > class GetCurrentLocationClosure : public JvmtiUnitedHandshakeClosure { > . . . > void do_vthread(Handle target_h) { > assert(_target_jt == nullptr || !_target_jt->is_exiting(), "sanity check"); > // use jvmti_vthread() as vthread() can be outdated > assert(_target_jt == nullptr || _target_jt->jvmti_vthread() == target_h(), "sanity check"); > . . . > > The issue above was fixed by replacing `_target_jt->vthread()` with `_target_jt->jvmti_vthread()`. > > There are three places which need to be fixed the same way: > - `GetSingleStackTraceClosure::do_vthread(Handle target_h)` > - `SetForceEarlyReturn::do_vthread(Handle target_h)` > - `UpdateForPopTopFrameClosure::do_vthread(Handle target_h)` > > Testing: > - Run mach5 tiers 1-6 This pull request has now been integrated. 
Changeset: 15190816 Author: Serguei Spitsyn URL: https://git.openjdk.org/jdk/commit/15190816f704f2e8681bc3e2d74832828a574106 Stats: 11 lines in 3 files changed: 7 ins; 0 del; 4 mod 8330303: Crash: assert(_target_jt == nullptr || _target_jt->vthread() == target_h()) failed Reviewed-by: pchilanomate, cjplummer, lmesnik ------------- PR: https://git.openjdk.org/jdk/pull/18806 From sjohanss at openjdk.org Wed Apr 24 12:06:44 2024 From: sjohanss at openjdk.org (Stefan Johansson) Date: Wed, 24 Apr 2024 12:06:44 GMT Subject: RFR: 8330626: ZGC: Windows address space placeholders not managed correctly [v2] In-Reply-To: References: Message-ID: On Tue, 23 Apr 2024 12:53:23 GMT, Stefan Karlsson wrote: >> Stefan Johansson has updated the pull request incrementally with two additional commits since the last revision: >> >> - Move GTEST_SKIP to setup function >> - StefanK review 2 > > Looks good. Thanks for finding and fixing this issue! Thanks for the reviews @stefank and @xmas92. Regarding the test comment above, me and Axel spoke offline and we both agree that all the cases listed are covered by the test as is. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18912#issuecomment-2074781608 From sjohanss at openjdk.org Wed Apr 24 12:06:45 2024 From: sjohanss at openjdk.org (Stefan Johansson) Date: Wed, 24 Apr 2024 12:06:45 GMT Subject: Integrated: 8330626: ZGC: Windows address space placeholders not managed correctly In-Reply-To: References: Message-ID: On Tue, 23 Apr 2024 11:38:17 GMT, Stefan Johansson wrote: > Please review this fix to correctly manage address space placeholders on Windows. > > **Summary** > On Windows, when using small pages, we use address space placeholders to ensure consistency of the address space. > > When a portion of the address space is mapped these placeholders are replaced by the actual backing and when doing this the size of the placeholder(s) needs to exactly match the size to be backed. 
For this reason, whenever address space is in use, we split the covering placeholder into multiple `ZGranuleSize` sized placeholders. > > During recent investigations into fragmentation of the ZGC address space, I found that there was a code path (**currently not in use**) that did not properly manage these placeholders and we could end up in situations where no placeholder was split off when a new chunk of `ZGranuleSize` size was requested. The problem is basically an off-by-one problem in the splitting code and the fix is to avoid this by changing it to first split the covering placeholder into two parts before splitting the part to be used into granules. > > **Testing** > * Manual testing using the included GTest as well as sample applications previously triggering the error case. > * Tier 1-5 Generational ZGC testing (ongoing) This pull request has now been integrated. Changeset: e311ba32 Author: Stefan Johansson URL: https://git.openjdk.org/jdk/commit/e311ba32a517a6389c683c3597d78f66fe52991e Stats: 232 lines in 3 files changed: 223 ins; 0 del; 9 mod 8330626: ZGC: Windows address space placeholders not managed correctly Reviewed-by: stefank, aboldtch ------------- PR: https://git.openjdk.org/jdk/pull/18912 From lujaniuk at openjdk.org Wed Apr 24 13:16:28 2024 From: lujaniuk at openjdk.org (Ludvig Janiuk) Date: Wed, 24 Apr 2024 13:16:28 GMT Subject: RFR: 8330755: ProblemList files have entries referring to non-existent tests [v2] In-Reply-To: References: <5vZvc83Zn4IhI5s_IdYqRqw4zjWF93TcQUzl2cD5JLU=.12464c13-9ccc-47d8-851e-883f3fea4a04@github.com> Message-ID: On Wed, 24 Apr 2024 10:50:44 GMT, Doug Simon wrote: >> This PR adds a check for the format of ProblemList files and ensures they only have entries referring to existing tests. >> >> The cleanups in the second commit of this PR were done based on the output of `CheckProblemLists`: >> >>> make test TEST=build/problemLists/CheckProblemLists.java >> ... 
>> STDOUT: >> Checking /Users/dnsimon/dev/jdk-jdk/open/test/hotspot/jtreg/ProblemList-Virtual.txt >> Checking /Users/dnsimon/dev/jdk-jdk/open/test/hotspot/jtreg/ProblemList-Xcomp.txt >> Checking /Users/dnsimon/dev/jdk-jdk/open/test/hotspot/jtreg/ProblemList-generational-zgc.txt >> Checking /Users/dnsimon/dev/jdk-jdk/open/test/hotspot/jtreg/ProblemList-zgc.txt >> Checking /Users/dnsimon/dev/jdk-jdk/open/test/hotspot/jtreg/ProblemList.txt >> Checking /Users/dnsimon/dev/jdk-jdk/open/test/jaxp/ProblemList.txt >> Checking /Users/dnsimon/dev/jdk-jdk/open/test/jdk/ProblemList-Virtual.txt >> Checking /Users/dnsimon/dev/jdk-jdk/open/test/jdk/ProblemList-Xcomp.txt >> Checking /Users/dnsimon/dev/jdk-jdk/open/test/jdk/ProblemList-generational-zgc.txt >> Checking /Users/dnsimon/dev/jdk-jdk/open/test/jdk/ProblemList-zgc.txt >> Checking /Users/dnsimon/dev/jdk-jdk/open/test/jdk/ProblemList.txt >> Checking /Users/dnsimon/dev/jdk-jdk/open/test/langtools/ProblemList.txt >> Checking /Users/dnsimon/dev/jdk-jdk/open/test/lib-test/ProblemList.txt >> Checked 13 problem list files >> Test roots: >> /Users/dnsimon/dev/jdk-jdk/open/test/jdk >> /Users/dnsimon/dev/jdk-jdk/open/test/lib-test >> /Users/dnsimon/dev/jdk-jdk/open/test/failure_handler/test >> /Users/dnsimon/dev/jdk-jdk/open/test/jaxp >> /Users/dnsimon/dev/jdk-jdk/open/test/langtools >> /Users/dnsimon/dev/jdk-jdk/open/test/hotspot/jtreg >> Following errors found: >> /Users/dnsimon/dev/jdk-jdk/open/test/hotspot/jtreg/ProblemList.txt:174: vmTestbase/gc/lock/jni/jnilock002/TestDescription.java does not exist under any test root >> vmTestbase/gc/lock/jni/jnilock002/TestDescription.java 8192647 generic-all >> >> /Users/dnsimon/dev/jdk-jdk/open/test/jdk/ProblemList-Virtual.txt:77: TestAndIssue[test=java/util/Properties/StoreReproducibilityTest.java, issueId=0000000] duplicates /Users/dnsimon/dev/jdk-jdk/open/test/jdk/ProblemList-Virtual.txt:76 >> java/util/Properties/StoreReproducibilityTest.java 0000000 generic-all >> >> 
/Users/dnsimon/dev/jdk-jdk/open/test/jdk/ProblemList.txt:516: java/lang/management/MemoryMXBean/PendingAllGC.sh does not exist under any test root >> java/lang/management/MemoryMXBean/PendingAllGC.sh 8158837 generic-all >> >> ... > > Doug Simon has updated the pull request incrementally with one additional commit since the last revision: > > removed CheckProblemLists.java While not a blocker IMO, I'm curious about the issues for the removed lines. Taking the first one as an example, I see it's "unresolved" (JDK-8192647) but the file was removed in JDK-8289764. I don't see any other mentions of "problemlist" in JDK-8192647 so the "problemlist" label should probably also be removed. I think it would be good to just do a check through the other issues and see if any other bookkeeping needs to be done, or if any surprises pop up. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18879#issuecomment-2074921452 From dnsimon at openjdk.org Wed Apr 24 13:28:30 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Wed, 24 Apr 2024 13:28:30 GMT Subject: RFR: 8330755: ProblemList files have entries referring to non-existent tests [v2] In-Reply-To: References: <5vZvc83Zn4IhI5s_IdYqRqw4zjWF93TcQUzl2cD5JLU=.12464c13-9ccc-47d8-851e-883f3fea4a04@github.com> Message-ID: On Wed, 24 Apr 2024 13:14:02 GMT, Ludvig Janiuk wrote: > While not a blocker IMO, I'm curious about the issues for the removed lines. Taking the first one as an example, I see it's "unresolved" (JDK-8192647) but the file was removed in JDK-8289764. I don't see any other mentions of "problemlist" in JDK-8192647 so the "problemlist" label should probably also be removed. > > I think it would be good to just do a check through the other issues and see if any other bookkeeping needs to be done, or if any surprises pop up. Ok, I'm pinging people here who git blame associates with some of the removed entries: @walulyai , @lmesnik @kumarabhi006 However, I don't see how removing these entries can cause any problems. 
Someone would have noticed failing problem-listed tests by now. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18879#issuecomment-2074947452 From duke at openjdk.org Wed Apr 24 13:39:29 2024 From: duke at openjdk.org (Mikhail Ablakatov) Date: Wed, 24 Apr 2024 13:39:29 GMT Subject: RFR: 8322770: Implement C2 VectorizedHashCode on AArch64 In-Reply-To: References: <2VKOC-rT0vOyMcXUX2gs3sOrbZ5H79KBIo50sOOVmyI=.1936f78e-794c-4f54-af3c-b1b97e5fafa8@github.com> <2stGKhgwZPG0HXj65IZioZBlOud2FMcTqGe89_ggCzs=.088f733d-f156-4178-8020-0b7b84c8764d@github.com> Message-ID: On Mon, 22 Apr 2024 14:42:40 GMT, Mikhail Ablakatov wrote: > You only need one load, add, and multiply per iteration. > You don't need to add across columns until the end. @theRealAph , I've tried to follow the suggested approach, please find the patch and result in https://github.com/mikabl-arm/jdk/commit/e352f30d89e99417231ae7bb66b325c68a76eef9 . So far I wasn't able to see any performance benefits compared to the implementation from this MR. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18487#issuecomment-2074972579 From gziemski at openjdk.org Wed Apr 24 14:29:35 2024 From: gziemski at openjdk.org (Gerard Ziemski) Date: Wed, 24 Apr 2024 14:29:35 GMT Subject: Integrated: 8324577: [REDO] - [IMPROVE] OPEN_MAX is no longer the max limit on macOS >= 10.6 for RLIMIT_NOFILE In-Reply-To: <1rubknQG6ntQ32o_dCF64U97R3jfyiyNZOms5-_k14g=.fc79bdea-da14-40cb-a35f-1290ec7e11d7@github.com> References: <1rubknQG6ntQ32o_dCF64U97R3jfyiyNZOms5-_k14g=.fc79bdea-da14-40cb-a35f-1290ec7e11d7@github.com> Message-ID: On Wed, 17 Apr 2024 16:49:25 GMT, Gerard Ziemski wrote: > This is a 3rd attempt of the same fix: > > 1st one had to be pulled out because of a bug in zsh > 2nd one had a workaround for the bug in zsh, but then uncovered an issue in JDWP (JDK-8324668), which was subsequently fixed. > > Tested with MACH5 tier1-9 with no unique or new failures on macOS This pull request has now been integrated. 
Changeset: f1d0e715 Author: Gerard Ziemski URL: https://git.openjdk.org/jdk/commit/f1d0e715b67e2ca47b525069d8153abbb33f75b9 Stats: 17 lines in 1 file changed: 9 ins; 0 del; 8 mod 8324577: [REDO] - [IMPROVE] OPEN_MAX is no longer the max limit on macOS >= 10.6 for RLIMIT_NOFILE Reviewed-by: dcubed, dholmes ------------- PR: https://git.openjdk.org/jdk/pull/18821 From gziemski at openjdk.org Wed Apr 24 14:59:37 2024 From: gziemski at openjdk.org (Gerard Ziemski) Date: Wed, 24 Apr 2024 14:59:37 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v44] In-Reply-To: References: Message-ID: On Mon, 22 Apr 2024 14:53:16 GMT, Johan Sjölen wrote: >> Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: >> >> Rename to AddressComparator > > Fixed a bug that Afshin found w.r.t. summary accounting. The double-arrow thing really helped out there, so I'm happy with keeping that. @jdksjolen Are you going to update the code in response to Afshin's feedback sometime soon? I find it a bit hard to look at the code tagged with so many comments, so if you are thinking about updating it sometime soon, I'd prefer to wait before reviewing it. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18289#issuecomment-2075150063 From sspitsyn at openjdk.org Wed Apr 24 16:11:50 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 24 Apr 2024 16:11:50 GMT Subject: RFR: 8330969: scalability issue with loaded JVMTI agent Message-ID: This is a fix of the following JVMTI scalability issue. A closed benchmark with millions of virtual threads shows 4X overhead when a JVMTI agent has been loaded. For instance, this is observable when an app is executed under control of the Oracle Studio `collect` utility. 
The benchmark takes a little bit more than 3 sec without any JVMTI agent: Total: in 3045 ms The benchmark takes about 3.2X as long when executed with the `collect` utility: Creating experiment database test.1.er (Process ID: 25262) ... Picked up JAVA_TOOL_OPTIONS: -agentlib:collector Total: in 9864 ms With the fix in place the overhead of a JVMTI agent is around 1.2X: Creating experiment database test.1.er (Process ID: 26442) ... Picked up JAVA_TOOL_OPTIONS: -agentlib:collector Total: in 3765 ms Most of the overhead is taken by two functions: - `JvmtiVTMSTransitionDisabler::start_VTMS_transition()` - `JvmtiVTMSTransitionDisabler::finish_VTMS_transition()` Oracle Studio Performance Analyzer `er_print` utility shows the following performance data for these functions: ``` % er_print -viewmode expert -metrics ie.%totalcpu -csingle JvmtiVTMSTransitionDisabler::start_VTMS_transition test.1.er Attr. Total Name CPU sec. % =============== Callers 42.930 50.06 SharedRuntime::notify_jvmti_vthread_mount(oopDesc*, unsigned char, JavaThread*) 21.505 25.08 JvmtiVTMSTransitionDisabler::VTMS_vthread_end(_jobject*) 21.315 24.86 JvmtiVTMSTransitionDisabler::VTMS_vthread_unmount(_jobject*, bool) =============== Stack Fragment 81.407 94.94 JvmtiVTMSTransitionDisabler::start_VTMS_transition(_jobject*, bool) =============== Callees 4.083 4.76 java_lang_Thread::set_is_in_VTMS_transition(oopDesc*, bool) 0.140 0.16 __tls_get_addr 0.120 0.14 JNIHandles::resolve_external_guard(_jobject*) % er_print -viewmode expert -metrics ie.%totalcpu -csingle JvmtiVTMSTransitionDisabler::finish_VTMS_transition test.1.er Attr. Total Name CPU sec. 
% =============== Callers 47.363 52.59 SharedRuntime::notify_jvmti_vthread_unmount(oopDesc*, unsigned char, JavaThread*) 21.355 23.71 JvmtiVTMSTransitionDisabler::VTMS_vthread_mount(_jobject*, bool) 21.345 23.70 JvmtiVTMSTransitionDisabler::VTMS_vthread_start(_jobject*) =============== Stack Fragment 64.145 71.22 JvmtiVTMSTransitionDisabler::finish_VTMS_transition(_jobject*, bool) =============== Callees 25.288 28.08 java_lang_Thread::set_is_in_VTMS_transition(oopDesc*, bool) 0.240 0.27 __tls_get_addr 0.200 0.22 JavaThread::set_is_in_VTMS_transition(bool) 0.190 0.21 JNIHandles::resolve_external_guard(_jobject*) The main source of this overhead (~90% of overhead) is atomic increment and decrement of the global counter `VTMS_transition_count`: - `Atomic::inc(&_VTMS_transition_count)`; - `Atomic::dec(&_VTMS_transition_count)`; The fix is to replace this global counter with mark bits `_VTMS_transition_mark` distributed over all `JavaThread`'s. If these lines are commented out or replaced with the distributed thread-local marks the main performance overhead is gone: % er_print -viewmode expert -metrics ie.%totalcpu -csingle JvmtiVTMSTransitionDisabler::start_VTMS_transition test.2.er Attr. Total Name CPU sec. % ============== Callers 1.801 64.29 SharedRuntime::notify_jvmti_vthread_mount(oopDesc*, unsigned char, JavaThread*) 0.580 20.71 JvmtiVTMSTransitionDisabler::VTMS_vthread_unmount(_jobject*, bool) 0.420 15.00 JvmtiVTMSTransitionDisabler::VTMS_vthread_end(_jobject*) ============== Stack Fragment 0.630 22.50 JvmtiVTMSTransitionDisabler::start_VTMS_transition(_jobject*, bool) ============== Callees 1.931 68.93 java_lang_Thread::set_is_in_VTMS_transition(oopDesc*, bool) 0.220 7.86 __tls_get_addr 0.020 0.71 JNIHandles::resolve_external_guard(_jobject*) % er_print -viewmode expert -metrics ie.%totalcpu -csingle JvmtiVTMSTransitionDisabler::finish_VTMS_transition test.2.er Attr. Total Name CPU sec. 
% ============== Callers 1.661 39.15 JvmtiVTMSTransitionDisabler::VTMS_vthread_mount(_jobject*, bool) 1.351 31.84 JvmtiVTMSTransitionDisabler::VTMS_vthread_start(_jobject*) 1.231 29.01 SharedRuntime::notify_jvmti_vthread_unmount(oopDesc*, unsigned char, JavaThread*) ============== Stack Fragment 0.500 11.79 JvmtiVTMSTransitionDisabler::finish_VTMS_transition(_jobject*, bool) ============== Callees 2.972 70.05 java_lang_Thread::set_is_in_VTMS_transition(oopDesc*, bool) 0.350 8.25 JavaThread::set_is_in_VTMS_transition(bool) 0.340 8.02 __tls_get_addr 0.080 1.89 JNIHandles::resolve_external_guard(_jobject*) The rest of the overhead (~10% of total overhead) is taken by calls to the function `java_lang_Thread::set_is_in_VTMS_transition()`. The plan is to address this in a separate fix. But it is expected to be a little bit more tricky. Testing: - Tested with mach5 tiers 1-6 ------------- Commit messages: - 8330969: scalability issue with loaded JVMTI agent Changes: https://git.openjdk.org/jdk/pull/18937/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18937&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8330969 Stats: 38 lines in 5 files changed: 14 ins; 11 del; 13 mod Patch: https://git.openjdk.org/jdk/pull/18937.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18937/head:pull/18937 PR: https://git.openjdk.org/jdk/pull/18937 From aph at openjdk.org Wed Apr 24 16:33:28 2024 From: aph at openjdk.org (Andrew Haley) Date: Wed, 24 Apr 2024 16:33:28 GMT Subject: RFR: 8322770: Implement C2 VectorizedHashCode on AArch64 In-Reply-To: <2VKOC-rT0vOyMcXUX2gs3sOrbZ5H79KBIo50sOOVmyI=.1936f78e-794c-4f54-af3c-b1b97e5fafa8@github.com> References: <2VKOC-rT0vOyMcXUX2gs3sOrbZ5H79KBIo50sOOVmyI=.1936f78e-794c-4f54-af3c-b1b97e5fafa8@github.com> Message-ID: On Tue, 26 Mar 2024 13:59:12 GMT, Mikhail Ablakatov wrote: > Hello, > > Please review the following PR for [JDK-8322770 Implement C2 VectorizedHashCode on AArch64](https://bugs.openjdk.org/browse/JDK-8322770). 
It follows previous work done in https://github.com/openjdk/jdk/pull/16629 and https://github.com/openjdk/jdk/pull/10847 for RISC-V and x86 respectively. > > The code to calculate a hash code consists of two parts: a vectorized loop of Neon instructions that process 4 or 8 elements per iteration depending on the data type and a fully unrolled scalar "loop" that processes up to 7 tail elements. > > At the time of writing this I don't see potential benefits from providing SVE/SVE2 implementation, but it could be added as a follow-up or independently later if required. > > # Performance > > ## Neoverse N1 > > > -------------------------------------------------------------------------------------------- > Version Baseline This patch > -------------------------------------------------------------------------------------------- > Benchmark (size) Mode Cnt Score Error Score Error Units > -------------------------------------------------------------------------------------------- > ArraysHashCode.bytes 1 avgt 15 1.249 ± 0.060 1.247 ± 0.062 ns/op > ArraysHashCode.bytes 10 avgt 15 8.754 ± 0.028 4.387 ± 0.015 ns/op > ArraysHashCode.bytes 100 avgt 15 98.596 ± 0.051 26.655 ± 0.097 ns/op > ArraysHashCode.bytes 10000 avgt 15 10150.578 ± 1.352 2649.962 ± 216.744 ns/op > ArraysHashCode.chars 1 avgt 15 1.286 ± 0.062 1.246 ± 0.054 ns/op > ArraysHashCode.chars 10 avgt 15 8.731 ± 0.002 5.344 ± 0.003 ns/op > ArraysHashCode.chars 100 avgt 15 98.632 ± 0.048 23.023 ± 0.142 ns/op > ArraysHashCode.chars 10000 avgt 15 10150.658 ± 3.374 2410.504 ± 8.872 ns/op > ArraysHashCode.ints 1 avgt 15 1.189 ± 0.005 1.187 ± 0.001 ns/op > ArraysHashCode.ints 10 avgt 15 8.730 ± 0.002 5.676 ± 0.001 ns/op > ArraysHashCode.ints 100 avgt 15 98.559 ± 0.016 24.378 ± 0.006 ns/op > ArraysHashCode.ints 10000 avgt 15 10148.752 ± 1.336 2419.015 ± 0.492 ns/op > ArraysHashCode.multibytes 1 avgt 15 1.037 ± 0.001 1.037 ± 0.001 ns/op > ArraysHashCode.multibytes 10 avgt 15 5.4... 
> > You only need one load, add, and multiply per iteration. > > You don't need to add across columns until the end. > > @theRealAph , I've tried to follow the suggested approach, please find the patch and result in [mikabl-arm at e352f30](https://github.com/mikabl-arm/jdk/commit/e352f30d89e99417231ae7bb66b325c68a76eef9) . > > So far I wasn't able to see any performance benefits compared to the implementation from this MR. Yeah, true. I can see why that's happening from prof perfnorm: 4.30% ? 0x0000ffff70b3cdec: mul v1.4s, v1.4s, v3.4s 0.45% ? 0x0000ffff70b3cdf0: ld1 {v0.4s}, [x1], #16 81.54% ? 0x0000ffff70b3cdf4: add v1.4s, v1.4s, v0.4s 4.83% ? 0x0000ffff70b3cdf8: subs w2, w2, #4 3.55% ? 0x0000ffff70b3cdfc: b.hs #0xffff70b3cdec ArraysHashCode.ints:IPC 1024 avgt 1.395 insns/clk This is 1.4 insns/clk on a machine that can run 8 insns/clk. Because we're doing one load, then the MAC, then another load after the MAC, then a MAC that depends on the load: we stall the whole core waiting for the next load. Everything is serialized. Neoverse looks the same as Apple M1 here. I guess the real question here is what we want. x86's engineers get this: Benchmark (size) Mode Cnt Score Error Units ArraysHashCode.ints 1 avgt 5 0.834 ± 0.001 ns/op ArraysHashCode.ints 10 avgt 5 5.500 ± 0.016 ns/op ArraysHashCode.ints 100 avgt 5 20.330 ± 0.103 ns/op ArraysHashCode.ints 10000 avgt 5 1365.347 ± 1.045 ns/op (And that's on my desktop box from 2018, an inferior piece of hardware.) This is how they do it: ? 0x00007f0634c21c17: imul ebx,r11d 0.02% ? 0x00007f0634c21c1b: vmovdqu ymm12,YMMWORD PTR [rdi+rsi*4] ? 0x00007f0634c21c20: vmovdqu ymm2,YMMWORD PTR [rdi+rsi*4+0x20] 5.36% ? 0x00007f0634c21c26: vmovdqu ymm0,YMMWORD PTR [rdi+rsi*4+0x40] ? 0x00007f0634c21c2c: vmovdqu ymm1,YMMWORD PTR [rdi+rsi*4+0x60] 0.05% ? 0x00007f0634c21c32: vpmulld ymm8,ymm8,ymm3 11.12% ? 0x00007f0634c21c37: vpaddd ymm8,ymm8,ymm12 4.97% ? 0x00007f0634c21c3c: vpmulld ymm9,ymm9,ymm3 15.09% ? 
0x00007f0634c21c41: vpaddd ymm9,ymm9,ymm2 5.16% ? 0x00007f0634c21c45: vpmulld ymm10,ymm10,ymm3 15.51% ? 0x00007f0634c21c4a: vpaddd ymm10,ymm10,ymm0 5.44% ? 0x00007f0634c21c4e: vpmulld ymm11,ymm11,ymm3 16.39% ? 0x00007f0634c21c53: vpaddd ymm11,ymm11,ymm1 4.80% ? 0x00007f0634c21c57: add esi,0x20 ? 0x00007f0634c21c5a: cmp esi,ecx ? 0x00007f0634c21c5c: jl 0x00007f0634c21c17 So, do we want to try to beat them on Arm, or not? They surely want to beat Arm. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18487#issuecomment-2075376464 From duke at openjdk.org Wed Apr 24 17:01:59 2024 From: duke at openjdk.org (Volodymyr Paprotski) Date: Wed, 24 Apr 2024 17:01:59 GMT Subject: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v4] In-Reply-To: References: Message-ID: > Performance. Before: > > Benchmark (algorithm) (dataSize) (keyLength) (provider) Mode Cnt Score Error Units > SignatureBench.ECDSA.sign SHA256withECDSA 1024 256 thrpt 3 6443.934 ± 6.491 ops/s > SignatureBench.ECDSA.sign SHA256withECDSA 16384 256 thrpt 3 6152.979 ± 4.954 ops/s > SignatureBench.ECDSA.verify SHA256withECDSA 1024 256 thrpt 3 1895.410 ± 36.979 ops/s > SignatureBench.ECDSA.verify SHA256withECDSA 16384 256 thrpt 3 1878.955 ± 45.487 ops/s > Benchmark (algorithm) (keyLength) (kpgAlgorithm) (provider) Mode Cnt Score Error Units > o.o.b.j.c.full.KeyAgreementBench.EC.generateSecret ECDH 256 EC thrpt 3 1357.810 ± 26.584 ops/s > o.o.b.j.c.small.KeyAgreementBench.EC.generateSecret ECDH 256 EC thrpt 3 1352.119 ± 23.547 ops/s > Benchmark (isMontBench) Mode Cnt Score Error Units > PolynomialP256Bench.benchMultiply false thrpt 3 1746.126 ± 10.970 ops/s > > Performance, no intrinsic: > > Benchmark (algorithm) (dataSize) (keyLength) (provider) Mode Cnt Score Error Units > SignatureBench.ECDSA.sign SHA256withECDSA 1024 256 thrpt 3 6529.839 ± 42.420 ops/s > SignatureBench.ECDSA.sign SHA256withECDSA 16384 256 thrpt 3 6199.747 ± 
133.566 ops/s > SignatureBench.ECDSA.verify SHA256withECDSA 1024 256 thrpt 3 1973.676 ± 54.071 ops/s > SignatureBench.ECDSA.verify SHA256withECDSA 16384 256 thrpt 3 1932.127 ± 35.920 ops/s > Benchmark (algorithm) (keyLength) (kpgAlgorithm) (provider) Mode Cnt Score Error Units > o.o.b.j.c.full.KeyAgreementBench.EC.generateSecret ECDH 256 EC thrpt 3 1355.788 ± 29.858 ops/s > o.o.b.j.c.small.KeyAgreementBench.EC.generateSecret ECDH 256 EC thrpt 3 1346.523 ± 28.722 ops/s > Benchmark (isMontBench) Mode Cnt Score Error Units > PolynomialP256Bench.benchMultiply true thrpt 3 1919.574 ± 10.591 ops/s > > Performance, **with intrinsics*... Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision: Comments from Tony and Jatin ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18583/files - new: https://git.openjdk.org/jdk/pull/18583/files/6f9ac046..c93a71f0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18583&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18583&range=02-03 Stats: 48 lines in 2 files changed: 20 ins; 20 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/18583.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18583/head:pull/18583 PR: https://git.openjdk.org/jdk/pull/18583 From duke at openjdk.org Wed Apr 24 17:01:59 2024 From: duke at openjdk.org (Volodymyr Paprotski) Date: Wed, 24 Apr 2024 17:01:59 GMT Subject: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v2] In-Reply-To: References: Message-ID: On Tue, 16 Apr 2024 02:26:57 GMT, Jatin Bhateja wrote: >> Per-above, this is a switch statement (`UNLIKELY`) fallback. I can still add alignment and loop rotation, but being a fallback figured its more important to keep it small&readable... > > It's all part of intrinsic, no harm in polishing it. Done (normalized loop/backedge). There was actually a problem in the loop counter.. (`i-=1` instead of `i-=16`). 
Can't include a test since classes are sealed, but verified manually. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18583#discussion_r1578172873 From duke at openjdk.org Wed Apr 24 17:02:00 2024 From: duke at openjdk.org (Volodymyr Paprotski) Date: Wed, 24 Apr 2024 17:02:00 GMT Subject: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v3] In-Reply-To: References: <-64Xlhk6ln43-xTmlv_cvloS-gzDrKMyiPUdPbMNlIM=.2b524654-ca5b-4a7a-a7da-316e99cfea35@github.com> Message-ID: <6lemy0F_PaECRhIOAlCkUCSvGeE8kaAZ7RpqoB1nJeQ=.721e6902-76b3-4112-923f-4dcbeeebb94f@github.com> On Tue, 23 Apr 2024 19:55:57 GMT, Anthony Scarpino wrote: >> Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision: >> >> Comments from Jatin and Tony > > src/java.base/share/classes/sun/security/ec/ECOperations.java line 204: > >> 202: * @return the product >> 203: */ >> 204: public MutablePoint multiply(AffinePoint affineP, byte[] s) { > > It seems like there could be some combining of both `multiply()`. If `multiply(AffinePoint, ...)` is called, it can call `DefaultMultiplier` with the `affineP`, but internally call the other `multiply(ECPoint, ...)` for the other situations. I'd rather not have two methods doing most of the same code, but different methods. Thanks, they indeed look identical, didnt notice. Fixed. (repeated the same hashmap refactoring and didnt notice I produced identical code twice) > src/java.base/share/classes/sun/security/ec/ECOperations.java line 467: > >> 465: sealed static abstract class SmallWindowMultiplier implements PointMultiplier >> 466: permits DefaultMultiplier, DefaultMontgomeryMultiplier { >> 467: private final AffinePoint affineP; > > I don't think `affineP` needs to be a class variable anymore. It's only used in the constructor Didn't notice, thanks, fixed. 
> src/java.base/share/classes/sun/security/ec/ECOperations.java line 592: > >> 590: } >> 591: >> 592: private final ProjectivePoint.Immutable[][] points; > > Can you define this at the top please. Done > src/java.base/share/classes/sun/security/ec/ECOperations.java line 668: > >> 666: } >> 667: >> 668: private final BigInteger[] base; > > Can you define this at the top. You use it in the constructor but it's defined later on. Done ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18583#discussion_r1578117929 PR Review Comment: https://git.openjdk.org/jdk/pull/18583#discussion_r1578147190 PR Review Comment: https://git.openjdk.org/jdk/pull/18583#discussion_r1578148562 PR Review Comment: https://git.openjdk.org/jdk/pull/18583#discussion_r1578150303 From duke at openjdk.org Wed Apr 24 17:02:00 2024 From: duke at openjdk.org (Volodymyr Paprotski) Date: Wed, 24 Apr 2024 17:02:00 GMT Subject: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v2] In-Reply-To: References: Message-ID: On Tue, 9 Apr 2024 02:01:36 GMT, Anthony Scarpino wrote: >> Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision: >> >> remove use of jdk.crypto.ec > > src/java.base/share/classes/sun/security/ec/ECOperations.java line 308: > >> 306: >> 307: /* >> 308: * public Point addition. Used by ECDSAOperations > > Was the old description not applicable anymore? It would be nice to improve on the existing description that shortening it. Forgot to go back and fix the comment. Fixed.. As for the 'meaning'. Notice the signature of the function changed (i.e. no longer a 'mixed point', but two ProjectivePoints. This is a good idea regardless of Montgomery, but it affects montgomery particularly badly (need to compute zInv for 'no reason'. ) For sake of completeness. Apart from constructor, the 'API' for ECOperations (i.e. 
as used by ECDHE, ECDSAOperations and KeyGeneration) are these three functions (everything else is used internally by this class) public void setSum(MutablePoint p, MutablePoint p2) public MutablePoint multiply(AffinePoint affineP, byte[] s) public MutablePoint multiply(ECPoint ecPoint, byte[] s) > src/java.base/share/classes/sun/security/ec/ECOperations.java line 321: > >> 319: ECOperations ops = this; >> 320: if (this.montgomeryOps != null) { >> 321: assert p.getField() instanceof IntegerMontgomeryFieldModuloP; > > This should throw a ProviderException, I believe this would throw an AssertionException Missed this comment. No longer applicable (this.montgomeryOps got refactored away) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18583#discussion_r1578144125 PR Review Comment: https://git.openjdk.org/jdk/pull/18583#discussion_r1578161140 From sspitsyn at openjdk.org Wed Apr 24 18:26:29 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 24 Apr 2024 18:26:29 GMT Subject: RFR: 8330969: scalability issue with loaded JVMTI agent In-Reply-To: References: Message-ID: <8TqEB2KwelGJhLIvqRMWK2nVFlYAqKvoCffkvqacsHc=.88eb3ad9-5a39-4ecf-88c7-afe8e7a69943@github.com> On Wed, 24 Apr 2024 16:04:30 GMT, Serguei Spitsyn wrote: > This is a fix of the following JVMTI scalability issue. A closed benchmark with millions of virtual threads shows 3X-4X overhead when a JVMTI agent has been loaded. For instance, this is observable when an app is executed under control of the Oracle Studio `collect` utility. > For performance analysis, experiments and numbers, please, see the comment below this description. > > The fix is to replace the global counter `_VTMS_transition_count` with the mark bit `_VTMS_transition_mark` in each `JavaThread`'. 
> > Testing: > - Tested with mach5 tiers 1-6 The benchmark takes a little bit more than 3 sec without any JVMTI agent: Total: in 3045 ms The benchmark takes more than ~3.2X of the above when executed with the `collect` utility: Creating experiment database test.1.er (Process ID: 25262) ... Picked up JAVA_TOOL_OPTIONS: -agentlib:collector Total: in 9864 ms With the fix in place the overhead of a JVMTI agent is around 1.2X: Creating experiment database test.1.er (Process ID: 26442) ... Picked up JAVA_TOOL_OPTIONS: -agentlib:collector Total: in 3765 ms The most of the overhead is taken by two functions: - `JvmtiVTMSTransitionDisabler::start_VTMS_transition()` - `JvmtiVTMSTransitionDisabler::finish_VTMS_transition()` Oracle Studio Performance Analyzer `err_print` utility shows the following performance data for these functions: ``` % er_print -viewmode expert -metrics ie.%totalcpu -csingle JvmtiVTMSTransitionDisabler::start_VTMS_transition test.1.er Attr. Total Name CPU sec. % =============== Callers 42.930 50.06 SharedRuntime::notify_jvmti_vthread_mount(oopDesc*, unsigned char, JavaThread*) 21.505 25.08 JvmtiVTMSTransitionDisabler::VTMS_vthread_end(_jobject*) 21.315 24.86 JvmtiVTMSTransitionDisabler::VTMS_vthread_unmount(_jobject*, bool) =============== Stack Fragment 81.407 94.94 JvmtiVTMSTransitionDisabler::start_VTMS_transition(_jobject*, bool) =============== Callees 4.083 4.76 java_lang_Thread::set_is_in_VTMS_transition(oopDesc*, bool) 0.140 0.16 __tls_get_addr 0.120 0.14 JNIHandles::resolve_external_guard(_jobject*) % er_print -viewmode expert -metrics ie.%totalcpu -csingle JvmtiVTMSTransitionDisabler::finish_VTMS_transition test.1.er Attr. Total Name CPU sec. 
% =============== Callers 47.363 52.59 SharedRuntime::notify_jvmti_vthread_unmount(oopDesc*, unsigned char, JavaThread*) 21.355 23.71 JvmtiVTMSTransitionDisabler::VTMS_vthread_mount(_jobject*, bool) 21.345 23.70 JvmtiVTMSTransitionDisabler::VTMS_vthread_start(_jobject*) =============== Stack Fragment 64.145 71.22 JvmtiVTMSTransitionDisabler::finish_VTMS_transition(_jobject*, bool) =============== Callees 25.288 28.08 java_lang_Thread::set_is_in_VTMS_transition(oopDesc*, bool) 0.240 0.27 __tls_get_addr 0.200 0.22 JavaThread::set_is_in_VTMS_transition(bool) 0.190 0.21 JNIHandles::resolve_external_guard(_jobject*) The main source of this overhead (~90% of overhead) is atomic increment and decrement of the global counter `VTMS_transition_count`: - `Atomic::inc(&_VTMS_transition_count)`; - `Atomic::dec(&_VTMS_transition_count)`; The fix is to replace this global counter with mark bits `_VTMS_transition_mark` distributed over all `JavaThread`'s. If these lines are commented out or replaced with the distributed thread-local marks the main performance overhead is gone: % er_print -viewmode expert -metrics ie.%totalcpu -csingle JvmtiVTMSTransitionDisabler::start_VTMS_transition test.2.er Attr. Total Name CPU sec. % ============== Callers 1.801 64.29 SharedRuntime::notify_jvmti_vthread_mount(oopDesc*, unsigned char, JavaThread*) 0.580 20.71 JvmtiVTMSTransitionDisabler::VTMS_vthread_unmount(_jobject*, bool) 0.420 15.00 JvmtiVTMSTransitionDisabler::VTMS_vthread_end(_jobject*) ============== Stack Fragment 0.630 22.50 JvmtiVTMSTransitionDisabler::start_VTMS_transition(_jobject*, bool) ============== Callees 1.931 68.93 java_lang_Thread::set_is_in_VTMS_transition(oopDesc*, bool) 0.220 7.86 __tls_get_addr 0.020 0.71 JNIHandles::resolve_external_guard(_jobject*) % er_print -viewmode expert -metrics ie.%totalcpu -csingle JvmtiVTMSTransitionDisabler::finish_VTMS_transition test.2.er Attr. Total Name CPU sec. 
% ============== Callers 1.661 39.15 JvmtiVTMSTransitionDisabler::VTMS_vthread_mount(_jobject*, bool) 1.351 31.84 JvmtiVTMSTransitionDisabler::VTMS_vthread_start(_jobject*) 1.231 29.01 SharedRuntime::notify_jvmti_vthread_unmount(oopDesc*, unsigned char, JavaThread*) ============== Stack Fragment 0.500 11.79 JvmtiVTMSTransitionDisabler::finish_VTMS_transition(_jobject*, bool) ============== Callees 2.972 70.05 java_lang_Thread::set_is_in_VTMS_transition(oopDesc*, bool) 0.350 8.25 JavaThread::set_is_in_VTMS_transition(bool) 0.340 8.02 __tls_get_addr 0.080 1.89 JNIHandles::resolve_external_guard(_jobject*) The rest of the overhead (~10% of total overhead) is taken by calls to the function `java_lang_Thread::set_is_in_VTMS_transition()`. The plan is to address this in a separate fix. But it is expected to be a little?bit more tricky. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18937#issuecomment-2075566469 From kevinw at openjdk.org Wed Apr 24 19:32:36 2024 From: kevinw at openjdk.org (Kevin Walls) Date: Wed, 24 Apr 2024 19:32:36 GMT Subject: RFR: 8314225: SIGSEGV in JavaThread::is_lock_owned [v2] In-Reply-To: References: <60li7VMNrwKitU5i3y7_dnQIpTHsJ594rt0f0d-VLiY=.ecb991be-e40d-4182-a82b-9eec718e2d09@github.com> Message-ID: On Fri, 26 Jan 2024 21:34:44 GMT, Kevin Walls wrote: >> JavaThread's _monitor_chunks member is temporary storage used by deoptimization. >> When other threads inspect it using JavaThread::monitor_chunks(), if it is non-null that means a deoptimization is in progress, and the value will be removed shortly. >> >> There are a few places where we attempt to follow the MonitorChunk*, but that would only be valid if deopt is in progress, and only safe if we could know the deopt is not going to complete. But that the deopt will complete, and will free the MonitorChunks and clear the value. So this is rare but there is a race and a risk of following a MonitorChunk* as it gets freed, and crashing. 
> > Kevin Walls has updated the pull request incrementally with one additional commit since the last revision: > > ThreadsListHandle required for Handshake Closing without integrating. Following up with a different approach, will raise separate PR. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17566#issuecomment-2075688528 From kevinw at openjdk.org Wed Apr 24 19:32:37 2024 From: kevinw at openjdk.org (Kevin Walls) Date: Wed, 24 Apr 2024 19:32:37 GMT Subject: Withdrawn: 8314225: SIGSEGV in JavaThread::is_lock_owned In-Reply-To: <60li7VMNrwKitU5i3y7_dnQIpTHsJ594rt0f0d-VLiY=.ecb991be-e40d-4182-a82b-9eec718e2d09@github.com> References: <60li7VMNrwKitU5i3y7_dnQIpTHsJ594rt0f0d-VLiY=.ecb991be-e40d-4182-a82b-9eec718e2d09@github.com> Message-ID: On Thu, 25 Jan 2024 11:04:03 GMT, Kevin Walls wrote: > JavaThread's _monitor_chunks member is temporary storage used by deoptimization. > When other threads inspect it using JavaThread::monitor_chunks(), if it is non-null that means a deoptimization is in progress, and the value will be removed shortly. > > There are a few places where we attempt to follow the MonitorChunk*, but that would only be valid if deopt is in progress, and only safe if we could know the deopt is not going to complete. But that the deopt will complete, and will free the MonitorChunks and clear the value. So this is rare but there is a race and a risk of following a MonitorChunk* as it gets freed, and crashing. This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/17566 From dlong at openjdk.org Wed Apr 24 20:11:30 2024 From: dlong at openjdk.org (Dean Long) Date: Wed, 24 Apr 2024 20:11:30 GMT Subject: RFR: 8327885: runtime/Unsafe/InternalErrorTest.java enters endless loop on Alpine aarch64 In-Reply-To: References: Message-ID: On Wed, 13 Mar 2024 07:34:11 GMT, Dmitry Cherepanov wrote: > [JDK-8322163](https://bugs.openjdk.org/browse/JDK-8322163) replaced memset with a for loop on Alpine. 
This fixed the test on Alpine x86_64 but it enters endless loop on Alpine aarch64. > > The loop causes SIGBUS to be generated and the signal handler continues to the next instruction. As gcc generates strb with auto-increment on aarch64, the increment will be skipped. > > The patch makes the counter volatile to prevent compilers from generating strb with auto-increment. With the patch, the test passes on Alpine aarch64. Why not detect `strb` with auto-increment in the signal handler and do the increment there? ------------- PR Comment: https://git.openjdk.org/jdk/pull/18262#issuecomment-2075753004 From asmehra at openjdk.org Wed Apr 24 20:46:52 2024 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Wed, 24 Apr 2024 20:46:52 GMT Subject: RFR: 8330275: Crash in XMark::follow_array Message-ID: This PR addresses the issue in ZGC where the number of address offset bits can go beyond the limit imposed by the encoding scheme in mark stack, thereby causing the encoding to fail. Encoding of partial array offset in mark stack requires that the address offset be no more than 44 bits. But the current mechanism to probe maximum address offset bits on aarch64, riscv and ppc platforms can return value larger that 44 bits. I have updated the generational mode to avoid subtracting 3 bits from the maximum address offset bit probed by the system, as the generational mode does not use multi-mapping. I have also updated the code to set MarkPartialArrayMinSizeShift dynamically depending on the number of address offset bits used. This would avoid running into such problem again if in future maximum address offset bits is increased beyond 44. For some reason (that I can't comprehend from the code) the existing implementation for probing the max addressable bit for ppc in non-generation ZGC is very different from other platforms and from generational mode as well. I have kept the existing implementation as is and just fixed it to ensure it does not return value greater than 44 bits. 
Testing: test/hotspot/jtreg/gc/z and test/hotspot/jtreg/gc/x ------------- Commit messages: - Fix ppc implementation - 8330275: Crash in XMark::follow_array Changes: https://git.openjdk.org/jdk/pull/18941/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18941&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8330275 Stats: 89 lines in 18 files changed: 58 ins; 8 del; 23 mod Patch: https://git.openjdk.org/jdk/pull/18941.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18941/head:pull/18941 PR: https://git.openjdk.org/jdk/pull/18941 From asmehra at openjdk.org Thu Apr 25 01:19:36 2024 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Thu, 25 Apr 2024 01:19:36 GMT Subject: RFR: 8330275: Crash in XMark::follow_array In-Reply-To: References: Message-ID: On Wed, 24 Apr 2024 20:22:52 GMT, Ashutosh Mehra wrote: > This PR addresses the issue in ZGC where the number of address offset bits can go beyond the limit imposed by the encoding scheme in mark stack, thereby causing the encoding to fail. > Encoding of partial array offset in mark stack requires that the address offset be no more than 44 bits. But the current mechanism to probe maximum address offset bits on aarch64, riscv and ppc platforms can return value larger that 44 bits. > > I have updated the generational mode to avoid subtracting 3 bits from the maximum address offset bit probed by the system, as the generational mode does not use multi-mapping. > > I have also updated the code to set MarkPartialArrayMinSizeShift dynamically depending on the number of address offset bits used. This would avoid running into such problem again if in future maximum address offset bits is increased beyond 44. > > For some reason (that I can't comprehend from the code) the existing implementation for probing the max addressable bit for ppc in non-generation ZGC is very different from other platforms and from generational mode as well. 
I have kept the existing implementation as is and just fixed it to ensure it does not return value greater than 44 bits. > > Testing: test/hotspot/jtreg/gc/z and test/hotspot/jtreg/gc/x on x86 I am currently trying to get access to aarch64 system and run the tests test/hotspot/jtreg/gc/z and test/hotspot/jtreg/gc/x. I would appreciate if some one can also test the ppc and riscv changes as I don't have access to such systems. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18941#issuecomment-2076130633 From stefank at openjdk.org Thu Apr 25 06:57:29 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 25 Apr 2024 06:57:29 GMT Subject: RFR: 8330275: Crash in XMark::follow_array In-Reply-To: References: Message-ID: On Wed, 24 Apr 2024 20:22:52 GMT, Ashutosh Mehra wrote: > This PR addresses the issue in ZGC where the number of address offset bits can go beyond the limit imposed by the encoding scheme in mark stack, thereby causing the encoding to fail. > Encoding of partial array offset in mark stack requires that the address offset be no more than 44 bits. But the current mechanism to probe maximum address offset bits on aarch64, riscv and ppc platforms can return value larger that 44 bits. This patch sets the maximum address offset bits to 44. > > I have updated the generational mode to avoid subtracting 3 bits from the maximum address offset bit probed by the system, as the generational mode does not use multi-mapping. > > I have also updated the code to set MarkPartialArrayMinSizeShift dynamically depending on the number of address offset bits used. This would avoid running into such problem again if in future maximum address offset bits is increased beyond 44. > > For some reason (that I can't comprehend from the code) the existing implementation for probing the max addressable bit for ppc in non-generation ZGC is very different from other platforms and from generational mode as well. 
I have kept the existing implementation as is and just fixed it to ensure it does not return value greater than 44 bits. > > Testing: test/hotspot/jtreg/gc/z and test/hotspot/jtreg/gc/x on x86 Hi @ashu-mehra, Thanks for fixing this issue. There's a number of changes style changes I would like to make to make sure that the code looks more inline with what the rest of the ZGC code looks like. But before we start with that I would like to request that we skip making the changes to marking stack code and limit the changes to only the probing code. Doing so will make it easier to get this fix reviewed and delivered. ------------- Changes requested by stefank (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18941#pullrequestreview-2021682780 From rehn at openjdk.org Thu Apr 25 07:24:36 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 25 Apr 2024 07:24:36 GMT Subject: RFR: 8326306: RISC-V: Re-structure MASM calls and jumps Message-ID: Hi, please consider. We have code that directly use the asm for call/jumps instead masm. Our masm have a bit odd naming, and we don't use 'proper' pseudoinstructions/mnemonics. Suggested by [riscv-asm-manual](https://github.com/riscv-non-isa/riscv-asm-manual/tree/master) j offset jal x0, offset Jump jal offset jal x1, offset Jump and link jr rs jalr x0, rs, 0 Jump register jalr rs jalr x1, rs, 0 Jump and link register ret jalr x0, x1, 0 Return from subroutine call offset auipc x1, offset[31:12]; jalr x1, x1, offset[11:0] Call far-away subroutine tail offset auipc x6, offset[31:12]; jalr x0, x6, offset[11:0] Tail call far-away subroutine But these can only be implemented like this if you have small enough application. The fallback of these is to use GOT (your C compiler should place a copy of GOT every 2G so it's always reachable). We don't have GOT, instead we materialize, so there is still differences between these and ours. This patch: - Tries to follow these suggested mappings as good we can. 
- Make sure all jumps/calls go through MASM. (so we get control and can easily change for sites using a certain calling convention) - To avoid confusion between MASM public/private methods and ASM methods and the mnemonics there are some renaming. E.g. the mnemonics jal means call offset, as we can't use that so there is no 'jal'. - I enabled c.j, but right now we never generate it. - As always the macro does no good and are legacy from when code base did not use templates. (also the x-macros screws up my IDE (vim+rtags)) I started down this path due to I have followup patch on top of this which removes trampoline in favor for load-n-jump. (WIP: https://github.com/robehn/jdk/compare/jal-fixes...robehn:jdk:load-n-link?expand=1) While looking into our calls it was a bit confusing, this helps. Done a couple of t1-3 slightly different version of this patch, and as part of the followup, no issues found. (VF2, qemu, LP4) Re-running tests, had some last minute changes. Thanks, Robbin ------------- Commit messages: - Missed a ws - JALR Changes: https://git.openjdk.org/jdk/pull/18942/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18942&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8326306 Stats: 362 lines in 9 files changed: 104 ins; 106 del; 152 mod Patch: https://git.openjdk.org/jdk/pull/18942.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18942/head:pull/18942 PR: https://git.openjdk.org/jdk/pull/18942 From aph at openjdk.org Thu Apr 25 07:52:29 2024 From: aph at openjdk.org (Andrew Haley) Date: Thu, 25 Apr 2024 07:52:29 GMT Subject: RFR: 8327885: runtime/Unsafe/InternalErrorTest.java enters endless loop on Alpine aarch64 In-Reply-To: References: Message-ID: On Wed, 24 Apr 2024 20:08:49 GMT, Dean Long wrote: > Why not detect `strb` with auto-increment in the signal handler and do the increment there? I was thinking that too, but isn't the root cause of this problem that we're calling out to C++ code at all? 
We're playing an endless game of whack-a-mole because we're trying to predict what a C++ compiler might do. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18262#issuecomment-2076581873 From dlong at openjdk.org Thu Apr 25 08:34:28 2024 From: dlong at openjdk.org (Dean Long) Date: Thu, 25 Apr 2024 08:34:28 GMT Subject: RFR: 8327885: runtime/Unsafe/InternalErrorTest.java enters endless loop on Alpine aarch64 In-Reply-To: References: Message-ID: On Wed, 13 Mar 2024 07:34:11 GMT, Dmitry Cherepanov wrote: > [JDK-8322163](https://bugs.openjdk.org/browse/JDK-8322163) replaced memset with a for loop on Alpine. This fixed the test on Alpine x86_64 but it enters endless loop on Alpine aarch64. > > The loop causes SIGBUS to be generated and the signal handler continues to the next instruction. As gcc generates strb with auto-increment on aarch64, the increment will be skipped. > > The patch makes the counter volatile to prevent compilers from generating strb with auto-increment. With the patch, the test passes on Alpine aarch64. Good point. Making the use of stub routines mandatory seems like the best solution, and removes a runtime check in Unsafe_SetMemory0. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18262#issuecomment-2076654055 From mdoerr at openjdk.org Thu Apr 25 09:50:45 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 25 Apr 2024 09:50:45 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v17] In-Reply-To: References: Message-ID: On Tue, 16 Apr 2024 09:42:22 GMT, Andrew Haley wrote: >> This PR is a redesign of subtype checking. >> >> The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. >> >> So what's changed, so that the old design should be replaced? >> >> Firstly, the computers of today aren't the computers of twenty years ago.
It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. >> >> The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces). Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. >> >> Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch. >> >> However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions. >> >> The solution >> ------------ >> >> We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. >> >> We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bimap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. 
If the bit is set, we have to do a little arithmetic and then consult the hash table. >> >> It works like th... > > Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: > > JDK-8180450: secondary_super_cache does not scale well I've filed https://bugs.openjdk.org/browse/JDK-8331117 for PPC64. @bulasevich, @fyang, @amitkumar: You may want to check if it makes sense for your platforms. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18309#issuecomment-2076793215 From jsjolen at openjdk.org Thu Apr 25 10:10:34 2024 From: jsjolen at openjdk.org (Johan Sjölen) Date: Thu, 25 Apr 2024 10:10:34 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v46] In-Reply-To: References: Message-ID: On Tue, 23 Apr 2024 12:48:12 GMT, Afshin Zafari wrote: >> Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision: >> >> - Remove faulty condition after removing merging >> - Add failing test case > > src/hotspot/share/nmt/nmtMemoryFileTracker.cpp line 47: > >> 45: const NativeCallStack& stack) { >> 46: NativeCallStackStorage::StackIndex sidx = _stack_storage.push(stack); >> 47: DeviceSpace::Metadata metadata(sidx, flag); > > Can the `Metadata` ctor get a `NativeCallStack` instead of an index? StackIndex is not used for the rest of the code. The goal of using `StackIndex` instead of `NativeCallStack` is to avoid having duplicate NCS:s throughout the data structure. That's why we store it in the hashtable instead. > src/hotspot/share/nmt/nmtMemoryFileTracker.cpp line 85: > >> 83: NMTUtil::scale_name(scale), >> 84: NMTUtil::flag_to_name(pval.out.metadata().flag)); >> 85: pval.out.metadata().stack_idx.stack().print_on(stream, 4); > > Why hard coded `4`? Is it the depth of stack? It's the indentation it should be printed with.
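The rationale given above for `StackIndex` (store each call stack once, hand out a small index) can be sketched in standalone C++. This is only a minimal illustration under stated assumptions: `StackStorage`, its `push`/`get` members and the use of `std::string` as a stand-in for a call stack are hypothetical and are not the `NativeCallStackStorage` implementation from the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical simplification: a "call stack" is modelled as a string and
// the index is a plain integer. Not the HotSpot NativeCallStackStorage.
class StackStorage {
 public:
  using StackIndex = std::size_t;

  // Returns the index of an equal stack if one is already stored,
  // otherwise stores the stack and returns its new index.
  StackIndex push(const std::string& stack) {
    auto it = _index_of.find(stack);
    if (it != _index_of.end()) {
      return it->second;  // duplicate: reuse the existing entry
    }
    StackIndex idx = _stacks.size();
    _stacks.push_back(stack);
    _index_of.emplace(stack, idx);
    return idx;
  }

  // Looks the stack up again from its index (throws if out of range).
  const std::string& get(StackIndex idx) const { return _stacks.at(idx); }

  // Number of distinct stacks actually stored.
  std::size_t unique_count() const { return _stacks.size(); }

 private:
  std::vector<std::string> _stacks;                       // one copy per distinct stack
  std::unordered_map<std::string, StackIndex> _index_of;  // stack -> index
};
```

With such a layout, a node's metadata only needs to carry the index, so many nodes that share one allocation site do not each hold their own copy of the stack.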
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1579218414 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1579220684 From jsjolen at openjdk.org Thu Apr 25 10:13:33 2024 From: jsjolen at openjdk.org (Johan Sjölen) Date: Thu, 25 Apr 2024 10:13:33 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v46] In-Reply-To: References: Message-ID: On Tue, 23 Apr 2024 13:18:18 GMT, Afshin Zafari wrote: >> Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision: >> >> - Remove faulty condition after removing merging >> - Add failing test case > > src/hotspot/share/nmt/memTracker.hpp line 184: > >> 182: } >> 183: >> 184: static inline void allocate_memory_in(MemoryFileTracker::MemoryFile* device, size_t offset, size_t size, > > invalid args: `nullptr` and `size == 0`. We should add tests for `size == 0`, but in general I don't think that's a case that we should disallow. This allows for more generic code where the caller doesn't have to special-case the size being 0. Checking for `nullptr` is a good idea in these outer functions.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1579224621 From jsjolen at openjdk.org Thu Apr 25 10:21:36 2024 From: jsjolen at openjdk.org (Johan Sjölen) Date: Thu, 25 Apr 2024 10:21:36 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v46] In-Reply-To: References: Message-ID: On Tue, 23 Apr 2024 13:03:45 GMT, Afshin Zafari wrote: >> Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision: >> >> - Remove faulty condition after removing merging >> - Add failing test case > > src/hotspot/share/nmt/nmtMemoryFileTracker.cpp line 85: > >> 83: NMTUtil::scale_name(scale), >> 84: NMTUtil::flag_to_name(pval.out.metadata().flag)); >> 85: pval.out.metadata().stack_idx.stack().print_on(stream, 4); > > Also, if `IntervalChange` has some wrappers we can write: > `pval.out_stack().print_on()` or `pval.out_type()`. Simplified with `pval.out.stack()`. > src/hotspot/share/nmt/nmtMemoryFileTracker.cpp line 124: > >> 122: size_t size, MEMFLAGS flag, >> 123: const NativeCallStack& stack) { >> 124: _tracker->allocate_memory(device, offset, size, flag, stack); > > `_tracker` can be `nullptr` if `initialize` is not called or if it failed to allocate. Maybe `!enabled()` is to be used here to check it. > This also applies for any further use of `_tracker` in the subsequent functions. We should leave these inner functions be and any validation logic should be placed within the `MemTracker` class. > src/hotspot/share/nmt/nmtMemoryFileTracker.cpp line 153: > >> 151: auto current = device->_summary.by_type(NMTUtil::index_to_flag(i)); >> 152: // PDT stores the memory as reserved but it's accounted as committed. >> 153: snap->commit_memory(current->reserved()); > > `VirtualMemorySnapshot` contains both `reserved` and `committed` amounts. If `current->reserved()` is used for `committed` amount, what is the `reserved` amount then?
The `reserved` amount is done through the regular MemTracker interface. To quote the PR: >As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding MEMFLAGS as committed memory. > src/hotspot/share/nmt/nmtTreap.hpp line 285: > >> 283: using TreapCHeap = Treap; >> 284: >> 285: #endif //SHARE_NMT_TREAP_HPP > > SHARE_NMT_NMTTREAP_HPP Fixed. > src/hotspot/share/nmt/vmatree.cpp line 36: > >> 34: MEMFLAGS flag_out() const { >> 35: return state.out.metadata().flag; >> 36: } > > can fit in 1 line. I prefer using a bit more space than having it in 1 line. > src/hotspot/share/nmt/vmatree.cpp line 42: > >> 40: // Motivating example: reserve(0,100, mtNMT); reserve(50,75, mtTest); >> 41: // This will require the 2nd call to know which region the second reserve 'smashes' a hole into for proper summary accounting. >> 42: // LEQ_A is figured out a bit later on, as we need to find it for other purposes anyway. > > Let's have a right margin for this part, at column 85 for example. I do prefer leaving this as it. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1579228774 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1579230473 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1579231827 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1579232385 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1579233292 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1579232767 From jsjolen at openjdk.org Thu Apr 25 10:32:36 2024 From: jsjolen at openjdk.org (Johan Sjölen) Date: Thu, 25 Apr 2024 10:32:36 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v46] In-Reply-To: References: Message-ID: On Tue, 23 Apr 2024 14:21:23 GMT, Afshin Zafari wrote: >> Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision: >> >> - Remove faulty condition after removing merging >> - Add failing test case > > src/hotspot/share/nmt/vmatree.cpp line 188: > >> 186: // LEQ_A - A - B - GEQ_B >> 187: auto& rescom = diff.flag[NMTUtil::flag_to_index(LEQ_A.flag_out())]; >> 188: if (LEQ_A.state.out.type() == StateType::Reserved) { > > Would be nice to have wrappers that allow us to write these as: > `LEQ_A.is_out_reserved()` > or > `LEQ_A.is_out_committed()` I prefer showing the explicit comparison being done, instead of hiding it under a function or method. Just to make it obvious that there's no extra logic being performed. I did shorten the code by adding `out()` and `in()` accessors. > src/hotspot/share/nmt/vmatree.cpp line 242: > >> 240: } >> 241: return diff; >> 242: } > > Would be nice if we can break this function into some smaller sub-functions. It is 200+ lines now and a little hard to track the logic. Thanks! Sure, I think there are a couple of cases which are actual functions (taking input, producing output, nothing else); those can be converted.
> src/hotspot/share/nmt/vmatree.hpp line 56: > >> 54: >> 55: // Each point has some stack and a flag associated with it. >> 56: struct Metadata { > > `State` and `Metadata` are attributes of a Node and not to be in VMATree. Sorry, could you expand on what you mean here? > src/hotspot/share/nmt/vmatree.hpp line 63: > >> 61: : stack_idx(), >> 62: flag(mtNone) { >> 63: } > > can fit in 1 line. Fixed > src/hotspot/share/nmt/vmatree.hpp line 70: > >> 68: static bool equals(const Metadata& a, const Metadata& b) { >> 69: return NativeCallStackStorage::StackIndex::equals(a.stack_idx, b.stack_idx) && >> 70: a.flag == b.flag; > > `a.flag == b.flag` can be left-hand of `&&` to be more efficient. Fixed > src/hotspot/share/nmt/vmatree.hpp line 135: > >> 133: SummaryDiff register_mapping(size_t A, size_t B, StateType state, Metadata& metadata); >> 134: >> 135: SummaryDiff reserve_mapping(size_t from, size_t sz, Metadata& metadata) { > > If we use `reserve_mapping` for `uncommit_memory`, we need to set a `StackIndex` and a `MEMFLAGS` to pass as a `Metadata`. If we use `mtNone` for example, all the uncommitted amount would be accounted for `mtNone`. > Would you please provide a `uncommit_mapping(address, size)` to handle these issues properly? Let's wait with this until we actually port over the `VirtualMemoryTracker` to use `VMATree`. > src/hotspot/share/nmt/vmatree.hpp line 145: > >> 143: SummaryDiff release_mapping(size_t from, size_t sz) { >> 144: Metadata empty; >> 145: return register_mapping(from, from + sz, StateType::Released, empty); > > `return register_mapping(from, from + sz, StateType::Released, Metadata{});` Can't be done, `register_mapping` takes a reference and not a value. 
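The last reply above (a `Metadata&` parameter cannot be handed `Metadata{}`) follows from the C++ rule that a non-const lvalue reference does not bind to a temporary. A minimal standalone sketch, assuming simplified stand-in types: the `Metadata` struct and the two functions below only mirror the names quoted in the review and are hypothetical, not the HotSpot definitions.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for illustration; not the VMATree::Metadata type.
struct Metadata {
  int stack_idx = 0;
  int flag = 0;
};

// Takes a non-const lvalue reference, like the signature quoted in the review.
// For the sketch it simply returns the size of the range [from, to).
std::size_t register_mapping(std::size_t from, std::size_t to, Metadata& metadata) {
  (void)metadata;  // unused in this sketch
  return to - from;
}

std::size_t release_mapping(std::size_t from, std::size_t sz) {
  // register_mapping(from, from + sz, Metadata{});
  //   ^ ill-formed: the temporary Metadata{} cannot bind to Metadata&.
  Metadata empty;  // a named object is an lvalue, so it binds
  return register_mapping(from, from + sz, empty);
}
```

If the parameter were declared `const Metadata&` instead, the temporary would bind and the one-line form suggested by the reviewer would compile; whether that signature change is appropriate for the PR is not something this sketch can decide.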
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1579237985 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1579239062 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1579246503 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1579242029 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1579242086 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1579243316 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1579244295 From jsjolen at openjdk.org Thu Apr 25 10:37:34 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Thu, 25 Apr 2024 10:37:34 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v49] In-Reply-To: <1cKD_eCdTb8AmNQwA9T4GFK0xu_CjJeABePgatn8xSY=.ec58f99d-bcd6-4e92-87a4-d1e49d33f4af@github.com> References: <1cKD_eCdTb8AmNQwA9T4GFK0xu_CjJeABePgatn8xSY=.ec58f99d-bcd6-4e92-87a4-d1e49d33f4af@github.com> Message-ID: On Tue, 23 Apr 2024 20:31:52 GMT, Gerard Ziemski wrote: >> Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: >> >> Move TreapNode into Treap > > src/hotspot/share/nmt/memTracker.hpp line 172: > >> 170: static inline MemoryFileTracker::MemoryFile* register_device(const char* descriptive_name) { >> 171: assert_post_init(); >> 172: if (!enabled()) return nullptr; > > Could we push `assert_post_init()` into `enabled()` ? That's a discussion that should take place in its own PR. 
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1579253198

From jsjolen at openjdk.org Thu Apr 25 10:37:36 2024
From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=)
Date: Thu, 25 Apr 2024 10:37:36 GMT
Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v46]
In-Reply-To:
References:
Message-ID:

On Tue, 23 Apr 2024 14:54:38 GMT, Afshin Zafari wrote:

>> Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision:
>>
>>  - Remove faulty condition after removing merging
>>  - Add failing test case
>
> src/hotspot/share/nmt/vmatree.hpp line 113:
>
>> 111: };
>> 112:
>> 113: using VTreap = TreapNode;
>
> Why `VTreap` and not `TreapNode`? What does the `V` alone say?

Just needed a short name, switched to `TreapNode`.

> src/hotspot/share/nmt/vmatree.hpp line 139:
>
>> 137: }
>> 138:
>> 139: SummaryDiff commit_mapping(size_t from, size_t sz, Metadata& metadata) {
>
> `size_t` or `address` for `from`?

I've been using `size_t` so far to indicate that we are within some file with some offset. I'm not sure that `address` is ever the right choice for `VMATree` as it is a `uchar*`, indicating that it's a directly dereferencable pointer. It's not a huge deal whether we choose `size_t`, `uintptr_t` or `address` for our internal representation IMHO, as long as the external interface (`MemTracker`) correctly indicates what kind of address is expected.

@tstuefe, @gerard-ziemski. This discussion is easily lost in the sea of comments, so pinging you directly here.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1579250998 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1579249971 From rkennke at openjdk.org Thu Apr 25 10:43:51 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 25 Apr 2024 10:43:51 GMT Subject: RFR: 8331098: [Aarch64] Fix crash in Arrays.equals() intrinsic with -CCP Message-ID: <_HzINQ0atD5BmBbIZ6A4A5y1wNvwsvrBxAiaz2Mk9rY=.43cde0ae-1179-4708-afa1-fda64039d722@github.com> The implementations of Arrays.equals() in macroAssembler_aarch64.cpp, MacroAssembler::arrays_equals() assumes that the start of arrays is 8-byte-aligned. Since [JDK-8139457](https://bugs.openjdk.org/browse/JDK-8139457) this is no longer the case, at least when running with -CompressedClassPointers (or Lilliput). The effect is that the loops may run over the array end, and if the array is at heap boundary, and that memory is unmapped, then it may crash. The proposed fix aims to always enter the main loop(s) with an aligned address: - When the array base is 8-byte-aligned (default, with +CCP), then compare the array lengths separately, then enter the main loop with the array base. - When the array base is not 8-byte-aligned (-CCP and Lilliput), then enter the loop with the address of the array-length (which is then 8-byte-aligned), and compare array lengths in the main loop, and elide the explicit array lengths comparison. 
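
The second case in the list above (start the loop at the 8-byte-aligned length field and let the loop compare the lengths) can be modelled in plain C++. This is only a sketch, not the AArch64 assembly from the patch. The offsets `kLengthOffset` and `kBaseOffset`, the helper `make_byte_array`, and the byte-wise tail handling are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical layout constants for illustration only (not read from HotSpot):
// the length field sits at an 8-byte-aligned offset and the elements start
// 4 bytes later, so the element base itself is only 4-byte-aligned.
constexpr std::size_t kLengthOffset = 16;
constexpr std::size_t kBaseOffset   = 20;

// Simulated object size for a byte array with 'len' elements, rounded up to 8.
constexpr std::size_t object_size(std::size_t len) {
  return (kBaseOffset + len + 7) & ~static_cast<std::size_t>(7);
}

// Builds a zero-filled simulated array object holding the given elements.
std::vector<unsigned char> make_byte_array(const std::vector<unsigned char>& elems) {
  std::vector<unsigned char> obj(object_size(elems.size()), 0);
  const std::uint32_t len = static_cast<std::uint32_t>(elems.size());
  std::memcpy(obj.data() + kLengthOffset, &len, sizeof(len));
  if (!elems.empty()) {
    std::memcpy(obj.data() + kBaseOffset, elems.data(), elems.size());
  }
  return obj;
}

// Compares two simulated objects 8 bytes at a time, starting at the aligned
// length field. The length is therefore compared together with the first
// elements, and no separate length check is needed. Leftover bytes are
// compared one by one, so nothing past the last element is read.
bool arrays_equal_aligned(const unsigned char* a, const unsigned char* b) {
  std::uint32_t len_a;
  std::memcpy(&len_a, a + kLengthOffset, sizeof(len_a));
  const std::size_t data_end = kBaseOffset + len_a;

  std::size_t off = kLengthOffset;
  for (; off + 8 <= data_end; off += 8) {
    std::uint64_t wa, wb;
    std::memcpy(&wa, a + off, sizeof(wa));  // memcpy keeps the load well-defined
    std::memcpy(&wb, b + off, sizeof(wb));
    if (wa != wb) return false;
  }
  for (; off < data_end; ++off) {
    if (a[off] != b[off]) return false;
  }
  return true;
}
```

In this model a length mismatch is always detected before any element beyond the shorter array is read, because the length bytes are the first bytes compared and even an empty simulated object is 24 bytes long.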
Testing: - [ ] tier1 (+CCP) - [ ] tier1 (-CCP) - [ ] tier2 (+CCP) - [ ] tier2 (-CCP) ------------- Commit messages: - 8331098: [Aarch64] Fix crash in Arrays.equals() intrinsic with -CCP Changes: https://git.openjdk.org/jdk/pull/18948/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18948&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8331098 Stats: 62 lines in 1 file changed: 46 ins; 0 del; 16 mod Patch: https://git.openjdk.org/jdk/pull/18948.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18948/head:pull/18948 PR: https://git.openjdk.org/jdk/pull/18948 From jsjolen at openjdk.org Thu Apr 25 10:48:53 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Thu, 25 Apr 2024 10:48:53 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v50] In-Reply-To: References: Message-ID: > Hi, > > This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix. 
>
> ## `MemoryFileTracker`
>
> The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is:
>
> ```c++
> static MemoryFile* make_device(const char* descriptive_name);
> static void free_device(MemoryFile* device);
>
> static void allocate_memory(MemoryFile* device, size_t offset, size_t size,
>                             MEMFLAGS flag, const NativeCallStack& stack);
> static void free_memory(MemoryFile* device, size_t offset, size_t size);
> ```
>
> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does:
>
> ```c++
> void ZNMT::reserve(zaddress_unsafe start, size_t size) {
>   MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap);
> }
> void ZNMT::commit(zoffset offset, size_t size) {
>   MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC);
> }
> void ZNMT::uncommit(zoffset offset, size_t size) {
>   MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size);
> }
>
> void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) {
>   // NMT doesn't track mappings at the moment.
> }
> void ZNMT::unmap(zaddress_unsafe addr, size_t size) {
>   // NMT doesn't track mappings at the moment.
> }
> ```
>
> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory.
>
> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are:
>
> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it.
Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this...

Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision:

  Style and simplifications per Afshin

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/18289/files
  - new: https://git.openjdk.org/jdk/pull/18289/files/c0ddb9ff..dc9741ec

Webrevs:
  - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=49
  - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=48-49

Stats: 83 lines in 13 files changed: 13 ins; 9 del; 61 mod
Patch: https://git.openjdk.org/jdk/pull/18289.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289

PR: https://git.openjdk.org/jdk/pull/18289

From jsjolen at openjdk.org Thu Apr 25 10:48:53 2024
From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=)
Date: Thu, 25 Apr 2024 10:48:53 GMT
Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v49]
In-Reply-To:
References:
Message-ID:

On Tue, 23 Apr 2024 13:44:59 GMT, Johan Sjölen wrote:

>> Hi,
>>
>> This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix.
>>
>> ## `MemoryFileTracker`
>>
>> The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is:
>>
>> ```c++
>> static MemoryFile* make_device(const char* descriptive_name);
>> static void free_device(MemoryFile* device);
>>
>> static void allocate_memory(MemoryFile* device, size_t offset, size_t size,
>>                             MEMFLAGS flag, const NativeCallStack& stack);
>> static void free_memory(MemoryFile* device, size_t offset, size_t size);
>> ```
>>
>> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does:
>>
>> ```c++
>> void ZNMT::reserve(zaddress_unsafe start, size_t size) {
>>   MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap);
>> }
>> void ZNMT::commit(zoffset offset, size_t size) {
>>   MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC);
>> }
>> void ZNMT::uncommit(zoffset offset, size_t size) {
>>   MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size);
>> }
>>
>> void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) {
>>   // NMT doesn't track mappings at the moment.
>> }
>> void ZNMT::unmap(zaddress_unsafe addr, size_t size) {
>>   // NMT doesn't track mappings at the moment.
>> }
>> ```
>>
>> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory.
>>
>> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are:
>>
>> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it.
Our Treap-based approach in this patch gives a performance bo...
>
> Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision:
>
>   Move TreapNode into Treap

First pass through working through the most of Afshin's comments. Thanks!

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18289#issuecomment-2076892790

From jsjolen at openjdk.org Thu Apr 25 10:48:53 2024
From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=)
Date: Thu, 25 Apr 2024 10:48:53 GMT
Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v49]
In-Reply-To: <1cKD_eCdTb8AmNQwA9T4GFK0xu_CjJeABePgatn8xSY=.ec58f99d-bcd6-4e92-87a4-d1e49d33f4af@github.com>
References: <1cKD_eCdTb8AmNQwA9T4GFK0xu_CjJeABePgatn8xSY=.ec58f99d-bcd6-4e92-87a4-d1e49d33f4af@github.com>
Message-ID:

On Tue, 23 Apr 2024 20:15:12 GMT, Gerard Ziemski wrote:

>> Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Move TreapNode into Treap
>
> src/hotspot/share/nmt/memReporter.cpp line 914:
>
>> 912: MemoryFileTracker::Instance::print_report_on(dev, this->output(), scale());
>> 913: }
>> 914: }
>
> Does `devices.length()` and `devices.at(i)` are really needed to be exposed?
>
> Can we consider pushing all this inside `MemoryFileTracker`?
>
> We make 4 different API calls to MemoryFileTracker here in such a small function.

Sure, I prefer having all of the respective reporting code close to the classes instead of in the `memReporter.cpp` file. It'd be more in the current style to move the `print_report_on` method from `MemoryFileTracker` into `MemDetailReporter` instead.
> src/hotspot/share/nmt/memTracker.cpp line 71: > >> 69: if (!MallocTracker::initialize(level) || >> 70: !VirtualMemoryTracker::initialize(level) || >> 71: !MemoryFileTracker::Instance::initialize(level) || > > Is there a way to hide the `instance` so that we could do: > > `!MemoryFileTracker::initialize(level) > ` > > not > > `!MemoryFileTracker::Instance::initialize(level)` > > just like the other calls here? The instance is not needed here and just an implementation detail. We could invert the relationship such that the outer class is the `AllStatic` class and the inner class is the allocatable class. I'll look into it at a later stage as it's all a big renaming. Personally, I don't mind the `::Instance` nomenclature to indicate that "this is the global instance that we're accessing". As long as we keep away from static, global singletons that we can't make many instances like VirtualMemoryTracker and MallocTracker are written, I'm a happy goose. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1579264403 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1579262351 From amitkumar at openjdk.org Thu Apr 25 11:32:41 2024 From: amitkumar at openjdk.org (Amit Kumar) Date: Thu, 25 Apr 2024 11:32:41 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v17] In-Reply-To: References: Message-ID: On Thu, 25 Apr 2024 09:48:02 GMT, Martin Doerr wrote: >> Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: >> >> JDK-8180450: secondary_super_cache does not scale well > > I've filed https://bugs.openjdk.org/browse/JDK-8331117 for PPC64. @bulasevich, @fyang, @amitkumar: You may want to check if it makes sense for your platforms. @TheRealMDoerr I guess you pinged wrong Amit ? 
JBS Issue for s390x: https://bugs.openjdk.org/browse/JDK-8331126 ------------- PR Comment: https://git.openjdk.org/jdk/pull/18309#issuecomment-2076966088 From rkennke at openjdk.org Thu Apr 25 11:50:52 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 25 Apr 2024 11:50:52 GMT Subject: RFR: 8331098: [Aarch64] Fix crash in Arrays.equals() intrinsic with -CCP [v2] In-Reply-To: <_HzINQ0atD5BmBbIZ6A4A5y1wNvwsvrBxAiaz2Mk9rY=.43cde0ae-1179-4708-afa1-fda64039d722@github.com> References: <_HzINQ0atD5BmBbIZ6A4A5y1wNvwsvrBxAiaz2Mk9rY=.43cde0ae-1179-4708-afa1-fda64039d722@github.com> Message-ID: <43G3lM1SyM9s-2uC4qO2gDRwPNoms82BK4NIocUTWvQ=.de9e2aa0-b277-426e-971c-33d7684fb1ae@github.com> > The implementations of Arrays.equals() in macroAssembler_aarch64.cpp, MacroAssembler::arrays_equals() assumes that the start of arrays is 8-byte-aligned. Since [JDK-8139457](https://bugs.openjdk.org/browse/JDK-8139457) this is no longer the case, at least when running with -CompressedClassPointers (or Lilliput). The effect is that the loops may run over the array end, and if the array is at heap boundary, and that memory is unmapped, then it may crash. > > The proposed fix aims to always enter the main loop(s) with an aligned address: > - When the array base is 8-byte-aligned (default, with +CCP), then compare the array lengths separately, then enter the main loop with the array base. > - When the array base is not 8-byte-aligned (-CCP and Lilliput), then enter the loop with the address of the array-length (which is then 8-byte-aligned), and compare array lengths in the main loop, and elide the explicit array lengths comparison. 
> > Testing: > - [ ] tier1 (+CCP) > - [ ] tier1 (-CCP) > - [ ] tier2 (+CCP) > - [ ] tier2 (-CCP) Roman Kennke has updated the pull request incrementally with two additional commits since the last revision: - Remove excess whitespace - Avoid loading cnt2 on paths that don't need it ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18948/files - new: https://git.openjdk.org/jdk/pull/18948/files/a59e11ee..68fe9ca2 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18948&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18948&range=00-01 Stats: 8 lines in 1 file changed: 4 ins; 1 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/18948.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18948/head:pull/18948 PR: https://git.openjdk.org/jdk/pull/18948 From fyang at openjdk.org Thu Apr 25 12:09:30 2024 From: fyang at openjdk.org (Fei Yang) Date: Thu, 25 Apr 2024 12:09:30 GMT Subject: RFR: 8326306: RISC-V: Re-structure MASM calls and jumps In-Reply-To: References: Message-ID: On Thu, 25 Apr 2024 07:17:07 GMT, Robbin Ehn wrote: > Hi, please consider. > > We have code that directly use the asm for call/jumps instead masm. > Our masm have a bit odd naming, and we don't use 'proper' pseudoinstructions/mnemonics. > Suggested by [riscv-asm-manual](https://github.com/riscv-non-isa/riscv-asm-manual/tree/master) > > j offset jal x0, offset Jump > jal offset jal x1, offset Jump and link > jr rs jalr x0, rs, 0 Jump register > jalr rs jalr x1, rs, 0 Jump and link register > ret jalr x0, x1, 0 Return from subroutine > call offset auipc x1, offset[31:12]; jalr x1, x1, offset[11:0] Call far-away subroutine > tail offset auipc x6, offset[31:12]; jalr x0, x6, offset[11:0] Tail call far-away subroutine > > But these can only be implemented like this if you have small enough application. > The fallback of these is to use GOT (your C compiler should place a copy of GOT every 2G so it's always reachable). 
> We don't have GOT, instead we materialize, so there is still differences between these and ours. > > This patch: > - Tries to follow these suggested mappings as good we can. > - Make sure all jumps/calls go through MASM. (so we get control and can easily change for sites using a certain calling convention) > - To avoid confusion between MASM public/private methods and ASM methods and the mnemonics there are some renaming. > E.g. the mnemonics jal means call offset, as we can't use that so there is no 'jal'. > - I enabled c.j, but right now we never generate it. > - As always the macro does no good and are legacy from when code base did not use templates. (also the x-macros screws up my IDE (vim+rtags)) > > I started down this path due to I have followup patch on top of this which removes trampoline in favor for load-n-jump. > (WIP: https://github.com/robehn/jdk/compare/jal-fixes...robehn:jdk:load-n-link?expand=1) > While looking into our calls it was a bit confusing, this helps. > > Done a couple of t1-3 slightly different version of this patch, and as part of the followup, no issues found. (VF2, qemu, LP4) > Re-running tests, had some last minute changes. > > Thanks, Robbin Thanks for this cleanup! I am having a look. src/hotspot/cpu/riscv/macroAssembler_riscv.hpp line 718: > 716: } > 717: > 718: bool is_32bit_offset_from_codeache(int64_t x) { Seems that this should be named `is_32bit_offset_from_codecache`? ------------- PR Review: https://git.openjdk.org/jdk/pull/18942#pullrequestreview-2022304872 PR Review Comment: https://git.openjdk.org/jdk/pull/18942#discussion_r1579353336 From rehn at openjdk.org Thu Apr 25 12:12:31 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 25 Apr 2024 12:12:31 GMT Subject: RFR: 8326306: RISC-V: Re-structure MASM calls and jumps In-Reply-To: References: Message-ID: On Thu, 25 Apr 2024 12:03:44 GMT, Fei Yang wrote: >> Hi, please consider. >> >> We have code that directly use the asm for call/jumps instead masm. 
>> Our masm have a bit odd naming, and we don't use 'proper' pseudoinstructions/mnemonics. >> Suggested by [riscv-asm-manual](https://github.com/riscv-non-isa/riscv-asm-manual/tree/master) >> >> j offset jal x0, offset Jump >> jal offset jal x1, offset Jump and link >> jr rs jalr x0, rs, 0 Jump register >> jalr rs jalr x1, rs, 0 Jump and link register >> ret jalr x0, x1, 0 Return from subroutine >> call offset auipc x1, offset[31:12]; jalr x1, x1, offset[11:0] Call far-away subroutine >> tail offset auipc x6, offset[31:12]; jalr x0, x6, offset[11:0] Tail call far-away subroutine >> >> But these can only be implemented like this if you have small enough application. >> The fallback of these is to use GOT (your C compiler should place a copy of GOT every 2G so it's always reachable). >> We don't have GOT, instead we materialize, so there is still differences between these and ours. >> >> This patch: >> - Tries to follow these suggested mappings as good we can. >> - Make sure all jumps/calls go through MASM. (so we get control and can easily change for sites using a certain calling convention) >> - To avoid confusion between MASM public/private methods and ASM methods and the mnemonics there are some renaming. >> E.g. the mnemonics jal means call offset, as we can't use that so there is no 'jal'. >> - I enabled c.j, but right now we never generate it. >> - As always the macro does no good and are legacy from when code base did not use templates. (also the x-macros screws up my IDE (vim+rtags)) >> >> I started down this path due to I have followup patch on top of this which removes trampoline in favor for load-n-jump. >> (WIP: https://github.com/robehn/jdk/compare/jal-fixes...robehn:jdk:load-n-link?expand=1) >> While looking into our calls it was a bit confusing, this helps. >> >> Done a couple of t1-3 slightly different version of this patch, and as part of the followup, no issues found. (VF2, qemu, LP4) >> Re-running tests, had some last minute changes. 
>> >> Thanks, Robbin > > src/hotspot/cpu/riscv/macroAssembler_riscv.hpp line 718: > >> 716: } >> 717: >> 718: bool is_32bit_offset_from_codeache(int64_t x) { > > Seems that this should be named `is_32bit_offset_from_codecache`? You are correct ! :) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18942#discussion_r1579360092 From aph at openjdk.org Thu Apr 25 12:14:44 2024 From: aph at openjdk.org (Andrew Haley) Date: Thu, 25 Apr 2024 12:14:44 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v17] In-Reply-To: References: Message-ID: <-OxTgDUskBPtQkMK3Jx_d-d9Fw-zS-6NqLQ_dNpO4w4=.6f4fc461-a596-4c76-bdaa-f24a5c2f6387@github.com> On Thu, 25 Apr 2024 09:48:02 GMT, Martin Doerr wrote: > I've filed https://bugs.openjdk.org/browse/JDK-8331117 for PPC64. @bulasevich, @fyang, @amitkumar: You may want to check if it makes sense for your platforms. I think it makes sense everywhere. It's even a win on machines without POPCOUNT, which surprised me. Once you have hashed lookups the secondary supers cache doesn't help at all. I want to delete the secondary supers cache soon, because it's an additional unnecessary step. @iwanowww did some measurements (DaCapo, Renaissance, SPECjbb2005, SPECjvm2008 on linux-x64/macos-aarch64), and he saw no significant regressions without secondary supers cache. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18309#issuecomment-2077034709 From mdoerr at openjdk.org Thu Apr 25 12:20:42 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 25 Apr 2024 12:20:42 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v16] In-Reply-To: References: <2ReTrE0inVkfcPNrq6JVrGRkoFuOZsLK6Ir0ZAnd_Kk=.13903f65-b747-4c38-9572-91e132ebd424@github.com> Message-ID: On Tue, 16 Apr 2024 19:03:29 GMT, Vladimir Ivanov wrote: >>> Performance testing results look fine. >> >> I wonder, could you do me a little favour? Please run the performance tests with `-XX:-UseSecondarySuperCache`. Thanks. 
> >> I wonder, could you do me a little favour? Please run the performance tests with -XX:-UseSecondarySuperCache. Thanks. > > Sure, I'll let you know once the testing is over. > I want to delete the secondary supers cache soon, because it's an additional unnecessary step. @iwanowww did some measurements (DaCapo, Renaissance, SPECjbb2005, SPECjvm2008 on linux-x64/macos-aarch64), and he saw no significant regressions without secondary supers cache. Thanks for the information. So, we should probably wait for that. Is there a JBS issue already? ------------- PR Comment: https://git.openjdk.org/jdk/pull/18309#issuecomment-2077045928 From aph at openjdk.org Thu Apr 25 12:23:43 2024 From: aph at openjdk.org (Andrew Haley) Date: Thu, 25 Apr 2024 12:23:43 GMT Subject: RFR: 8180450: secondary_super_cache does not scale well [v16] In-Reply-To: References: <2ReTrE0inVkfcPNrq6JVrGRkoFuOZsLK6Ir0ZAnd_Kk=.13903f65-b747-4c38-9572-91e132ebd424@github.com> Message-ID: <7WksdCpwwREEWynQcsAsw07mJfmk2xL5XiXghFaHSKs=.328aa608-0d76-44a8-ac1f-9377631c04d6@github.com> On Tue, 16 Apr 2024 19:03:29 GMT, Vladimir Ivanov wrote: >>> Performance testing results look fine. >> >> I wonder, could you do me a little favour? Please run the performance tests with `-XX:-UseSecondarySuperCache`. Thanks. > >> I wonder, could you do me a little favour? Please run the performance tests with -XX:-UseSecondarySuperCache. Thanks. > > Sure, I'll let you know once the testing is over. > > I want to delete the secondary supers cache soon, because it's an additional unnecessary step. @iwanowww did some measurements (DaCapo, Renaissance, SPECjbb2005, SPECjvm2008 on linux-x64/macos-aarch64), and he saw no significant regressions without secondary supers cache. > > Thanks for the information. So, we should probably wait for that. Is there a JBS issue already? No, don't wait! Every port will benefit from this change, now. 
-------------

PR Comment: https://git.openjdk.org/jdk/pull/18309#issuecomment-2077051699

From tstuefe at redhat.com Thu Apr 25 13:55:14 2024
From: tstuefe at redhat.com (Thomas Stuefe)
Date: Thu, 25 Apr 2024 15:55:14 +0200
Subject: Result: New HotSpot Group Member: Andrew Dinn
Message-ID:

The vote for Andrew Dinn [1] is now closed.

Yes: 18
Veto: 0
Abstain: 0

According to the Bylaws definition of Lazy Consensus, this is sufficient to approve the nomination.

Thomas Stuefe

[1] https://mail.openjdk.org/pipermail/hotspot-dev/2024-April/086877.html

From asmehra at openjdk.org Thu Apr 25 14:25:11 2024
From: asmehra at openjdk.org (Ashutosh Mehra)
Date: Thu, 25 Apr 2024 14:25:11 GMT
Subject: RFR: 8330275: Crash in XMark::follow_array [v2]
In-Reply-To:
References:
Message-ID: <6UPX-xX4WyrwsZ5zXst9fL-f-VNEkAy7d8xZuEgWHNU=.3cf2d14f-669a-4a43-847b-525024a61bc7@github.com>

> This PR addresses the issue in ZGC where the number of address offset bits can go beyond the limit imposed by the encoding scheme in mark stack, thereby causing the encoding to fail.
> Encoding of partial array offset in mark stack requires that the address offset be no more than 44 bits. But the current mechanism to probe maximum address offset bits on aarch64, riscv and ppc platforms can return value larger that 44 bits. This patch sets the maximum address offset bits to 44.
>
> I have updated the generational mode to avoid subtracting 3 bits from the maximum address offset bit probed by the system, as the generational mode does not use multi-mapping.
>
> I have also updated the code to set MarkPartialArrayMinSizeShift dynamically depending on the number of address offset bits used. This would avoid running into such problem again if in future maximum address offset bits is increased beyond 44.
> > For some reason (that I can't comprehend from the code) the existing implementation for probing the max addressable bit for ppc in non-generation ZGC is very different from other platforms and from generational mode as well. I have kept the existing implementation as is and just fixed it to ensure it does not return value greater than 44 bits. > > Testing: test/hotspot/jtreg/gc/z and test/hotspot/jtreg/gc/x on x86 Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: Remove changes to mark stack code Signed-off-by: Ashutosh Mehra ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18941/files - new: https://git.openjdk.org/jdk/pull/18941/files/cb5e457e..27418b6b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18941&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18941&range=00-01 Stats: 51 lines in 9 files changed: 0 ins; 42 del; 9 mod Patch: https://git.openjdk.org/jdk/pull/18941.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18941/head:pull/18941 PR: https://git.openjdk.org/jdk/pull/18941 From asmehra at openjdk.org Thu Apr 25 14:28:47 2024 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Thu, 25 Apr 2024 14:28:47 GMT Subject: RFR: 8330275: Crash in XMark::follow_array [v3] In-Reply-To: References: Message-ID: > This PR addresses the issue in ZGC where the number of address offset bits can go beyond the limit imposed by the encoding scheme in mark stack, thereby causing the encoding to fail. > Encoding of partial array offset in mark stack requires that the address offset be no more than 44 bits. But the current mechanism to probe maximum address offset bits on aarch64, riscv and ppc platforms can return value larger that 44 bits. This patch sets the maximum address offset bits to 44. 
> > I have updated the generational mode to avoid subtracting 3 bits from the maximum address offset bit probed by the system, as the generational mode does not use multi-mapping. > > I have also updated the code to set MarkPartialArrayMinSizeShift dynamically depending on the number of address offset bits used. This would avoid running into such problem again if in future maximum address offset bits is increased beyond 44. > > For some reason (that I can't comprehend from the code) the existing implementation for probing the max addressable bit for ppc in non-generation ZGC is very different from other platforms and from generational mode as well. I have kept the existing implementation as is and just fixed it to ensure it does not return value greater than 44 bits. > > Testing: test/hotspot/jtreg/gc/z and test/hotspot/jtreg/gc/x on x86 Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: Fix typos Signed-off-by: Ashutosh Mehra ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18941/files - new: https://git.openjdk.org/jdk/pull/18941/files/27418b6b..4888ce19 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18941&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18941&range=01-02 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/18941.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18941/head:pull/18941 PR: https://git.openjdk.org/jdk/pull/18941 From asmehra at openjdk.org Thu Apr 25 14:32:37 2024 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Thu, 25 Apr 2024 14:32:37 GMT Subject: RFR: 8330275: Crash in XMark::follow_array [v3] In-Reply-To: References: Message-ID: On Thu, 25 Apr 2024 06:54:24 GMT, Stefan Karlsson wrote: >> Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix typos >> >> Signed-off-by: Ashutosh Mehra > > Hi @ashu-mehra, > > Thanks for 
fixing this issue.
>
> There's a number of changes style changes I would like to make to make sure that the code looks more inline with what the rest of the ZGC code looks like. But before we start with that I would like to request that we skip making the changes to marking stack code and limit the changes to only the probing code. Doing so will make it easier to get this fix reviewed and delivered.

@stefank I am trying to understand the reason behind your suggestion to remove the changes in marking stack code. Are they not correct or is it that they don't belong to this PR? Anyway I have removed them from this PR.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18941#issuecomment-2077346870

From jesper.wilhelmsson at oracle.com Thu Apr 25 14:48:35 2024
From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson)
Date: Thu, 25 Apr 2024 14:48:35 +0000
Subject: Result: New HotSpot Group Member: Afshin Zafari
Message-ID: <424E4F93-A2DB-4369-900D-3C8C59A7C505@oracle.com>

The vote for Afshin Zafari [1] is now closed.

Yes: 9
Veto: 0
Abstain: 0

According to the Bylaws definition of Lazy Consensus, this is sufficient to approve the nomination.

/Jesper

[1] https://mail.openjdk.org/pipermail/hotspot-dev/2024-April/086783.html

From jesper.wilhelmsson at oracle.com Thu Apr 25 14:48:41 2024
From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson)
Date: Thu, 25 Apr 2024 14:48:41 +0000
Subject: Result: New HotSpot Group Member: Fredrik Bredberg
Message-ID:

The vote for Fredrik Bredberg [1] is now closed.

Yes: 9
Veto: 0
Abstain: 0

According to the Bylaws definition of Lazy Consensus, this is sufficient to approve the nomination.
/Jesper [1] https://mail.openjdk.org/pipermail/hotspot-dev/2024-April/086784.html From duke at openjdk.org Thu Apr 25 15:05:18 2024 From: duke at openjdk.org (Volodymyr Paprotski) Date: Thu, 25 Apr 2024 15:05:18 GMT Subject: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v5] In-Reply-To: References: Message-ID: <5HhjM9q2E4xtZVQitu5UGkNgdyBbqZbQOwnvJIUIr2U=.778a6d07-fe5d-4c6b-8bdf-353342df5904@github.com>
> Performance. Before:
>
> Benchmark                    (algorithm)      (dataSize)  (keyLength)  (provider)  Mode   Cnt  Score     Error      Units
> SignatureBench.ECDSA.sign    SHA256withECDSA  1024        256                      thrpt  3    6443.934  ± 6.491    ops/s
> SignatureBench.ECDSA.sign    SHA256withECDSA  16384       256                      thrpt  3    6152.979  ± 4.954    ops/s
> SignatureBench.ECDSA.verify  SHA256withECDSA  1024        256                      thrpt  3    1895.410  ± 36.979   ops/s
> SignatureBench.ECDSA.verify  SHA256withECDSA  16384       256                      thrpt  3    1878.955  ± 45.487   ops/s
>
> Benchmark                                            (algorithm)  (keyLength)  (kpgAlgorithm)  (provider)  Mode   Cnt  Score     Error     Units
> o.o.b.j.c.full.KeyAgreementBench.EC.generateSecret   ECDH         256          EC                          thrpt  3    1357.810  ± 26.584  ops/s
> o.o.b.j.c.small.KeyAgreementBench.EC.generateSecret  ECDH         256          EC                          thrpt  3    1352.119  ± 23.547  ops/s
>
> Benchmark                          (isMontBench)  Mode   Cnt  Score     Error     Units
> PolynomialP256Bench.benchMultiply  false          thrpt  3    1746.126  ± 10.970  ops/s
>
> Performance, no intrinsic:
>
> Benchmark                    (algorithm)      (dataSize)  (keyLength)  (provider)  Mode   Cnt  Score     Error      Units
> SignatureBench.ECDSA.sign    SHA256withECDSA  1024        256                      thrpt  3    6529.839  ± 42.420   ops/s
> SignatureBench.ECDSA.sign    SHA256withECDSA  16384       256                      thrpt  3    6199.747  ± 133.566  ops/s
> SignatureBench.ECDSA.verify  SHA256withECDSA  1024        256                      thrpt  3    1973.676  ± 54.071   ops/s
> SignatureBench.ECDSA.verify  SHA256withECDSA  16384       256                      thrpt  3    1932.127  ± 35.920   ops/s
>
> Benchmark                                            (algorithm)  (keyLength)  (kpgAlgorithm)  (provider)  Mode   Cnt  Score     Error     Units
> o.o.b.j.c.full.KeyAgreementBench.EC.generateSecret   ECDH         256          EC                          thrpt  3    1355.788  ± 29.858  ops/s
> o.o.b.j.c.small.KeyAgreementBench.EC.generateSecret  ECDH         256          EC                          thrpt  3    1346.523  ± 28.722  ops/s
>
> Benchmark                          (isMontBench)  Mode   Cnt  Score     Error     Units
> PolynomialP256Bench.benchMultiply  true           thrpt  3    1919.574  ± 10.591  ops/s
>
> Performance, **with intrinsics*...
Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision: whitespace ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18583/files - new: https://git.openjdk.org/jdk/pull/18583/files/c93a71f0..a1984501 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18583&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18583&range=03-04 Stats: 3 lines in 2 files changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/18583.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18583/head:pull/18583 PR: https://git.openjdk.org/jdk/pull/18583 From aph at openjdk.org Thu Apr 25 17:54:36 2024 From: aph at openjdk.org (Andrew Haley) Date: Thu, 25 Apr 2024 17:54:36 GMT Subject: RFR: 8331098: [Aarch64] Fix crash in Arrays.equals() intrinsic with -CCP [v2] In-Reply-To: <43G3lM1SyM9s-2uC4qO2gDRwPNoms82BK4NIocUTWvQ=.de9e2aa0-b277-426e-971c-33d7684fb1ae@github.com> References: <_HzINQ0atD5BmBbIZ6A4A5y1wNvwsvrBxAiaz2Mk9rY=.43cde0ae-1179-4708-afa1-fda64039d722@github.com> <43G3lM1SyM9s-2uC4qO2gDRwPNoms82BK4NIocUTWvQ=.de9e2aa0-b277-426e-971c-33d7684fb1ae@github.com> Message-ID: On Thu, 25 Apr 2024 11:50:52 GMT, Roman Kennke wrote: >> The implementations of Arrays.equals() in macroAssembler_aarch64.cpp, MacroAssembler::arrays_equals() assumes that the start of arrays is 8-byte-aligned.
Since [JDK-8139457](https://bugs.openjdk.org/browse/JDK-8139457) this is no longer the case, at least when running with -CompressedClassPointers (or Lilliput). The effect is that the loops may run over the array end, and if the array is at heap boundary, and that memory is unmapped, then it may crash. >> >> The proposed fix aims to always enter the main loop(s) with an aligned address: >> - When the array base is 8-byte-aligned (default, with +CCP), then compare the array lengths separately, then enter the main loop with the array base. >> - When the array base is not 8-byte-aligned (-CCP and Lilliput), then enter the loop with the address of the array-length (which is then 8-byte-aligned), and compare array lengths in the main loop, and elide the explicit array lengths comparison. >> >> Testing: >> - [ ] tier1 (+CCP) >> - [ ] tier1 (-CCP) >> - [ ] tier2 (+CCP) >> - [ ] tier2 (-CCP) > > Roman Kennke has updated the pull request incrementally with two additional commits since the last revision: > > - Remove excess whitespace > - Avoid loading cnt2 on paths that don't need it src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 5730: > 5728: // main loop and don't need to compare it > 5729: // explicitely ahead of the loop. > 5730: cmp(cnt2, cnt1); Why do we need this? Surely if the base isn't required to be aligned, then it might be aligned. So why can't we use the not-aligned version in all cases? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18948#discussion_r1579897359 From rkennke at openjdk.org Thu Apr 25 18:19:42 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 25 Apr 2024 18:19:42 GMT Subject: RFR: 8331098: [Aarch64] Fix crash in Arrays.equals() intrinsic with -CCP [v2] In-Reply-To: References: <_HzINQ0atD5BmBbIZ6A4A5y1wNvwsvrBxAiaz2Mk9rY=.43cde0ae-1179-4708-afa1-fda64039d722@github.com> <43G3lM1SyM9s-2uC4qO2gDRwPNoms82BK4NIocUTWvQ=.de9e2aa0-b277-426e-971c-33d7684fb1ae@github.com> Message-ID: On Thu, 25 Apr 2024 17:52:10 GMT, Andrew Haley wrote: >> Roman Kennke has updated the pull request incrementally with two additional commits since the last revision: >> >> - Remove excess whitespace >> - Avoid loading cnt2 on paths that don't need it > > src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 5730: > >> 5728: // main loop and don't need to compare it >> 5729: // explicitely ahead of the loop. >> 5730: cmp(cnt2, cnt1); > > Why do we need this? Surely if the base isn't required to be aligned, then it might be aligned. So why can't we use the not-aligned version in all cases? The current implementation assumes that the base (first array element) is 8-byte-aligned. In that case, the array length is *not* 8-byte-aligned (8 bytes mark-word, 4 bytes compressed-Klass*, 4 bytes length), which is why we compare the length ahead of the main loop. With uncompressed Klass* (8 bytes mark-word, 8 bytes Klass*, 4 bytes length, ...) or Lilliput (8 bytes mark-word/Klass*, 4 bytes length, ...), the base is only 4-byte-aligned, but we can start at the length and still enter the main loop at an 8-byte-aligned address. As a bonus, that also compares the lengths, so we can save a few instructions/branches.
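To make the offsets in the explanation above concrete, here is a minimal sketch that simply adds up the header field sizes quoted in the message. It is not HotSpot code: `ArrayHeaderLayout`, the three layout constants and `main_loop_start` are hypothetical names, and the sketch assumes that objects start at 8-byte-aligned addresses and that the elements are narrower than 8 bytes, so the first element sits directly after the length field.

```cpp
#include <cstddef>

// Size of the array length field, as quoted in the message (4 bytes).
constexpr std::size_t kLengthBytes = 4;

// Hypothetical model of an array header; not HotSpot's real arrayOopDesc.
struct ArrayHeaderLayout {
  std::size_t mark_bytes;   // mark word (with Lilliput it also carries the class)
  std::size_t klass_bytes;  // separate Klass* field; 0 when folded into the mark word

  constexpr std::size_t length_offset() const { return mark_bytes + klass_bytes; }
  constexpr std::size_t base_offset() const { return length_offset() + kLengthBytes; }
};

constexpr bool is_aligned8(std::size_t offset) { return offset % 8 == 0; }

// The three layouts named in the message.
constexpr ArrayHeaderLayout kCompressedKlass{8, 4};    // +CCP
constexpr ArrayHeaderLayout kUncompressedKlass{8, 8};  // -CCP
constexpr ArrayHeaderLayout kLilliput{8, 0};           // Lilliput

// Offset at which an 8-bytes-at-a-time loop could start: the base when it is
// 8-byte-aligned, otherwise the length field (which then is).
constexpr std::size_t main_loop_start(const ArrayHeaderLayout& layout) {
  return is_aligned8(layout.base_offset()) ? layout.base_offset()
                                           : layout.length_offset();
}

static_assert(is_aligned8(kCompressedKlass.base_offset()), "+CCP: base is aligned");
static_assert(!is_aligned8(kCompressedKlass.length_offset()), "+CCP: length is not");
static_assert(is_aligned8(kUncompressedKlass.length_offset()), "-CCP: length is aligned");
static_assert(is_aligned8(kLilliput.length_offset()), "Lilliput: length is aligned");
```

With these numbers the length field sits at offset 12 in the +CCP layout and at offsets 16 and 8 in the other two, which matches the point that the latter two can fold the length check into the first 8-byte comparison.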
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18948#discussion_r1579936795 From amenkov at openjdk.org Thu Apr 25 21:04:35 2024 From: amenkov at openjdk.org (Alex Menkov) Date: Thu, 25 Apr 2024 21:04:35 GMT Subject: RFR: 8330969: scalability issue with loaded JVMTI agent In-Reply-To: References: Message-ID: On Wed, 24 Apr 2024 16:04:30 GMT, Serguei Spitsyn wrote: > This is a fix of the following JVMTI scalability issue. A closed benchmark with millions of virtual threads shows 3X-4X overhead when a JVMTI agent has been loaded. For instance, this is observable when an app is executed under control of the Oracle Studio `collect` utility. > For performance analysis, experiments and numbers, please, see the comment below this description. > > The fix is to replace the global counter `_VTMS_transition_count` with the mark bit `_VTMS_transition_mark` in each `JavaThread`'. > > Testing: > - Tested with mach5 tiers 1-6 src/hotspot/share/prims/jvmtiEnvBase.cpp line 1638: > 1636: // Iterates over all JavaThread's, counts VTMS transitions and restores > 1637: // jt->jvmti_thread_state() and jt->jvmti_vthread() for VTMS transition protocol. > 1638: void count_transitions_and_correct_jvmti_thread_states() { The method doesn't count anything anymore. Rename it to `correct_jvmti_thread_states()`? Comment needs to be updated too. 
src/hotspot/share/prims/jvmtiThreadState.cpp line 501: > 499: oop vt = JNIHandles::resolve_external_guard(vthread); > 500: java_lang_Thread::set_is_in_VTMS_transition(vt, false); > 501: assert(thread->VTMS_transition_mark(), "sanity ed_heck"); Suggestion: assert(thread->VTMS_transition_mark(), "sanity check"); src/hotspot/share/runtime/javaThread.hpp line 668: > 666: void toggle_is_disable_suspend() { _is_disable_suspend = !_is_disable_suspend; }; > 667: > 668: bool VTMS_transition_mark() { return Atomic::load(&_VTMS_transition_mark); } Suggestion: bool VTMS_transition_mark() const { return Atomic::load(&_VTMS_transition_mark); } ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18937#discussion_r1580096556 PR Review Comment: https://git.openjdk.org/jdk/pull/18937#discussion_r1580101609 PR Review Comment: https://git.openjdk.org/jdk/pull/18937#discussion_r1580108674 From stefank at openjdk.org Fri Apr 26 06:08:32 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 26 Apr 2024 06:08:32 GMT Subject: RFR: 8330275: Crash in XMark::follow_array [v3] In-Reply-To: References: Message-ID: <9_-aN2njKgLHXhbLvQ7pNCh_MarXelw7Ca1sjE2uN0E=.87f785c7-7908-4ff8-a467-df97751a7ce8@github.com> On Thu, 25 Apr 2024 06:54:24 GMT, Stefan Karlsson wrote: >> Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix typos >> >> Signed-off-by: Ashutosh Mehra > > Hi @ashu-mehra, > > Thanks for fixing this issue. > > There's a number of changes style changes I would like to make to make sure that the code looks more inline with what the rest of the ZGC code looks like. But before we start with that I would like to request that we skip making the changes to marking stack code and limit the changes to only the probing code. Doing so will make it easier to get this fix reviewed and delivered. > @stefank I am trying to understand the reason behind your suggestion to remove the changes in marking stack code. 
Are they not correct or is it that they don't belong to this PR? Anyway I have removed them from this PR. To me, it was not bleeding obvious that they were the right thing to do, and given other changes that don't follow the grown ZGC coding style, I wanted to suggest a way forward for you to get this bug fixed, with less resistance from us ZGC developers/maintainers. That was the reasoning. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18941#issuecomment-2078692598 From fyang at openjdk.org Fri Apr 26 06:42:31 2024 From: fyang at openjdk.org (Fei Yang) Date: Fri, 26 Apr 2024 06:42:31 GMT Subject: RFR: 8326306: RISC-V: Re-structure MASM calls and jumps In-Reply-To: References: Message-ID: On Thu, 25 Apr 2024 07:17:07 GMT, Robbin Ehn wrote:
> Hi, please consider.
>
> We have code that directly use the asm for call/jumps instead masm.
> Our masm have a bit odd naming, and we don't use 'proper' pseudoinstructions/mnemonics.
> Suggested by [riscv-asm-manual](https://github.com/riscv-non-isa/riscv-asm-manual/tree/master)
>
> j offset       jal x0, offset                                        Jump
> jal offset     jal x1, offset                                        Jump and link
> jr rs          jalr x0, rs, 0                                        Jump register
> jalr rs        jalr x1, rs, 0                                        Jump and link register
> ret            jalr x0, x1, 0                                        Return from subroutine
> call offset    auipc x1, offset[31:12]; jalr x1, x1, offset[11:0]    Call far-away subroutine
> tail offset    auipc x6, offset[31:12]; jalr x0, x6, offset[11:0]    Tail call far-away subroutine
>
> But these can only be implemented like this if you have small enough application.
> The fallback of these is to use GOT (your C compiler should place a copy of GOT every 2G so it's always reachable).
> We don't have GOT, instead we materialize, so there is still differences between these and ours.
>
> This patch:
> - Tries to follow these suggested mappings as good we can.
> - Make sure all jumps/calls go through MASM. (so we get control and can easily change for sites using a certain calling convention)
> - To avoid confusion between MASM public/private methods and ASM methods and the mnemonics there are some renaming.
> E.g. the mnemonics jal means call offset, as we can't use that so there is no 'jal'.
> - I enabled c.j, but right now we never generate it.
> - As always the macro does no good and are legacy from when code base did not use templates. (also the x-macros screws up my IDE (vim+rtags))
>
> I started down this path due to I have followup patch on top of this which removes trampoline in favor for load-n-jump.
> (WIP: https://github.com/robehn/jdk/compare/jal-fixes...robehn:jdk:load-n-link?expand=1)
> While looking into our calls it was a bit confusing, this helps.
>
> Done a couple of t1-3 slightly different version of this patch, and as part of the followup, no issues found. (VF2, qemu, LP4)
> Re-running tests, had some last minute changes.
>
> Thanks, Robbin
src/hotspot/cpu/riscv/gc/shenandoah/shenandoahBarrierSetAssembler_riscv.cpp line 303: > 301: target = CAST_FROM_FN_PTR(address, ShenandoahRuntime::load_reference_barrier_weak); > 302: } > 303: __ rt_call(target); Question: does it make sense to replace `call` with `rt_call` when we are invoking VM code (C++ code)? Here is the difference I see between the two: `rt_call` emits code (`auipc` or `movptr`) depending on whether the destination can be found in the code cache, while `call` depends on `is_32bit_offset_from_codecache`. So it's still possible for `call` to emit the short `auipc` code if the target is not far away, even when the target is not in the code cache, as in this case. But `rt_call` will always emit a long `movptr` sequence for this case, which I think is not good for performance.
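For readers following the `auipc` versus `movptr` discussion: the short form only works when the pc-relative distance fits into the 20-bit `auipc` immediate plus the 12-bit signed `jalr` immediate. The sketch below shows that range check and the hi/lo split in plain C++. The names (`PcRelSplit`, `split_pc_relative`, `fits_auipc_jalr`) are made up for illustration and are not the MASM helpers under review; the `0x800` rounding is the usual adjustment needed because `jalr` sign-extends its immediate.

```cpp
#include <cstdint>

// Hypothetical helper type: the two immediates of an auipc + jalr pair.
struct PcRelSplit {
  std::int64_t hi20;  // auipc immediate, counted in 4 KiB units
  std::int64_t lo12;  // jalr immediate, sign-extended by the hardware
};

// floor(v / 4096) without relying on right shifts of negative values.
// Assumes |v| is far below the int64 limits, which holds for code offsets.
inline std::int64_t floor_div_4k(std::int64_t v) {
  return v >= 0 ? v / 4096 : -((-v + 4095) / 4096);
}

// Split an offset so that (hi20 * 4096) + lo12 == offset with lo12 in [-2048, 2047].
inline PcRelSplit split_pc_relative(std::int64_t offset) {
  const std::int64_t hi = floor_div_4k(offset + 0x800);
  const std::int64_t lo = offset - hi * 4096;
  return PcRelSplit{hi, lo};
}

// True when the high part fits the signed 20-bit auipc immediate, i.e. the
// target is within roughly +/-2 GiB of the call site.
inline bool fits_auipc_jalr(std::int64_t offset) {
  const PcRelSplit s = split_pc_relative(offset);
  return s.hi20 >= -(INT64_C(1) << 19) && s.hi20 < (INT64_C(1) << 19);
}
```

A target that fails this kind of check cannot be reached with the two-instruction sequence, and the address has to be materialized in a register first, which is the longer path being weighed in this thread.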
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18942#discussion_r1580543868 From stuefe at openjdk.org Fri Apr 26 06:57:32 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 26 Apr 2024 06:57:32 GMT Subject: RFR: 8330275: Crash in XMark::follow_array [v3] In-Reply-To: <9_-aN2njKgLHXhbLvQ7pNCh_MarXelw7Ca1sjE2uN0E=.87f785c7-7908-4ff8-a467-df97751a7ce8@github.com> References: <9_-aN2njKgLHXhbLvQ7pNCh_MarXelw7Ca1sjE2uN0E=.87f785c7-7908-4ff8-a467-df97751a7ce8@github.com> Message-ID: On Fri, 26 Apr 2024 06:06:03 GMT, Stefan Karlsson wrote: > > @stefank I am trying to understand the reason behind your suggestion to remove the changes in marking stack code. Are they not correct or is it that they don't belong to this PR? Anyway I have removed them from this PR. > > To me, it was not bleeding obvious that they were the right thing to do, and given other changes that doesn't follow the grown ZGC coding style, I wanted suggest a way forward for you to get this bug fixed, with less resistance from us ZGC developers/maintainers. That was the reasoning. I agree with Stefan. I would keep the patch as minimal as possible to make it easier to follow the actual error that has been fixed, and to make it easier for backporters to decide what to downport. Code cleanups can happen in a separate RFE. Ashu, are the other platforms actually broken? If yes, which ones? If a platform is not broken, I would defer touching it up to a separate cleanup RFE. Again because of patch clarity. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18941#issuecomment-2078744978 From rehn at openjdk.org Fri Apr 26 07:25:33 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 26 Apr 2024 07:25:33 GMT Subject: RFR: 8326306: RISC-V: Re-structure MASM calls and jumps In-Reply-To: References: Message-ID: On Fri, 26 Apr 2024 06:38:32 GMT, Fei Yang wrote: >> Hi, please consider. >> >> We have code that directly use the asm for call/jumps instead masm. 
>> Our masm have a bit odd naming, and we don't use 'proper' pseudoinstructions/mnemonics. >> Suggested by [riscv-asm-manual](https://github.com/riscv-non-isa/riscv-asm-manual/tree/master) >> >> j offset jal x0, offset Jump >> jal offset jal x1, offset Jump and link >> jr rs jalr x0, rs, 0 Jump register >> jalr rs jalr x1, rs, 0 Jump and link register >> ret jalr x0, x1, 0 Return from subroutine >> call offset auipc x1, offset[31:12]; jalr x1, x1, offset[11:0] Call far-away subroutine >> tail offset auipc x6, offset[31:12]; jalr x0, x6, offset[11:0] Tail call far-away subroutine >> >> But these can only be implemented like this if you have small enough application. >> The fallback of these is to use GOT (your C compiler should place a copy of GOT every 2G so it's always reachable). >> We don't have GOT, instead we materialize, so there is still differences between these and ours. >> >> This patch: >> - Tries to follow these suggested mappings as good we can. >> - Make sure all jumps/calls go through MASM. (so we get control and can easily change for sites using a certain calling convention) >> - To avoid confusion between MASM public/private methods and ASM methods and the mnemonics there are some renaming. >> E.g. the mnemonics jal means call offset, as we can't use that so there is no 'jal'. >> - I enabled c.j, but right now we never generate it. >> - As always the macro does no good and are legacy from when code base did not use templates. (also the x-macros screws up my IDE (vim+rtags)) >> >> I started down this path due to I have followup patch on top of this which removes trampoline in favor for load-n-jump. >> (WIP: https://github.com/robehn/jdk/compare/jal-fixes...robehn:jdk:load-n-link?expand=1) >> While looking into our calls it was a bit confusing, this helps. >> >> Done a couple of t1-3 slightly different version of this patch, and as part of the followup, no issues found. (VF2, qemu, LP4) >> Re-running tests, had some last minute changes. 
>> >> Thanks, Robbin > > src/hotspot/cpu/riscv/gc/shenandoah/shenandoahBarrierSetAssembler_riscv.cpp line 303: > >> 301: target = CAST_FROM_FN_PTR(address, ShenandoahRuntime::load_reference_barrier_weak); >> 302: } >> 303: __ rt_call(target); > > Question: does it make sense to replace `call` with `rt_call` when we are invoking the VM code (C++ code)? Here is what I see the difference between the two: `rt_call` emits code (`auipc` or `movptr`) depending on whether the destination could be found in code cache, while `call` depends on `is_32bit_offset_from_codeache`. So it's still possible for `call` to emit the short `auipc` code if not far even when the target is not there in the code cache like this case. But `rt_call` will always emit a long `movptr` sequence for this case, which I think is not good in performance. A couple of points: all calls to the VM runtime should use "call_VM_leaf". E.g. ` __ call_VM_leaf(Continuation::freeze_entry(), 2);` AFAICT it is only Shenandoah which calls the VM in this 'wrong' way. call_VM_leaf always uses mv -> li. - It would be much better to change call_VM_leaf to use auipc if possible. (and fix Shenandoah to use call_VM_leaf) - We can probably remove rt_call, and just have call. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18942#discussion_r1580584271 From sspitsyn at openjdk.org Fri Apr 26 07:42:34 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 26 Apr 2024 07:42:34 GMT Subject: RFR: 8330969: scalability issue with loaded JVMTI agent In-Reply-To: References: Message-ID: <10TDSa0b4qQ2SfJPTvgPsQxK9jq3EKUQgHQTdHaHazg=.a9607741-95ad-40c8-9c15-f38c9cd51769@github.com> On Thu, 25 Apr 2024 20:39:03 GMT, Alex Menkov wrote: >> This is a fix of the following JVMTI scalability issue. A closed benchmark with millions of virtual threads shows 3X-4X overhead when a JVMTI agent has been loaded. For instance, this is observable when an app is executed under control of the Oracle Studio `collect` utility.
>> For performance analysis, experiments and numbers, please, see the comment below this description. >> >> The fix is to replace the global counter `_VTMS_transition_count` with the mark bit `_VTMS_transition_mark` in each `JavaThread`'. >> >> Testing: >> - Tested with mach5 tiers 1-6 > > src/hotspot/share/prims/jvmtiEnvBase.cpp line 1638: > >> 1636: // Iterates over all JavaThread's, counts VTMS transitions and restores >> 1637: // jt->jvmti_thread_state() and jt->jvmti_vthread() for VTMS transition protocol. >> 1638: void count_transitions_and_correct_jvmti_thread_states() { > > The method doesn't count anything anymore. > Rename it to `correct_jvmti_thread_states()`? > Comment needs to be updated too. Good suggestion, thanks. Renamed function and corrected the comment. > src/hotspot/share/prims/jvmtiThreadState.cpp line 501: > >> 499: oop vt = JNIHandles::resolve_external_guard(vthread); >> 500: java_lang_Thread::set_is_in_VTMS_transition(vt, false); >> 501: assert(thread->VTMS_transition_mark(), "sanity ed_heck"); > > Suggestion: > > assert(thread->VTMS_transition_mark(), "sanity check"); Thanks. Fixed now. > src/hotspot/share/runtime/javaThread.hpp line 668: > >> 666: void toggle_is_disable_suspend() { _is_disable_suspend = !_is_disable_suspend; }; >> 667: >> 668: bool VTMS_transition_mark() { return Atomic::load(&_VTMS_transition_mark); } > > Suggestion: > > bool VTMS_transition_mark() const { return Atomic::load(&_VTMS_transition_mark); } Good suggestion, thanks. Fixed now. 
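As a rough illustration of the two ideas in this review (a per-thread mark in place of one global counter, and a `const` accessor over an atomic load), here is a small standalone sketch. It uses `std::atomic` as a stand-in for HotSpot's `Atomic::load`, and `ThreadSketch` and `any_in_transition` are hypothetical names. The relaxed memory order is an assumption on my side, not something stated in the PR, and the sketch says nothing about how the real VTMS transition protocol synchronizes.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical stand-in for a thread object carrying its own transition mark.
class ThreadSketch {
  std::atomic<bool> _transition_mark{false};

 public:
  // Reading the mark does not modify the object, so the accessor can be const.
  bool transition_mark() const {
    return _transition_mark.load(std::memory_order_relaxed);
  }

  // Each thread only ever writes its own mark, so there is no shared counter
  // that every thread has to update.
  void set_transition_mark(bool value) {
    _transition_mark.store(value, std::memory_order_relaxed);
  }
};

// A reader that previously would have checked a global count can instead scan
// the per-thread marks.
inline bool any_in_transition(const ThreadSketch* threads, std::size_t count) {
  for (std::size_t i = 0; i < count; ++i) {
    if (threads[i].transition_mark()) {
      return true;
    }
  }
  return false;
}
```

The trade-off in a design like this is that writers touch only their own field, while the comparatively rare reader pays for a walk over all threads.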
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18937#discussion_r1580607373 PR Review Comment: https://git.openjdk.org/jdk/pull/18937#discussion_r1580610256 PR Review Comment: https://git.openjdk.org/jdk/pull/18937#discussion_r1580612160 From sspitsyn at openjdk.org Fri Apr 26 07:45:50 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 26 Apr 2024 07:45:50 GMT Subject: RFR: 8330969: scalability issue with loaded JVMTI agent [v2] In-Reply-To: References: Message-ID: > This is a fix of the following JVMTI scalability issue. A closed benchmark with millions of virtual threads shows 3X-4X overhead when a JVMTI agent has been loaded. For instance, this is observable when an app is executed under control of the Oracle Studio `collect` utility. > For performance analysis, experiments and numbers, please, see the comment below this description. > > The fix is to replace the global counter `_VTMS_transition_count` with the mark bit `_VTMS_transition_mark` in each `JavaThread`'. 
> > Testing: > - Tested with mach5 tiers 1-6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: fixed minor issues: renamed function, corrected comment, removed typo in assert ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18937/files - new: https://git.openjdk.org/jdk/pull/18937/files/6e1bf369..03bcfecb Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18937&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18937&range=00-01 Stats: 6 lines in 3 files changed: 0 ins; 0 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/18937.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18937/head:pull/18937 PR: https://git.openjdk.org/jdk/pull/18937 From aph at openjdk.org Fri Apr 26 08:54:37 2024 From: aph at openjdk.org (Andrew Haley) Date: Fri, 26 Apr 2024 08:54:37 GMT Subject: RFR: 8331098: [Aarch64] Fix crash in Arrays.equals() intrinsic with -CCP [v2] In-Reply-To: References: <_HzINQ0atD5BmBbIZ6A4A5y1wNvwsvrBxAiaz2Mk9rY=.43cde0ae-1179-4708-afa1-fda64039d722@github.com> <43G3lM1SyM9s-2uC4qO2gDRwPNoms82BK4NIocUTWvQ=.de9e2aa0-b277-426e-971c-33d7684fb1ae@github.com> Message-ID: On Thu, 25 Apr 2024 18:16:49 GMT, Roman Kennke wrote: >> src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 5730: >> >>> 5728: // main loop and don't need to compare it >>> 5729: // explicitely ahead of the loop. >>> 5730: cmp(cnt2, cnt1); >> >> Why do we need this? Surely if the base isn't required to be aligned, then it might be aligned. So why can't we use the not-aligned version in all cases? > > The current implementation assumes that the base (first array element) is aligned. In this case, the array length is *not* aligned (8 bytes mark, word, 4 bytes compressed-Klass*, 4 bytes length), that is why in this case we compare the length ahead of the main loop. With uncompressed Klass* (8 bytes mark-word, 8 bytes Klass*, 4 bytes length, ...) 
or Lilliput (8 bytes mark-word/Klass*, 4 bytes length, ...), the base is only 4-bytes-aligned, but we can start at the length and still enter the main-loop at an 8 bytes aligned address. As a bonus, that also compares the lengths and we can save a few instructions/branches for that. So what we're saying here is not so much that the base is not aligned, but that the length _is_? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18948#discussion_r1580703823 From rehn at openjdk.org Fri Apr 26 09:28:56 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 26 Apr 2024 09:28:56 GMT Subject: RFR: 8326306: RISC-V: Re-structure MASM calls and jumps [v2] In-Reply-To: References: Message-ID: > Hi, please consider. > > We have code that directly use the asm for call/jumps instead masm. > Our masm have a bit odd naming, and we don't use 'proper' pseudoinstructions/mnemonics. > Suggested by [riscv-asm-manual](https://github.com/riscv-non-isa/riscv-asm-manual/tree/master) > > j offset jal x0, offset Jump > jal offset jal x1, offset Jump and link > jr rs jalr x0, rs, 0 Jump register > jalr rs jalr x1, rs, 0 Jump and link register > ret jalr x0, x1, 0 Return from subroutine > call offset auipc x1, offset[31:12]; jalr x1, x1, offset[11:0] Call far-away subroutine > tail offset auipc x6, offset[31:12]; jalr x0, x6, offset[11:0] Tail call far-away subroutine > > But these can only be implemented like this if you have small enough application. > The fallback of these is to use GOT (your C compiler should place a copy of GOT every 2G so it's always reachable). > We don't have GOT, instead we materialize, so there is still differences between these and ours. > > This patch: > - Tries to follow these suggested mappings as good we can. > - Make sure all jumps/calls go through MASM. 
(so we get control and can easily change for sites using a certain calling convention) > - To avoid confusion between MASM public/private methods and ASM methods and the mnemonics there are some renaming. > E.g. the mnemonics jal means call offset, as we can't use that so there is no 'jal'. > - I enabled c.j, but right now we never generate it. > - As always the macro does no good and are legacy from when code base did not use templates. (also the x-macros screws up my IDE (vim+rtags)) > > I started down this path due to I have followup patch on top of this which removes trampoline in favor for load-n-jump. > (WIP: https://github.com/robehn/jdk/compare/jal-fixes...robehn:jdk:load-n-link?expand=1) > While looking into our calls it was a bit confusing, this helps. > > Done a couple of t1-3 slightly different version of this patch, and as part of the followup, no issues found. (VF2, qemu, LP4) > Re-running tests, had some last minute changes. > > Thanks, Robbin Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: Corrected method name ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18942/files - new: https://git.openjdk.org/jdk/pull/18942/files/72c3b0bd..31361202 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18942&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18942&range=00-01 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/18942.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18942/head:pull/18942 PR: https://git.openjdk.org/jdk/pull/18942 From rehn at openjdk.org Fri Apr 26 09:28:57 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 26 Apr 2024 09:28:57 GMT Subject: RFR: 8326306: RISC-V: Re-structure MASM calls and jumps [v2] In-Reply-To: References: Message-ID: <6aLz4m01gnfgeOS-u583VJ5Og3kjojZma5D9tu49oC8=.d41293eb-480e-45b1-ba59-957ff7fda3e0@github.com> On Fri, 26 Apr 2024 07:21:48 GMT, Robbin Ehn wrote: >> 
src/hotspot/cpu/riscv/gc/shenandoah/shenandoahBarrierSetAssembler_riscv.cpp line 303: >> >>> 301: target = CAST_FROM_FN_PTR(address, ShenandoahRuntime::load_reference_barrier_weak); >>> 302: } >>> 303: __ rt_call(target); >> >> Question: does it make sense to replace `call` with `rt_call` when we are invoking the VM code (C++ code)? Here is what I see the difference between the two: `rt_call` emits code (`auipc` or `movptr`) depending on whether the destination could be found in code cache, while `call` depends on `is_32bit_offset_from_codeache`. So it's still possible for `call` to emit the short `auipc` code if not far even when the target is not there in the code cache like this case. But `rt_call` will always emit a long `movptr` sequence for this case, which I think is not good in performance. > > A couple of point, all calls to VM runtime should use "call_VM_leaf". > E.g. > ` __ call_VM_leaf(Continuation::freeze_entry(), 2);` > AFIACT it is only Shenandoah which calls VM is this 'wrong' way. > > call_VM_leaf always use mv -> li. > > - It would be much better to change call_VM_leaf to use auipc is possible. (and fix Shenandoah to use call_VM_leaf) > - We can probably remove rt_call, and just have call. I found some other places which use plain calls to leaf functions, instead of call_VM_leaf. It seems like it's a guess that registers don't need push/pop; I don't think such speculation is good. If it must be there, we should have an argument to call_VM_leaf saying we don't want to push/pop. x86 in this case uses the correct form: ` __ call_VM_leaf(CAST_FROM_FN_PTR(address, ShenandoahRuntime::load_reference_barrier_weak), c_rarg0, c_rarg1);` rt_call for non-code-cache targets does not need relocation, i.e. movptr. So rt_call and call are not simply interchangeable, i.e. you now need relocation (pc relative calls). Yes, we can probably do better here, but as the change is mv/li + jalr to movptr + jalr, there is no regression. So improvement should be done outside of this PR.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18942#discussion_r1580743377 From fyang at openjdk.org Fri Apr 26 09:42:37 2024 From: fyang at openjdk.org (Fei Yang) Date: Fri, 26 Apr 2024 09:42:37 GMT Subject: RFR: 8326306: RISC-V: Re-structure MASM calls and jumps [v2] In-Reply-To: <6aLz4m01gnfgeOS-u583VJ5Og3kjojZma5D9tu49oC8=.d41293eb-480e-45b1-ba59-957ff7fda3e0@github.com> References: <6aLz4m01gnfgeOS-u583VJ5Og3kjojZma5D9tu49oC8=.d41293eb-480e-45b1-ba59-957ff7fda3e0@github.com> Message-ID: <9DHbmT7q6SAKVoyODgVJTsUU3gfI70Yfl0o4UgYSkhI=.1e5f84f0-a424-4637-904e-7fc398869224@github.com> On Fri, 26 Apr 2024 09:24:09 GMT, Robbin Ehn wrote: >> A couple of point, all calls to VM runtime should use "call_VM_leaf". >> E.g. >> ` __ call_VM_leaf(Continuation::freeze_entry(), 2);` >> AFIACT it is only Shenandoah which calls VM is this 'wrong' way. >> >> call_VM_leaf always use mv -> li. >> >> - It would be much better to change call_VM_leaf to use auipc is possible. (and fix Shenandoah to use call_VM_leaf) >> - We can probably remove rt_call, and just have call. > > I found some other places which uses plain calls to leaf, instead of call_VM_leaf. > It seems like it's a guess that registers don't need push/pop, I don't think such speculation is good. > If it must be there we should have a argument to call_VM_leaf saying we don't want to push/pop. > > x86 is this case uses the correct: > ` __ call_VM_leaf(CAST_FROM_FN_PTR(address, ShenandoahRuntime::load_reference_barrier_weak), c_rarg0, c_rarg1);` > > rt_call for non-code-cache do not need relocation, i.e. movptr. > So rt_call and call is not simply interchangeable, i.e. you now need relocation (pc relative calls). > > Yes, we can probably do better here, but as the change is mv/li + jalr to movptr + jalr, there is no regression. > So improvement should be done outside of this PR. 
Another difference is that `rt_call` calls `relocate()` which is similar with aarch64's version of `rt_call` which delegates work to `lea` or `adrp` which does similar things [1][2]. I think we should check whether this will make a difference if we want to remove `rt_call`. [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/aarch64/assembler_aarch64.cpp#L141 [2] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp#L5398 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18942#discussion_r1580761143 From rehn at openjdk.org Fri Apr 26 09:42:38 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 26 Apr 2024 09:42:38 GMT Subject: RFR: 8326306: RISC-V: Re-structure MASM calls and jumps [v2] In-Reply-To: <9DHbmT7q6SAKVoyODgVJTsUU3gfI70Yfl0o4UgYSkhI=.1e5f84f0-a424-4637-904e-7fc398869224@github.com> References: <6aLz4m01gnfgeOS-u583VJ5Og3kjojZma5D9tu49oC8=.d41293eb-480e-45b1-ba59-957ff7fda3e0@github.com> <9DHbmT7q6SAKVoyODgVJTsUU3gfI70Yfl0o4UgYSkhI=.1e5f84f0-a424-4637-904e-7fc398869224@github.com> Message-ID: On Fri, 26 Apr 2024 09:38:50 GMT, Fei Yang wrote: >> I found some other places which uses plain calls to leaf, instead of call_VM_leaf. >> It seems like it's a guess that registers don't need push/pop, I don't think such speculation is good. >> If it must be there we should have a argument to call_VM_leaf saying we don't want to push/pop. >> >> x86 is this case uses the correct: >> ` __ call_VM_leaf(CAST_FROM_FN_PTR(address, ShenandoahRuntime::load_reference_barrier_weak), c_rarg0, c_rarg1);` >> >> rt_call for non-code-cache do not need relocation, i.e. movptr. >> So rt_call and call is not simply interchangeable, i.e. you now need relocation (pc relative calls). >> >> Yes, we can probably do better here, but as the change is mv/li + jalr to movptr + jalr, there is no regression. >> So improvement should be done outside of this PR. 
> > Another difference is that `rt_call` calls `relocate()` which is similar with aarch64's version of `rt_call` which delegates work to `lea` or `adrp` which does similar things [1][2]. I think we should check whether this will make a difference if we want to remove `rt_call`. > > [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/aarch64/assembler_aarch64.cpp#L141 > [2] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp#L5398 Yes! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18942#discussion_r1580762539 From rkennke at openjdk.org Fri Apr 26 10:05:33 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Fri, 26 Apr 2024 10:05:33 GMT Subject: RFR: 8331098: [Aarch64] Fix crash in Arrays.equals() intrinsic with -CCP [v2] In-Reply-To: References: <_HzINQ0atD5BmBbIZ6A4A5y1wNvwsvrBxAiaz2Mk9rY=.43cde0ae-1179-4708-afa1-fda64039d722@github.com> <43G3lM1SyM9s-2uC4qO2gDRwPNoms82BK4NIocUTWvQ=.de9e2aa0-b277-426e-971c-33d7684fb1ae@github.com> Message-ID: <5o7rt_EGNySfdgjVjZ9ny4DxQ0fctvYUD_rUkJWEYA8=.b26c137a-1c6c-4ff9-8a5a-1e185cdcfba9@github.com> On Fri, 26 Apr 2024 08:51:28 GMT, Andrew Haley wrote: >> The current implementation assumes that the base (first array element) is aligned. In this case, the array length is *not* aligned (8 bytes mark, word, 4 bytes compressed-Klass*, 4 bytes length), that is why in this case we compare the length ahead of the main loop. With uncompressed Klass* (8 bytes mark-word, 8 bytes Klass*, 4 bytes length, ...) or Lilliput (8 bytes mark-word/Klass*, 4 bytes length, ...), the base is only 4-bytes-aligned, but we can start at the length and still enter the main-loop at an 8 bytes aligned address. As a bonus, that also compares the lengths and we can save a few instructions/branches for that. > > So what we're saying here is not so much that the base is not aligned, but that the length _is_? Yes, exactly. 
Perhaps makes sense to rename the variable to 'base_is_8aligned' or something similar? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18948#discussion_r1580796163 From fyang at openjdk.org Fri Apr 26 10:10:33 2024 From: fyang at openjdk.org (Fei Yang) Date: Fri, 26 Apr 2024 10:10:33 GMT Subject: RFR: 8326306: RISC-V: Re-structure MASM calls and jumps [v2] In-Reply-To: References: Message-ID: On Fri, 26 Apr 2024 09:28:56 GMT, Robbin Ehn wrote: >> Hi, please consider. >> >> We have code that directly use the asm for call/jumps instead masm. >> Our masm have a bit odd naming, and we don't use 'proper' pseudoinstructions/mnemonics. >> Suggested by [riscv-asm-manual](https://github.com/riscv-non-isa/riscv-asm-manual/tree/master) >> >> j offset jal x0, offset Jump >> jal offset jal x1, offset Jump and link >> jr rs jalr x0, rs, 0 Jump register >> jalr rs jalr x1, rs, 0 Jump and link register >> ret jalr x0, x1, 0 Return from subroutine >> call offset auipc x1, offset[31:12]; jalr x1, x1, offset[11:0] Call far-away subroutine >> tail offset auipc x6, offset[31:12]; jalr x0, x6, offset[11:0] Tail call far-away subroutine >> >> But these can only be implemented like this if you have small enough application. >> The fallback of these is to use GOT (your C compiler should place a copy of GOT every 2G so it's always reachable). >> We don't have GOT, instead we materialize, so there is still differences between these and ours. >> >> This patch: >> - Tries to follow these suggested mappings as good we can. >> - Make sure all jumps/calls go through MASM. (so we get control and can easily change for sites using a certain calling convention) >> - To avoid confusion between MASM public/private methods and ASM methods and the mnemonics there are some renaming. >> E.g. the mnemonics jal means call offset, as we can't use that so there is no 'jal'. >> - I enabled c.j, but right now we never generate it. 
>> - As always the macro does no good and are legacy from when code base did not use templates. (also the x-macros screws up my IDE (vim+rtags)) >> >> I started down this path due to I have followup patch on top of this which removes trampoline in favor for load-n-jump. >> (WIP: https://github.com/robehn/jdk/compare/jal-fixes...robehn:jdk:load-n-link?expand=1) >> While looking into our calls it was a bit confusing, this helps. >> >> Done a couple of t1-3 slightly different version of this patch, and as part of the followup, no issues found. (VF2, qemu, LP4) >> Re-running tests, had some last minute changes. >> >> Thanks, Robbin > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > Corrected method name src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 5456: > 5454: __ mv(c_rarg0, xthread); > 5455: BLOCK_COMMENT("call runtime_entry"); > 5456: __ rt_call(runtime_entry); I agree it's better to use `call_VM_leaf` for the Shenandoah cases. Then what about the changes in this file and templateInterpreterGenerator_riscv.cpp? Any reason to switch to `rt_call`? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18942#discussion_r1580801911 From aboldtch at openjdk.org Fri Apr 26 10:31:35 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Fri, 26 Apr 2024 10:31:35 GMT Subject: RFR: 8331098: [Aarch64] Fix crash in Arrays.equals() intrinsic with -CCP [v2] In-Reply-To: <43G3lM1SyM9s-2uC4qO2gDRwPNoms82BK4NIocUTWvQ=.de9e2aa0-b277-426e-971c-33d7684fb1ae@github.com> References: <_HzINQ0atD5BmBbIZ6A4A5y1wNvwsvrBxAiaz2Mk9rY=.43cde0ae-1179-4708-afa1-fda64039d722@github.com> <43G3lM1SyM9s-2uC4qO2gDRwPNoms82BK4NIocUTWvQ=.de9e2aa0-b277-426e-971c-33d7684fb1ae@github.com> Message-ID: On Thu, 25 Apr 2024 11:50:52 GMT, Roman Kennke wrote: >> The implementations of Arrays.equals() in macroAssembler_aarch64.cpp, MacroAssembler::arrays_equals() assumes that the start of arrays is 8-byte-aligned. 
Since [JDK-8139457](https://bugs.openjdk.org/browse/JDK-8139457) this is no longer the case, at least when running with -CompressedClassPointers (or Lilliput). The effect is that the loops may run over the array end, and if the array is at heap boundary, and that memory is unmapped, then it may crash. >> >> The proposed fix aims to always enter the main loop(s) with an aligned address: >> - When the array base is 8-byte-aligned (default, with +CCP), then compare the array lengths separately, then enter the main loop with the array base. >> - When the array base is not 8-byte-aligned (-CCP and Lilliput), then enter the loop with the address of the array-length (which is then 8-byte-aligned), and compare array lengths in the main loop, and elide the explicit array lengths comparison. >> >> Testing: >> - [ ] tier1 (+CCP) >> - [ ] tier1 (-CCP) >> - [ ] tier2 (+CCP) >> - [ ] tier2 (-CCP) > > Roman Kennke has updated the pull request incrementally with two additional commits since the last revision: > > - Remove excess whitespace > - Avoid loading cnt2 on paths that don't need it src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 5553: > 5551: bool is_8aligned = is_aligned(base_offset, BytesPerWord); > 5552: assert(is_aligned(base_offset, BytesPerWord) || is_aligned(length_offset, BytesPerWord), > 5553: "base_offset or length_offset must be 8-byte aligned"); Not that I see this changing. But the correctness of using `length_offset` relies on the length and the payload being consecutive in memory. Should probably assert this. 
Suggestion:

    assert(is_aligned(base_offset, BytesPerWord) || is_aligned(length_offset, BytesPerWord),
           "base_offset or length_offset must be 8-byte aligned");
    assert(is_aligned(base_offset, BytesPerWord) || base_offset == length_offset + BytesPerInt,
           "base_offset must be 8-byte aligned or no padding between base and length");

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18948#discussion_r1580836583

From rkennke at openjdk.org  Fri Apr 26 10:45:59 2024
From: rkennke at openjdk.org (Roman Kennke)
Date: Fri, 26 Apr 2024 10:45:59 GMT
Subject: RFR: 8331098: [Aarch64] Fix crash in Arrays.equals() intrinsic with -CCP [v3]
In-Reply-To: <_HzINQ0atD5BmBbIZ6A4A5y1wNvwsvrBxAiaz2Mk9rY=.43cde0ae-1179-4708-afa1-fda64039d722@github.com>
References: <_HzINQ0atD5BmBbIZ6A4A5y1wNvwsvrBxAiaz2Mk9rY=.43cde0ae-1179-4708-afa1-fda64039d722@github.com>
Message-ID:

> The implementations of Arrays.equals() in macroAssembler_aarch64.cpp, MacroAssembler::arrays_equals() assumes that the start of arrays is 8-byte-aligned. Since [JDK-8139457](https://bugs.openjdk.org/browse/JDK-8139457) this is no longer the case, at least when running with -CompressedClassPointers (or Lilliput). The effect is that the loops may run over the array end, and if the array is at heap boundary, and that memory is unmapped, then it may crash.
>
> The proposed fix aims to always enter the main loop(s) with an aligned address:
> - When the array base is 8-byte-aligned (default, with +CCP), then compare the array lengths separately, then enter the main loop with the array base.
> - When the array base is not 8-byte-aligned (-CCP and Lilliput), then enter the loop with the address of the array-length (which is then 8-byte-aligned), and compare array lengths in the main loop, and elide the explicit array lengths comparison.
>
> Testing:
> - [ ] tier1 (+CCP)
> - [ ] tier1 (-CCP)
> - [ ] tier2 (+CCP)
> - [ ] tier2 (-CCP)

Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:

  Update src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp

  Improve asserts

  Co-authored-by: Axel Boldt-Christmas

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/18948/files
  - new: https://git.openjdk.org/jdk/pull/18948/files/68fe9ca2..9a3793b1

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=18948&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18948&range=01-02

  Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/18948.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18948/head:pull/18948

PR: https://git.openjdk.org/jdk/pull/18948

From rehn at openjdk.org  Fri Apr 26 11:21:35 2024
From: rehn at openjdk.org (Robbin Ehn)
Date: Fri, 26 Apr 2024 11:21:35 GMT
Subject: RFR: 8326306: RISC-V: Re-structure MASM calls and jumps [v2]
In-Reply-To:
References:
Message-ID: <1UZeWIQJIEYbPetxWPlhQffyAy4gWXvNiV79i4_3pMQ=.86fb3068-940b-49ea-a2ea-b84a865d4cca@github.com>

On Fri, 26 Apr 2024 10:07:40 GMT, Fei Yang wrote:

>> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Corrected method name
>
> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 5456:
>
>> 5454:   __ mv(c_rarg0, xthread);
>> 5455:   BLOCK_COMMENT("call runtime_entry");
>> 5456:   __ rt_call(runtime_entry);
>
> I agree it's better to use `call_VM_leaf` for the Shenandoah cases. Then what about the changes in this file and templateInterpreterGenerator_riscv.cpp? Any reason to switch to `rt_call`?

Old call():

    int32_t offset = 0;
    mv(temp, dest, offset); // =>li();
    jalr(x1, temp, offset);

To keep the sites the same (for non-code-cache calls)
New rt_call():

    movptr(tmp, target.target(), offset);
    Assembler::jalr(x1, tmp, offset);

Same here means absolute calls, no reloc required.
So I have tried to keep the calls the same. As you say we can optimize this by using reloc + la(). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18942#discussion_r1580883353 From rkennke at openjdk.org Fri Apr 26 11:22:03 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Fri, 26 Apr 2024 11:22:03 GMT Subject: RFR: 8331098: [Aarch64] Fix crash in Arrays.equals() intrinsic with -CCP [v4] In-Reply-To: <_HzINQ0atD5BmBbIZ6A4A5y1wNvwsvrBxAiaz2Mk9rY=.43cde0ae-1179-4708-afa1-fda64039d722@github.com> References: <_HzINQ0atD5BmBbIZ6A4A5y1wNvwsvrBxAiaz2Mk9rY=.43cde0ae-1179-4708-afa1-fda64039d722@github.com> Message-ID: > The implementations of Arrays.equals() in macroAssembler_aarch64.cpp, MacroAssembler::arrays_equals() assumes that the start of arrays is 8-byte-aligned. Since [JDK-8139457](https://bugs.openjdk.org/browse/JDK-8139457) this is no longer the case, at least when running with -CompressedClassPointers (or Lilliput). The effect is that the loops may run over the array end, and if the array is at heap boundary, and that memory is unmapped, then it may crash. > > The proposed fix aims to always enter the main loop(s) with an aligned address: > - When the array base is 8-byte-aligned (default, with +CCP), then compare the array lengths separately, then enter the main loop with the array base. > - When the array base is not 8-byte-aligned (-CCP and Lilliput), then enter the loop with the address of the array-length (which is then 8-byte-aligned), and compare array lengths in the main loop, and elide the explicit array lengths comparison. 
> > Testing: > - [ ] tier1 (+CCP) > - [ ] tier1 (-CCP) > - [ ] tier2 (+CCP) > - [ ] tier2 (-CCP) Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Remove extra whitespace ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18948/files - new: https://git.openjdk.org/jdk/pull/18948/files/9a3793b1..cca53b89 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18948&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18948&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18948.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18948/head:pull/18948 PR: https://git.openjdk.org/jdk/pull/18948 From jsjolen at openjdk.org Fri Apr 26 11:36:42 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Fri, 26 Apr 2024 11:36:42 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v49] In-Reply-To: References: Message-ID: On Thu, 25 Apr 2024 10:45:43 GMT, Johan Sj?len wrote: >> Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: >> >> Move TreapNode into Treap > > First pass through working through the most of Afshin's comments. Thanks! > @jdksjolen Are you going to update the code in response to Afshin feedback sometime soon? > > I find it a bit hard to look at the code tagged with so many comments, so if you are thinking about updating it sometime soon, I'd prefer to wait reviewing it. Hi Gerard, Go to "Files changed", scroll to a comment and press "i", all comments should now be hidden. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18289#issuecomment-2079211829 From stuefe at openjdk.org Fri Apr 26 11:44:42 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 26 Apr 2024 11:44:42 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v46] In-Reply-To: References: Message-ID: On Thu, 25 Apr 2024 10:33:11 GMT, Johan Sj?len wrote: >> src/hotspot/share/nmt/vmatree.hpp line 139: >> >>> 137: } >>> 138: >>> 139: SummaryDiff commit_mapping(size_t from, size_t sz, Metadata& metadata) { >> >> `size_t` or `address` for `from`? > > I've been using `size_t` so far to indicate that we are within some file with some offset. I'm not sure that `address` is ever the right choice for `VMATree` as it is a `uchar*`, indicating that it's a directly dereferencable pointer. It's not a huge deal whether we choose `size_t`, `uintptr_t` or `address` for our internal representation IMHO, as long as the external interface (`MemTracker`) correctly indicates what kind of address is expected. > > @tstuefe, @gerard-ziemski. This discussion is easily lost in the sea of comments, so pinging you directly here. How about making your own index type? Something that clearly distinguishes it from sizes. Can be a simple typedef. I think address would be wrong. But size_t is also feeling off. I know we use size_t in other places as index or offset, but it still throws me off, I think of size_t as a size, not an offset. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1580908010 From jsjolen at openjdk.org Fri Apr 26 12:16:44 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Fri, 26 Apr 2024 12:16:44 GMT Subject: RFR: 8331193: Return references when possible in GrowableArray Message-ID: Hi, This PR introduces the possibility of using references more often when using GrowableArray, where as previously this was only possible when using the `at()` method. 
This lets us avoid copying and redundant method calls and makes the API more streamlined. After the patch, we can use `at_grow` just like `at` works. The same goes for `top`, `first`, and `last`.

Some example code:
```c++
// Before this patch this worked:
GrowableArray<int> arr(8,8,-1); // Pre-fill with 8 -1s
int& x = arr.at(7);
if (x == -1) {
  x = 2;
}
assert(arr.at(7) == 2, "this holds");
// but this was forbidden
int& x = arr.at_grow(9, -1); // Compilation error! at_grow returns E, not E&
// so we had to do
int x = arr.at_grow(9, -1);
if (x == -1) {
  arr.at_put(9, 2);
}
```

Thanks.

-------------

Commit messages:
 - Remove semantics changes in find_sorted
 - Use references more in GA

Changes: https://git.openjdk.org/jdk/pull/18975/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18975&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8331193
  Stats: 4 lines in 1 file changed: 0 ins; 0 del; 4 mod
  Patch: https://git.openjdk.org/jdk/pull/18975.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18975/head:pull/18975

PR: https://git.openjdk.org/jdk/pull/18975

From stefank at openjdk.org  Fri Apr 26 12:28:32 2024
From: stefank at openjdk.org (Stefan Karlsson)
Date: Fri, 26 Apr 2024 12:28:32 GMT
Subject: RFR: 8331193: Return references when possible in GrowableArray
In-Reply-To:
References:
Message-ID:

On Fri, 26 Apr 2024 11:58:43 GMT, Johan Sjölen wrote:

> Hi,
>
> This PR introduces the possibility of using references more often when using GrowableArray, where as previously this was only possible when using the `at()` method. This lets us avoid copying and redundant method calls and makes the API more streamlined. After the patch, we can use `at_grow` just like `at` works. The same goes for `top`, `first`, and `last`.
>
> Some example code:
> ```c++
> // Before this patch this worked:
> GrowableArray<int> arr(8,8,-1); // Pre-fill with 8 -1s
> int& x = arr.at(7);
> if (x == -1) {
>   x = 2;
> }
> assert(arr.at(7) == 2, "this holds");
> // but this was forbidden
> int& x = arr.at_grow(9, -1); // Compilation error! at_grow returns E, not E&
> // so we had to do
> int x = arr.at_grow(9, -1);
> if (x == -1) {
>   arr.at_put(9, 2);
> }
> ```
>
> Thanks.

There are two `at` functions, with and without const:

    E& at(int i) {
    ...
    E const& at(int i) const {

This change makes the following functions const, but the return values non-const:

    E& first() const {
    ...
    E& top() const {
    ...
    E& last() const {

I think there should be some consistency w.r.t. between all these functions.

-------------

PR Review: https://git.openjdk.org/jdk/pull/18975#pullrequestreview-2024923074

From jsjolen at openjdk.org  Fri Apr 26 12:34:34 2024
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Fri, 26 Apr 2024 12:34:34 GMT
Subject: RFR: 8331193: Return references when possible in GrowableArray
In-Reply-To:
References:
Message-ID:

On Fri, 26 Apr 2024 12:25:46 GMT, Stefan Karlsson wrote:

> There are two `at` functions, with and without const:
>
> ```
> E& at(int i) {
> ...
> E const& at(int i) const {
> ```
>
> This change makes the following functions const, but the return values non-const:
>
> ```
> E& first() const {
> ...
> E& top() const {
> ...
> E& last() const {
> ```
>
> I think there should be some consistency w.r.t. between all these functions.

Yeah, I'm fine with adding the corresponding `const` functions in this PR. According to cppreference it's fine to add on a `const` using `const_cast`, so in this way we avoid copy-pasting function definitions.
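A minimal sketch of the `const_cast` approach mentioned above, assuming a small stand-in container: `TinyArray` and its members are hypothetical and are not HotSpot's `GrowableArray`. The const overload holds the logic, and the non-const overload forwards to it, so the body is written only once.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed-capacity container for illustration; not HotSpot code.
template <typename E, std::size_t N>
class TinyArray {
  E _data[N];
  int _len;

 public:
  TinyArray() : _data{}, _len(0) {}

  void push(const E& e) {
    assert(_len < static_cast<int>(N));
    _data[_len++] = e;
  }

  // The const overloads hold the real logic and return const references.
  const E& first() const {
    assert(_len > 0);
    return _data[0];
  }
  const E& last() const {
    assert(_len > 0);
    return _data[_len - 1];
  }

  // The non-const overloads add const to `this`, call the const overload,
  // and strip const from the result. Stripping it is fine here because
  // these overloads are reachable only through a non-const access path.
  E& first() { return const_cast<E&>(const_cast<const TinyArray*>(this)->first()); }
  E& last()  { return const_cast<E&>(const_cast<const TinyArray*>(this)->last()); }
};
```

With this shape, a `const TinyArray&` only ever hands out `const E&`, while a non-const array hands out `E&`, which mirrors the two existing `at` overloads quoted above.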
------------- PR Comment: https://git.openjdk.org/jdk/pull/18975#issuecomment-2079297088 From stefank at openjdk.org Fri Apr 26 12:47:36 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 26 Apr 2024 12:47:36 GMT Subject: RFR: 8330275: Crash in XMark::follow_array [v3] In-Reply-To: References: Message-ID: On Thu, 25 Apr 2024 14:28:47 GMT, Ashutosh Mehra wrote: >> This PR addresses the issue in ZGC where the number of address offset bits can go beyond the limit imposed by the encoding scheme in mark stack, thereby causing the encoding to fail. >> Encoding of partial array offset in mark stack requires that the address offset be no more than 44 bits. But the current mechanism to probe maximum address offset bits on aarch64, riscv and ppc platforms can return value larger that 44 bits. This patch sets the maximum address offset bits to 44. >> >> I have updated the generational mode to avoid subtracting 3 bits from the maximum address offset bit probed by the system, as the generational mode does not use multi-mapping. >> >> I have also updated the code to set MarkPartialArrayMinSizeShift dynamically depending on the number of address offset bits used. This would avoid running into such problem again if in future maximum address offset bits is increased beyond 44. >> >> For some reason (that I can't comprehend from the code) the existing implementation for probing the max addressable bit for ppc in non-generation ZGC is very different from other platforms and from generational mode as well. I have kept the existing implementation as is and just fixed it to ensure it does not return value greater than 44 bits. >> >> Testing: test/hotspot/jtreg/gc/z and test/hotspot/jtreg/gc/x on x86 > > Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: > > Fix typos > > Signed-off-by: Ashutosh Mehra So, the absolute minimal point-fix would be to change the value 47 to 46, which would be very easy to backport, right? 

If we still want to make the change that is currently in the PR I would like to tweak the code along the lines of what I have in my branch here:
https://github.com/openjdk/jdk/compare/master...stefank:jdk:pr_18941

The extra patch:
* Moves the global constants to the file I think they belong in
* Moves all the probe bit handling into `ZPlatformAddressOffsetBits`
* Extracts some of the "bit-to-bits" calculations into intermediate constants

The last two points were done to (at least for me) see and understand why the various pluses and minuses were performed.

I didn't touch the PPC code, since it's quite different and I don't want to risk messing it up.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18941#issuecomment-2079316745

From aph at openjdk.org  Fri Apr 26 13:04:20 2024
From: aph at openjdk.org (Andrew Haley)
Date: Fri, 26 Apr 2024 13:04:20 GMT
Subject: RFR: 8331098: [Aarch64] Fix crash in Arrays.equals() intrinsic with -CCP [v2]
In-Reply-To: <5o7rt_EGNySfdgjVjZ9ny4DxQ0fctvYUD_rUkJWEYA8=.b26c137a-1c6c-4ff9-8a5a-1e185cdcfba9@github.com>
References: <_HzINQ0atD5BmBbIZ6A4A5y1wNvwsvrBxAiaz2Mk9rY=.43cde0ae-1179-4708-afa1-fda64039d722@github.com> <43G3lM1SyM9s-2uC4qO2gDRwPNoms82BK4NIocUTWvQ=.de9e2aa0-b277-426e-971c-33d7684fb1ae@github.com> <5o7rt_EGNySfdgjVjZ9ny4DxQ0fctvYUD_rUkJWEYA8=.b26c137a-1c6c-4ff9-8a5a-1e185cdcfba9@github.com>
Message-ID:

On Fri, 26 Apr 2024 10:02:53 GMT, Roman Kennke wrote:

> Yes, exactly. Perhaps makes sense to rename the variable to 'base_is_8aligned' or something similar?

With a comment. "Either the base or the length is at an 8-byte-aligned address." This boolean tells you which, and because "reasons" we're trying to maintain alignment. I implore you to measure the difference from alignment, and if the read alignment makes little or no difference, don't do this micro-optimization.
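To make the "either the base or the length is 8-byte aligned" statement concrete, here is a small sketch built only from the header sizes quoted earlier in this thread (8-byte mark word; 4-byte compressed or 8-byte uncompressed Klass*, or none with Lilliput; 4-byte length). The struct and function names are invented for the example, and it assumes no padding between the length field and the first element.

```cpp
#include <cassert>
#include <cstddef>

// Illustrative model of the array header layouts described in this thread.
// Names are hypothetical; this is not HotSpot code.
struct ArrayLayout {
  std::size_t mark_word;  // bytes of mark word (with Lilliput it also carries the Klass*)
  std::size_t klass;      // bytes of a separate Klass* field, 0 if there is none

  constexpr std::size_t length_offset() const { return mark_word + klass; }
  // Assumption: the first element directly follows the 4-byte length.
  constexpr std::size_t base_offset() const { return length_offset() + 4; }
};

constexpr bool is_8aligned(std::size_t off) { return off % 8 == 0; }

constexpr ArrayLayout compressed_klass{8, 4};    // +CCP:     length at 12, base at 16
constexpr ArrayLayout uncompressed_klass{8, 8};  // -CCP:     length at 16, base at 20
constexpr ArrayLayout lilliput{8, 0};            // Lilliput: length at 8,  base at 12

static_assert(is_8aligned(compressed_klass.base_offset()), "base aligned with +CCP");
static_assert(is_8aligned(uncompressed_klass.length_offset()), "length aligned with -CCP");
static_assert(is_8aligned(lilliput.length_offset()), "length aligned with Lilliput");

// Offset at which a word-wise comparison loop could start: the base if it is
// 8-byte aligned, otherwise the length field (which then must be aligned).
inline std::size_t loop_start_offset(const ArrayLayout& l) {
  assert(is_8aligned(l.base_offset()) || is_8aligned(l.length_offset()));
  return is_8aligned(l.base_offset()) ? l.base_offset() : l.length_offset();
}
```

In all three layouts exactly one of the two offsets is a multiple of 8, which is the property the proposed asserts are meant to pin down.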
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18948#discussion_r1581005798 From rkennke at openjdk.org Fri Apr 26 13:33:08 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Fri, 26 Apr 2024 13:33:08 GMT Subject: RFR: 8331098: [Aarch64] Fix crash in Arrays.equals() intrinsic with -CCP [v2] In-Reply-To: References: <_HzINQ0atD5BmBbIZ6A4A5y1wNvwsvrBxAiaz2Mk9rY=.43cde0ae-1179-4708-afa1-fda64039d722@github.com> <43G3lM1SyM9s-2uC4qO2gDRwPNoms82BK4NIocUTWvQ=.de9e2aa0-b277-426e-971c-33d7684fb1ae@github.com> <5o7rt_EGNySfdgjVjZ9ny4DxQ0fctvYUD_rUkJWEYA8=.b26c137a-1c6c-4ff9-8a5a-1e185cdcfba9@github.com> Message-ID: On Fri, 26 Apr 2024 13:01:59 GMT, Andrew Haley wrote: >> Yes, exactly. Perhaps makes sense to rename the variable to 'base_is_8aligned' or something similar? > >> Yes, exactly. Perhaps makes sense to rename the variable to 'base_is_8aligned' or something similar? > > With a comment. "Either the base or the length is at an 8-byte-aligned address." This boolean tells you which, and because "reasons" we're trying to maintain alignment. I implore you to measure the difference from alignment, and if the read alignment makes little or no difference, don't do this micro-optimization. This is not about an optimization, but about a correctness issue. The loop(s) have been written under the assumption that they can read full words, which is true if we start at a word boundary. However, if we don't, then we can attempt an (unaligned) read beyond the array, and if that memory is outside of the heap and unmapped, then we would crash. Note that this currently only happens when running with -UseCompressedClassPointers which almost nobody does. We encountered it with Lilliput, which changes array layout in a similar way. Another way to think of the situation is this: The length and the array elements are a contiguous chunk of memory, starting at the array-length. We want to compare all of it, and fail as soon as we encounter a mismatch. 
And since the array-length may be at an unaligned location, we compare that 'head' explicitly, and then enter the main-loop at an aligned memory location (the first array element), and can do an optimized loop that reads full words. When running with -CCP (or Lilliput), the first array element will be unaligned, though, but when that happens, we know that the array-length is aligned, and we don't have to compare the unaligned head (length) ahead of the main-loop, but can instead enter the main-loop at the aligned address of the array-length field. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18948#discussion_r1581027474 From fyang at openjdk.org Fri Apr 26 14:10:02 2024 From: fyang at openjdk.org (Fei Yang) Date: Fri, 26 Apr 2024 14:10:02 GMT Subject: RFR: 8326306: RISC-V: Re-structure MASM calls and jumps [v2] In-Reply-To: <1UZeWIQJIEYbPetxWPlhQffyAy4gWXvNiV79i4_3pMQ=.86fb3068-940b-49ea-a2ea-b84a865d4cca@github.com> References: <1UZeWIQJIEYbPetxWPlhQffyAy4gWXvNiV79i4_3pMQ=.86fb3068-940b-49ea-a2ea-b84a865d4cca@github.com> Message-ID: <0gMQgeYKyAzms64-hBIrltqUSfetu3Kczwr7IwLmF18=.8f583ac0-afff-4f1b-985f-a688cd898ae3@github.com> On Fri, 26 Apr 2024 11:18:15 GMT, Robbin Ehn wrote: >> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 5456: >> >>> 5454: __ mv(c_rarg0, xthread); >>> 5455: BLOCK_COMMENT("call runtime_entry"); >>> 5456: __ rt_call(runtime_entry); >> >> I agree it's better to use `call_VM_leaf` for the Shenandoah cases. Then what about the changes in this file and templateInterpreterGenerator_riscv.cpp? Any reason to switch to `rt_call`? > > Old call(): > > int32_t offset = 0; > mv(temp, dest, offset); // =>li(); > jalr(x1, temp, offset); > > > To keep the sites the same (for non-code-cache calls) > New rt_call(): > > movptr(tmp, target.target(), offset); > Assembler::jalr(x1, tmp, offset); > > > Same here means absolute calls, no reloc required. > So I have tried to keep the calls the same. 
> As you say we can optimize this by using reloc + la().

Hi, Let me try to understand what you mean. So, are we going to remove the `relocate` call for non-code-cache calls at [1] and further improve the `movptr` at [2] making use of `la`? This sounds interesting to me :-)

[1] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp#L5031
[2] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp#L5033

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18942#discussion_r1581088731

From jsjolen at openjdk.org  Fri Apr 26 14:16:21 2024
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Fri, 26 Apr 2024 14:16:21 GMT
Subject: RFR: 8331193: Return references when possible in GrowableArray [v2]
In-Reply-To:
References:
Message-ID:

> Hi,
>
> This PR introduces the possibility of using references more often when using GrowableArray, where as previously this was only possible when using the `at()` method. This lets us avoid copying and redundant method calls and makes the API more streamlined. After the patch, we can use `at_grow` just like `at` works. The same goes for `top`, `first`, and `last`.
>
>
> Some example code:
> ```c++
> // Before this patch this worked:
> GrowableArray<int> arr(8,8,-1); // Pre-fill with 8 -1s
> int& x = arr.at(7);
> if (x == -1) {
>   x = 2;
> }
> assert(arr.at(7) == 2, "this holds");
> // but this was forbidden
> int& x = arr.at_grow(9, -1); // Compilation error! at_grow returns E, not E&
> // so we had to do
> int x = arr.at_grow(9, -1);
> if (x == -1) {
>   arr.at_put(9, 2);
> }
> ```
>
> Thanks.

Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision:

 - Of course should alos exist for first
 - Introduce at()-equivalent const methods for top and last

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/18975/files
  - new: https://git.openjdk.org/jdk/pull/18975/files/921a3526..3774d481

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=18975&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18975&range=00-01

  Stats: 16 lines in 1 file changed: 14 ins; 0 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/18975.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18975/head:pull/18975

PR: https://git.openjdk.org/jdk/pull/18975

From jsjolen at openjdk.org  Fri Apr 26 14:16:22 2024
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Fri, 26 Apr 2024 14:16:22 GMT
Subject: RFR: 8331193: Return references when possible in GrowableArray
In-Reply-To:
References:
Message-ID:

On Fri, 26 Apr 2024 11:58:43 GMT, Johan Sjölen wrote:

> Hi,
>
> This PR introduces the possibility of using references more often when using GrowableArray, where as previously this was only possible when using the `at()` method. This lets us avoid copying and redundant method calls and makes the API more streamlined. After the patch, we can use `at_grow` just like `at` works. The same goes for `top`, `first`, and `last`.
>
>
> Some example code:
> ```c++
> // Before this patch this worked:
> GrowableArray<int> arr(8,8,-1); // Pre-fill with 8 -1s
> int& x = arr.at(7);
> if (x == -1) {
>   x = 2;
> }
> assert(arr.at(7) == 2, "this holds");
> // but this was forbidden
> int& x = arr.at_grow(9, -1); // Compilation error! at_grow returns E, not E&
> // so we had to do
> int x = arr.at_grow(9, -1);
> if (x == -1) {
>   arr.at_put(9, 2);
> }
> ```
>
> Thanks.

We can have equivalent variants for top, last and first but I decided to skip `at_grow` since it affects the public-facing constness of the `GrowableArray`.
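For readers who want to try the "after the patch" usage pattern outside HotSpot, here is a rough stand-in. `TinyGrowable` is a hypothetical class backed by `std::vector` and is not how `GrowableArray` is implemented; it only shows an `at_grow` that returns `E&`, so the read-then-update from the example above needs a single call.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for illustration; not HotSpot's GrowableArray.
template <typename E>
class TinyGrowable {
  std::vector<E> _data;

 public:
  int length() const { return static_cast<int>(_data.size()); }

  E& at(int i) {
    assert(0 <= i && i < length());
    return _data[static_cast<std::size_t>(i)];
  }

  // Grows the array with copies of `fill` until index i exists, then returns
  // a reference to that element so the caller can update it in place.
  E& at_grow(int i, const E& fill) {
    assert(i >= 0);
    if (i >= length()) {
      _data.resize(static_cast<std::size_t>(i) + 1, fill);
    }
    return _data[static_cast<std::size_t>(i)];
  }
};
```

One caveat for this sketch, and for any array that reallocates when it grows: a reference obtained from `at_grow` should not be kept across a later call that may grow the array again, since the elements can move.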
------------- PR Comment: https://git.openjdk.org/jdk/pull/18975#issuecomment-2079478183 From asmehra at openjdk.org Fri Apr 26 14:23:54 2024 From: asmehra at openjdk.org (Ashutosh Mehra) Date: Fri, 26 Apr 2024 14:23:54 GMT Subject: RFR: 8330275: Crash in XMark::follow_array [v3] In-Reply-To: References: Message-ID: On Thu, 25 Apr 2024 14:28:47 GMT, Ashutosh Mehra wrote: >> This PR addresses the issue in ZGC where the number of address offset bits can go beyond the limit imposed by the encoding scheme in the mark stack, thereby causing the encoding to fail. >> Encoding of the partial array offset in the mark stack requires that the address offset be no more than 44 bits. But the current mechanism to probe the maximum address offset bits on aarch64, riscv and ppc platforms can return a value larger than 44 bits. This patch sets the maximum address offset bits to 44. >> >> I have updated the generational mode to avoid subtracting 3 bits from the maximum address offset bits probed by the system, as the generational mode does not use multi-mapping. >> >> I have also updated the code to set MarkPartialArrayMinSizeShift dynamically depending on the number of address offset bits used. This would avoid running into such a problem again if, in the future, the maximum number of address offset bits is increased beyond 44. >> >> For some reason (that I can't comprehend from the code) the existing implementation for probing the max addressable bit for ppc in non-generational ZGC is very different from other platforms and from generational mode as well. I have kept the existing implementation as is and just fixed it to ensure it does not return a value greater than 44 bits. >> >> Testing: test/hotspot/jtreg/gc/z and test/hotspot/jtreg/gc/x on x86 > > Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: > > Fix typos > > Signed-off-by: Ashutosh Mehra I agree from the point of view of backporting, a point-fix is all we need in this PR.
@tstuefe As for the other platforms (riscv and ppc), looking at their code they seem to be broken in the same way as aarch64 but then the problem only happens if the user runs with > 1TB heap size with more than 48 addressable bits. Again, in the spirit of "do not touch if it is not broken", I am fine if we restrict the change to just aarch64. @tstuefe @stefank please let me know if you agree with just doing the point-fix to aarch64. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18941#issuecomment-2079493432 From mli at openjdk.org Fri Apr 26 14:25:11 2024 From: mli at openjdk.org (Hamlin Li) Date: Fri, 26 Apr 2024 14:25:11 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4] In-Reply-To: <0cUurmXlMJ_B66Wy1umd2n4r9ve7_Q4WOU0ffMd8s5Y=.bbc93b65-382c-4139-aaec-cb835d94a06e@github.com> References: <0cUurmXlMJ_B66Wy1umd2n4r9ve7_Q4WOU0ffMd8s5Y=.bbc93b65-382c-4139-aaec-cb835d94a06e@github.com> Message-ID: > Hi, > Can you help to review the patch? > This pr is based on previous work and discussion in [pr 16234](https://github.com/openjdk/jdk/pull/16234), [pr 18294](https://github.com/openjdk/jdk/pull/18294). > > Compared with previous prs, the major change in this pr is to integrate the source of sleef (for the steps, please check `src/jdk.incubator.vector/linux/native/libvectormath/README`), rather than depends on external sleef things (header or lib) at build or run time. > Besides of this change, also modify the previous changes accordingly, e.g. remove some uncessary files or changes especially in make dir of jdk. > > Besides of the code changes, one important task is to handle the legal process. > > Thanks! > > ## Performance > NOTE: > * `Src` means implementation in this pr, i.e. without depenency on external sleef. > * `Disabled` means disable intrinsics by `-XX:-UseVectorStubs` > * `system_sleef` means implementation in [previous pr 18294](https://github.com/openjdk/jdk/pull/18294), i.e. 
build and run jdk with depenency on external sleef. > > Basically, the perf data below shows that > * this implementation has better performance than previous version in [pr 18294](https://github.com/openjdk/jdk/pull/18294), > * and both sleef versions has much better performance compared with non-sleef version. > > |Benchmark |(size)|Src |Units|system_sleef|(system_sleef-Src)/Src|Diabled |(Disable-Src)/Src| > |------------------------------|------|---------|-----|------------|----------------------|---------|-----------------| > |3472:Double128Vector.ACOS |1024 |8546.842 |ns/op|8516.007 |-0.004 |16799.273|0.966 | > |3473:Double128Vector.ASIN |1024 |6864.656 |ns/op|6987.328 |0.018 |16602.442|1.419 | > |3474:Double128Vector.ATAN |1024 |11489.255|ns/op|12261.800 |0.067 |26329.320|1.292 | > |3475:Double128Vector.ATAN2 |1024 |16661.170|ns/op|17234.472 |0.034 |42084.100|1.526 | > |3476:Double128Vector.CBRT |1024 |18999.387|ns/op|20298.458 |0.068 |35998.688|0.895 | > |3477:Double128Vector.COS |1024 |14081.857|ns/op|14846.117 |0.054 |24420.692|0.734 | > |3478:Double128Vector.COSH |1024 |12202.306|ns/op|12237.772 |0.003 |21343.863|0.749 | > |3479:Double128Vector.EXP |1024 |4553.108 |ns/op|4777.638 |0.049 |20155.903|3.427 | > |3480:D... 
Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: remove notes about sleef changes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18605/files - new: https://git.openjdk.org/jdk/pull/18605/files/cd70f5a9..cbcd4634 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18605&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18605&range=02-03 Stats: 4 lines in 1 file changed: 0 ins; 2 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/18605.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18605/head:pull/18605 PR: https://git.openjdk.org/jdk/pull/18605 From mli at openjdk.org Fri Apr 26 14:25:12 2024 From: mli at openjdk.org (Hamlin Li) Date: Fri, 26 Apr 2024 14:25:12 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v3] In-Reply-To: References: <0cUurmXlMJ_B66Wy1umd2n4r9ve7_Q4WOU0ffMd8s5Y=.bbc93b65-382c-4139-aaec-cb835d94a06e@github.com> Message-ID: On Thu, 11 Apr 2024 10:36:03 GMT, Hamlin Li wrote: >> Hi, >> Can you help to review the patch? >> This pr is based on previous work and discussion in [pr 16234](https://github.com/openjdk/jdk/pull/16234), [pr 18294](https://github.com/openjdk/jdk/pull/18294). >> >> Compared with previous prs, the major change in this pr is to integrate the source of sleef (for the steps, please check `src/jdk.incubator.vector/linux/native/libvectormath/README`), rather than depends on external sleef things (header or lib) at build or run time. >> Besides of this change, also modify the previous changes accordingly, e.g. remove some uncessary files or changes especially in make dir of jdk. >> >> Besides of the code changes, one important task is to handle the legal process. >> >> Thanks! >> >> ## Performance >> NOTE: >> * `Src` means implementation in this pr, i.e. without depenency on external sleef. 
>> * `Disabled` means disable intrinsics by `-XX:-UseVectorStubs` >> * `system_sleef` means implementation in [previous pr 18294](https://github.com/openjdk/jdk/pull/18294), i.e. build and run jdk with depenency on external sleef. >> >> Basically, the perf data below shows that >> * this implementation has better performance than previous version in [pr 18294](https://github.com/openjdk/jdk/pull/18294), >> * and both sleef versions has much better performance compared with non-sleef version. >> >> |Benchmark |(size)|Src |Units|system_sleef|(system_sleef-Src)/Src|Diabled |(Disable-Src)/Src| >> |------------------------------|------|---------|-----|------------|----------------------|---------|-----------------| >> |3472:Double128Vector.ACOS |1024 |8546.842 |ns/op|8516.007 |-0.004 |16799.273|0.966 | >> |3473:Double128Vector.ASIN |1024 |6864.656 |ns/op|6987.328 |0.018 |16602.442|1.419 | >> |3474:Double128Vector.ATAN |1024 |11489.255|ns/op|12261.800 |0.067 |26329.320|1.292 | >> |3475:Double128Vector.ATAN2 |1024 |16661.170|ns/op|17234.472 |0.034 |42084.100|1.526 | >> |3476:Double128Vector.CBRT |1024 |18999.387|ns/op|20298.458 |0.068 |35998.688|0.895 | >> |3477:Double128Vector.COS |1024 |14081.857|ns/op|14846.117 |0.054 |24420.692|0.734 | >> |3478:Double128Vector.COSH |1024 |12202.306|ns/op|12237.772 |0.003 |21343.863|0.749 | >> |3479:Double128Vector.EXP |1024 |4553.108 |ns/op|4777.638 ... > > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > fix performance issue Based on these 2 pr (https://github.com/shibatch/sleef/pull/537, https://github.com/shibatch/sleef/pull/536), there is no necessary code change in sleef files anymore. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18605#issuecomment-2079495537 From jsjolen at openjdk.org Fri Apr 26 14:46:08 2024 From: jsjolen at openjdk.org (Johan Sjölen) Date: Fri, 26 Apr 2024 14:46:08 GMT Subject: RFR: 8331193: Return references when possible in GrowableArray [v3] In-Reply-To: References: Message-ID: > Hi, > > This PR introduces the possibility of using references more often when using GrowableArray, where as previously this was only possible when using the `at()` method. This lets us avoid copying and redundant method calls and makes the API more streamlined. After the patch, we can use `at_grow` just like `at` works. The same goes for `top`, `first`, and `last`. > > > Some example code: > ```c++ > // Before this patch this worked: > GrowableArray<int> arr(8,8,-1); // Pre-fill with 8 -1s > int& x = arr.at(7); > if (x == -1) { > x = 2; > } > assert(arr.at(7) == 2, "this holds"); > // but this was forbidden > int& x = arr.at_grow(9, -1); // Compilation error! at_grow returns E, not E& > // so we had to do > int x = arr.at_grow(9, -1); > if (x == -1) { > arr.at_put(9, 2); > } > > > Thanks. Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: Small mistakes - FIXED!
------------- Changes: - all: https://git.openjdk.org/jdk/pull/18975/files - new: https://git.openjdk.org/jdk/pull/18975/files/3774d481..b9431198 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18975&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18975&range=01-02 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/18975.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18975/head:pull/18975 PR: https://git.openjdk.org/jdk/pull/18975 From dnsimon at openjdk.org Fri Apr 26 16:53:05 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Fri, 26 Apr 2024 16:53:05 GMT Subject: RFR: 8331208: Memory stress test that checks OutOfMemoryError stack trace fails Message-ID: This pull request mitigates failures in memory stress tests that check the stack trace of an `OutOfMemoryError` for certain expected entries. The stack trace of an OOME will [not be allocated once all preallocated OOMEs are used up](https://github.com/openjdk/jdk/blob/3d5eeac3a38ece4a23ea6da2dfe5939d64e81cea/src/hotspot/share/memory/universe.cpp#L722). If the only heap allocations performed in stressful conditions are those of the stress test, then the [4 preallocated OOMEs](https://github.com/openjdk/jdk/blob/f1d0e715b67e2ca47b525069d8153abbb33f75b9/src/hotspot/share/runtime/globals.hpp#L800) would be sufficient. However, it's possible for VM internal allocations to also occur during stressful conditions, especially in `-Xcomp` mode. For example, [CompileBroker::compile_method](https://github.com/openjdk/jdk/blob/3d5eeac3a38ece4a23ea6da2dfe5939d64e81cea/src/hotspot/share/compiler/compileBroker.cpp#L1399) will try to resolve the string constants in the constant pool of the method about to be compiled. 
This can fail as shown here: V [jvm.dll+0x62c23a] Exceptions::_throw+0x11a (exceptions.cpp:168) V [jvm.dll+0x62d85b] Exceptions::_throw_oop+0xab (exceptions.cpp:140) V [jvm.dll+0xbbce78] MemAllocator::Allocation::check_out_of_memory+0x208 (memAllocator.cpp:138) V [jvm.dll+0xbbcac8] MemAllocator::allocate+0x158 (memAllocator.cpp:377) V [jvm.dll+0x79bd05] InstanceKlass::allocate_instance+0x95 (instanceKlass.cpp:1509) V [jvm.dll+0x7ddeed] java_lang_String::basic_create+0x9d (javaClasses.cpp:273) V [jvm.dll+0x7e43c0] java_lang_String::create_from_unicode+0x60 (javaClasses.cpp:291) V [jvm.dll+0xdb91a5] StringTable::do_intern+0xb5 (stringTable.cpp:379) V [jvm.dll+0xdba9f2] StringTable::intern+0x1b2 (stringTable.cpp:368) V [jvm.dll+0xdbaaa6] StringTable::intern+0x86 (stringTable.cpp:328) V [jvm.dll+0x51c8b1] ConstantPool::string_at_impl+0x1d1 (constantPool.cpp:1251) V [jvm.dll+0x51b95b] ConstantPool::resolve_string_constants_impl+0xeb (constantPool.cpp:800) V [jvm.dll+0x4f2f8d] CompileBroker::compile_method+0x31d (compileBroker.cpp:1395) V [jvm.dll+0x4f3474] CompileBroker::compile_method+0xc4 (compileBroker.cpp:1348) These internal allocations can occur before the allocations of the test and thus use up the pre-allocated OOMEs. As a result, the OOMEs triggered by the stress test may end up throwing the [default, shared OOME instance](https://github.com/openjdk/jdk/blob/3d5eeac3a38ece4a23ea6da2dfe5939d64e81cea/src/hotspot/share/memory/universe.cpp#L722) that have no stack trace. This PR mitigates this by introducing a scope (see [`SandboxedOOMEMark`](https://github.com/openjdk/jdk/pull/18925/files#diff-8656914dbf640348409a4acb38eb0cc179a93a74d049d4fe62d79d252d13cdacR124)) in which a failed heap allocation results in the shared, stacktrace-less OOME instance being thrown. This scope is used for guarding VM internal allocations where an OOME will not be propagated to user code. 
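A rough sketch of how such a scope could behave, under stated assumptions: `FakeThread`, `ScopedOOMEMark`, `OOMEKind` and `on_allocation_failure` are hypothetical names invented for illustration, and the pool is modeled as a plain counter. It is not the code in the PR; it only shows an RAII mark that remembers the enclosing mark so scopes can nest, and an allocation-failure path that leaves the preallocated pool untouched while a mark is active.

```cpp
#include <cassert>

class ScopedOOMEMark;

// Hypothetical stand-in for a VM thread; holds the innermost active mark.
struct FakeThread {
  ScopedOOMEMark* current_mark = nullptr;
};

// Hypothetical RAII scope: while active, allocation failures on the thread
// are expected to use the shared, stack-trace-less OOME.
class ScopedOOMEMark {
  FakeThread* _thread;
  ScopedOOMEMark* _outer;  // enclosing mark, restored on scope exit

 public:
  explicit ScopedOOMEMark(FakeThread* thread) : _thread(thread), _outer(nullptr) {
    if (_thread != nullptr) {
      _outer = _thread->current_mark;
      _thread->current_mark = this;
    }
  }
  ~ScopedOOMEMark() {
    if (_thread != nullptr) _thread->current_mark = _outer;
  }
  ScopedOOMEMark(const ScopedOOMEMark&) = delete;
  ScopedOOMEMark& operator=(const ScopedOOMEMark&) = delete;
};

enum class OOMEKind { WithStackTrace, SharedNoStackTrace };

// Decide which kind of OOME a failed allocation produces. Inside a mark the
// preallocated pool is not consumed; outside, one entry is used if available.
inline OOMEKind on_allocation_failure(const FakeThread& thread, int& preallocated_left) {
  if (thread.current_mark != nullptr) return OOMEKind::SharedNoStackTrace;
  if (preallocated_left > 0) {
    --preallocated_left;
    return OOMEKind::WithStackTrace;
  }
  return OOMEKind::SharedNoStackTrace;
}
```

In this model, a failure inside the scope never reduces the number of stack-trace-capable errors left for code running outside it, which is the property the stress tests depend on.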
In addition, JVMTI "resource exhausted" events are disabled in the scope of an `SandboxedOOMEMark`. ------------- Commit messages: - unconditionally disable JVMTI resource exhausted events in scope of a SandboxedOOMEMark - only support disabling JVMTI events in SandboxedOOMEMark scope - generalized RetryableAllocationMark to SandboxedOOMEMark Changes: https://git.openjdk.org/jdk/pull/18925/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18925&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8331208 Stats: 113 lines in 11 files changed: 54 ins; 40 del; 19 mod Patch: https://git.openjdk.org/jdk/pull/18925.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18925/head:pull/18925 PR: https://git.openjdk.org/jdk/pull/18925 From never at openjdk.org Fri Apr 26 16:53:06 2024 From: never at openjdk.org (Tom Rodriguez) Date: Fri, 26 Apr 2024 16:53:06 GMT Subject: RFR: 8331208: Memory stress test that checks OutOfMemoryError stack trace fails In-Reply-To: References: Message-ID: <7ne3IDyUkDvA3YF2r_MOk5WZ3RK5z90_x_cfUs_3_to=.f1b1c33a-4a45-4e1f-a978-55a0c7ef9d39@github.com> On Tue, 23 Apr 2024 21:11:53 GMT, Doug Simon wrote: > This pull request mitigates failures in memory stress tests that check the stack trace of an `OutOfMemoryError` for certain expected entries. > > The stack trace of an OOME will [not be allocated once all preallocated OOMEs are used up](https://github.com/openjdk/jdk/blob/3d5eeac3a38ece4a23ea6da2dfe5939d64e81cea/src/hotspot/share/memory/universe.cpp#L722). If the only heap allocations performed in stressful conditions are those of the stress test, then the [4 preallocated OOMEs](https://github.com/openjdk/jdk/blob/f1d0e715b67e2ca47b525069d8153abbb33f75b9/src/hotspot/share/runtime/globals.hpp#L800) would be sufficient. However, it's possible for VM internal allocations to also occur during stressful conditions, especially in `-Xcomp` mode. 
For example, [CompileBroker::compile_method](https://github.com/openjdk/jdk/blob/3d5eeac3a38ece4a23ea6da2dfe5939d64e81cea/src/hotspot/share/compiler/compileBroker.cpp#L1399) will try to resolve the string constants in the constant pool of the method about to be compiled. This can fail as shown here: > > V [jvm.dll+0x62c23a] Exceptions::_throw+0x11a (exceptions.cpp:168) > V [jvm.dll+0x62d85b] Exceptions::_throw_oop+0xab (exceptions.cpp:140) > V [jvm.dll+0xbbce78] MemAllocator::Allocation::check_out_of_memory+0x208 (memAllocator.cpp:138) > V [jvm.dll+0xbbcac8] MemAllocator::allocate+0x158 (memAllocator.cpp:377) > V [jvm.dll+0x79bd05] InstanceKlass::allocate_instance+0x95 (instanceKlass.cpp:1509) > V [jvm.dll+0x7ddeed] java_lang_String::basic_create+0x9d (javaClasses.cpp:273) > V [jvm.dll+0x7e43c0] java_lang_String::create_from_unicode+0x60 (javaClasses.cpp:291) > V [jvm.dll+0xdb91a5] StringTable::do_intern+0xb5 (stringTable.cpp:379) > V [jvm.dll+0xdba9f2] StringTable::intern+0x1b2 (stringTable.cpp:368) > V [jvm.dll+0xdbaaa6] StringTable::intern+0x86 (stringTable.cpp:328) > V [jvm.dll+0x51c8b1] ConstantPool::string_at_impl+0x1d1 (constantPool.cpp:1251) > V [jvm.dll+0x51b95b] ConstantPool::resolve_string_constants_impl+0xeb (constantPool.cpp:800) > V [jvm.dll+0x4f2f8d] CompileBroker::compile_method+0x31d (compileBroker.cpp:1395) > V [jvm.dll+0x4f3474] CompileBroker::compile_method+0xc4 (compileBroker.cpp:1348) > > These internal allocations can occur before the allocations of the test and thus use up the pre-allocated OOMEs. As a result, the OOMEs triggered by the stress test may end up throwing the [default, shared OOME instance](https://github.com/openjdk/jdk/blob/3d5eeac3a38ece4a23ea6da2dfe5939d64e81cea/src/hotspot/... 
src/hotspot/share/gc/shared/memAllocator.hpp line 121: > 119: }; > 120: > 121: // Manages a scope where a failed heap allocation results in I think you need to document why disable_events exists as well, mainly why you would pick one or the other value. Or you might hide from the default constructor and only make it available to the subclass. It's not really something we expect normal users of SandboxedOOMEMark to set to true. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18925#discussion_r1579966101 From dnsimon at openjdk.org Fri Apr 26 16:53:07 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Fri, 26 Apr 2024 16:53:07 GMT Subject: RFR: 8331208: Memory stress test that checks OutOfMemoryError stack trace fails In-Reply-To: <7ne3IDyUkDvA3YF2r_MOk5WZ3RK5z90_x_cfUs_3_to=.f1b1c33a-4a45-4e1f-a978-55a0c7ef9d39@github.com> References: <7ne3IDyUkDvA3YF2r_MOk5WZ3RK5z90_x_cfUs_3_to=.f1b1c33a-4a45-4e1f-a978-55a0c7ef9d39@github.com> Message-ID: On Thu, 25 Apr 2024 18:34:15 GMT, Tom Rodriguez wrote: >> This pull request mitigates failures in memory stress tests that check the stack trace of an `OutOfMemoryError` for certain expected entries. >> >> The stack trace of an OOME will [not be allocated once all preallocated OOMEs are used up](https://github.com/openjdk/jdk/blob/3d5eeac3a38ece4a23ea6da2dfe5939d64e81cea/src/hotspot/share/memory/universe.cpp#L722). If the only heap allocations performed in stressful conditions are those of the stress test, then the [4 preallocated OOMEs](https://github.com/openjdk/jdk/blob/f1d0e715b67e2ca47b525069d8153abbb33f75b9/src/hotspot/share/runtime/globals.hpp#L800) would be sufficient. However, it's possible for VM internal allocations to also occur during stressful conditions, especially in `-Xcomp` mode. 
For example, [CompileBroker::compile_method](https://github.com/openjdk/jdk/blob/3d5eeac3a38ece4a23ea6da2dfe5939d64e81cea/src/hotspot/share/compiler/compileBroker.cpp#L1399) will try to resolve the string constants in the constant pool of the method about to be compiled. This can fail as shown here: >> >> V [jvm.dll+0x62c23a] Exceptions::_throw+0x11a (exceptions.cpp:168) >> V [jvm.dll+0x62d85b] Exceptions::_throw_oop+0xab (exceptions.cpp:140) >> V [jvm.dll+0xbbce78] MemAllocator::Allocation::check_out_of_memory+0x208 (memAllocator.cpp:138) >> V [jvm.dll+0xbbcac8] MemAllocator::allocate+0x158 (memAllocator.cpp:377) >> V [jvm.dll+0x79bd05] InstanceKlass::allocate_instance+0x95 (instanceKlass.cpp:1509) >> V [jvm.dll+0x7ddeed] java_lang_String::basic_create+0x9d (javaClasses.cpp:273) >> V [jvm.dll+0x7e43c0] java_lang_String::create_from_unicode+0x60 (javaClasses.cpp:291) >> V [jvm.dll+0xdb91a5] StringTable::do_intern+0xb5 (stringTable.cpp:379) >> V [jvm.dll+0xdba9f2] StringTable::intern+0x1b2 (stringTable.cpp:368) >> V [jvm.dll+0xdbaaa6] StringTable::intern+0x86 (stringTable.cpp:328) >> V [jvm.dll+0x51c8b1] ConstantPool::string_at_impl+0x1d1 (constantPool.cpp:1251) >> V [jvm.dll+0x51b95b] ConstantPool::resolve_string_constants_impl+0xeb (constantPool.cpp:800) >> V [jvm.dll+0x4f2f8d] CompileBroker::compile_method+0x31d (compileBroker.cpp:1395) >> V [jvm.dll+0x4f3474] CompileBroker::compile_method+0xc4 (compileBroker.cpp:1348) >> >> These internal allocations can occur before the allocations of the test and thus use up the pre-allocated OOMEs. As a result, the OOMEs triggered by the stress test may end up throwing the [default, shared OOME instance](https://github.com/openjdk/jdk/blob/3d5eeac3a38ec... > > src/hotspot/share/gc/shared/memAllocator.hpp line 121: > >> 119: }; >> 120: >> 121: // Manages a scope where a failed heap allocation results in > > I think you need to document why disable_events exists as well, mainly why you would pick one or the other value. 
Or you might hide from the default constructor and only make it available to the subclass. It's not really something we expect normal users of SandboxedOOMEMark to set to true. I've removed `disable_events` altogether. For OOMEs that will not propagate to user code, it also doesn't seem like they should be reported to JVMTI. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18925#discussion_r1580124484 From dnsimon at openjdk.org Fri Apr 26 16:53:08 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Fri, 26 Apr 2024 16:53:08 GMT Subject: RFR: 8331208: Memory stress test that checks OutOfMemoryError stack trace fails In-Reply-To: References: Message-ID: On Tue, 23 Apr 2024 21:11:53 GMT, Doug Simon wrote: > This pull request mitigates failures in memory stress tests that check the stack trace of an `OutOfMemoryError` for certain expected entries. > > The stack trace of an OOME will [not be allocated once all preallocated OOMEs are used up](https://github.com/openjdk/jdk/blob/3d5eeac3a38ece4a23ea6da2dfe5939d64e81cea/src/hotspot/share/memory/universe.cpp#L722). If the only heap allocations performed in stressful conditions are those of the stress test, then the [4 preallocated OOMEs](https://github.com/openjdk/jdk/blob/f1d0e715b67e2ca47b525069d8153abbb33f75b9/src/hotspot/share/runtime/globals.hpp#L800) would be sufficient. However, it's possible for VM internal allocations to also occur during stressful conditions, especially in `-Xcomp` mode. For example, [CompileBroker::compile_method](https://github.com/openjdk/jdk/blob/3d5eeac3a38ece4a23ea6da2dfe5939d64e81cea/src/hotspot/share/compiler/compileBroker.cpp#L1399) will try to resolve the string constants in the constant pool of the method about to be compiled. 
This can fail as shown here: > > V [jvm.dll+0x62c23a] Exceptions::_throw+0x11a (exceptions.cpp:168) > V [jvm.dll+0x62d85b] Exceptions::_throw_oop+0xab (exceptions.cpp:140) > V [jvm.dll+0xbbce78] MemAllocator::Allocation::check_out_of_memory+0x208 (memAllocator.cpp:138) > V [jvm.dll+0xbbcac8] MemAllocator::allocate+0x158 (memAllocator.cpp:377) > V [jvm.dll+0x79bd05] InstanceKlass::allocate_instance+0x95 (instanceKlass.cpp:1509) > V [jvm.dll+0x7ddeed] java_lang_String::basic_create+0x9d (javaClasses.cpp:273) > V [jvm.dll+0x7e43c0] java_lang_String::create_from_unicode+0x60 (javaClasses.cpp:291) > V [jvm.dll+0xdb91a5] StringTable::do_intern+0xb5 (stringTable.cpp:379) > V [jvm.dll+0xdba9f2] StringTable::intern+0x1b2 (stringTable.cpp:368) > V [jvm.dll+0xdbaaa6] StringTable::intern+0x86 (stringTable.cpp:328) > V [jvm.dll+0x51c8b1] ConstantPool::string_at_impl+0x1d1 (constantPool.cpp:1251) > V [jvm.dll+0x51b95b] ConstantPool::resolve_string_constants_impl+0xeb (constantPool.cpp:800) > V [jvm.dll+0x4f2f8d] CompileBroker::compile_method+0x31d (compileBroker.cpp:1395) > V [jvm.dll+0x4f3474] CompileBroker::compile_method+0xc4 (compileBroker.cpp:1348) > > These internal allocations can occur before the allocations of the test and thus use up the pre-allocated OOMEs. As a result, the OOMEs triggered by the stress test may end up throwing the [default, shared OOME instance](https://github.com/openjdk/jdk/blob/3d5eeac3a38ece4a23ea6da2dfe5939d64e81cea/src/hotspot/... 
src/hotspot/share/gc/shared/memAllocator.hpp line 133: > 131: SandboxedOOMEMark(JavaThread* thread, bool disable_events=false) { > 132: if (thread != nullptr) { > 133: _outer = thread->sandboxed_oome_mark(); Need for supporting recursion is shown by this stack trace: V [libjvm.dylib+0x4c9fe4] CompileBroker::compile_method(methodHandle const&, int, int, methodHandle const&, int, CompileTask::CompileReason, DirectiveSet*, JavaThread*)+0x6b0 V [libjvm.dylib+0x4c98d0] CompileBroker::compile_method(methodHandle const&, int, int, methodHandle const&, int, CompileTask::CompileReason, JavaThread*)+0xcc V [libjvm.dylib+0x4a7434] CompilationPolicy::event(methodHandle const&, methodHandle const&, int, int, CompLevel, nmethod*, JavaThread*)+0x2e0 V [libjvm.dylib+0x355c14] Runtime1::counter_overflow(JavaThread*, int, Method*)+0x268 v ~RuntimeStub::counter_overflow Runtime1 stub 0x0000000116276c3c J 4004 c1 jdk.internal.loader.URLClassPath.getLoader(I)Ljdk/internal/loader/URLClassPath$Loader; java.base at 23-internal (194 bytes) @ 0x000000010f55a7bc [0x000000010f558cc0+0x0000000000001afc] J 3651 jvmci jdk.internal.loader.URLClassPath.getResource(Ljava/lang/String;Z)Ljdk/internal/loader/Resource; java.base at 23-internal (74 bytes) @ 0x0000000116ba628c [0x0000000116ba6200+0x000000000000008c] J 3649 jvmci jdk.internal.loader.BuiltinClassLoader.findClassOnClassPathOrNull(Ljava/lang/String;)Ljava/lang/Class; java.base at 23-internal (64 bytes) @ 0x0000000116ba4ffc [0x0000000116ba4c40+0x00000000000003bc] J 3640 jvmci jdk.internal.loader.BuiltinClassLoader.loadClassOrNull(Ljava/lang/String;Z)Ljava/lang/Class; java.base at 23-internal (143 bytes) @ 0x0000000116ba26c0 [0x0000000116ba2440+0x0000000000000280] J 3638 jvmci jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class; java.base at 23-internal (40 bytes) @ 0x0000000116ba17c0 [0x0000000116ba1680+0x0000000000000140] J 3636 jvmci 
java.lang.ClassLoader.loadClass(Ljava/lang/String;)Ljava/lang/Class; java.base at 23-internal (7 bytes) @ 0x0000000116ba137c [0x0000000116ba1300+0x000000000000007c] v ~StubRoutines::call_stub 0x00000001160f0190 V [libjvm.dylib+0x856918] JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, JavaThread*)+0x420 V [libjvm.dylib+0x855618] JavaCalls::call_virtual(JavaValue*, Klass*, Symbol*, Symbol*, JavaCallArguments*, JavaThread*)+0x218 V [libjvm.dylib+0x8558b0] JavaCalls::call_virtual(JavaValue*, Handle, Klass*, Symbol*, Symbol*, Handle, JavaThread*)+0x70 V [libjvm.dylib+0x1024ca8] SystemDictionary::load_instance_class_impl(Symbol*, Handle, JavaThread*)+0x114 V [libjvm.dylib+0x10226c8] SystemDictionary::load_instance_class(Symbol*, Handle, JavaThread*)+0x28 V [libjvm.dylib+0x1021a7c] SystemDictionary::resolve_instance_class_or_null(Symbol*, Handle, Handle, JavaThread*)+0x69c V [libjvm.dylib+0xf5ea64] SignatureStream::as_klass(Handle, Handle, SignatureStream::FailureMode, JavaThread*)+0x60 V [libjvm.dylib+0xd57894] Method::load_signature_classes(methodHandle const&, JavaThread*)+0xf0 V [libjvm.dylib+0x4c9c8c] CompileBroker::compile_method(methodHandle const&, int, int, methodHandle const&, int, CompileTask::CompileReason, DirectiveSet*, JavaThread*)+0x358 V [libjvm.dylib+0x4c98d0] CompileBroker::compile_method(methodHandle const&, int, int, methodHandle const&, int, CompileTask::CompileReason, JavaThread*)+0xcc V [libjvm.dylib+0x4a7434] CompilationPolicy::event(methodHandle const&, methodHandle const&, int, int, CompLevel, nmethod*, JavaThread*)+0x2e0 V [libjvm.dylib+0x355c14] Runtime1::counter_overflow(JavaThread*, int, Method*)+0x268 v ~RuntimeStub::counter_overflow Runtime1 stub 0x0000000116276c3c J 4003 c1 CountUppercase.identity(ILCountUppercase$Unloaded;)I (2 bytes) @ 0x000000010f558874 [0x000000010f5587c0+0x00000000000000b4] j CountUppercase.main([Ljava/lang/String;)V+61 v ~StubRoutines::call_stub 0x00000001160f0190 V 
[libjvm.dylib+0x856918] JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, JavaThread*)+0x420 V [libjvm.dylib+0x940bf8] jni_invoke_static(JNIEnv_*, JavaValue*, _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, JavaThread*)+0x14c V [libjvm.dylib+0x9475c0] jni_CallStaticVoidMethod+0x16c C [libjli.dylib+0xa260] invokeStaticMainWithArgs+0x84 C [libjli.dylib+0xaa9c] JavaMain+0x588 C [libjli.dylib+0xd4a0] ThreadJavaMain+0xc Note the recursive call to `CompileBroker::compile_method` (which uses a `SandboxedOOMEMark`). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18925#discussion_r1579726535 From aph at openjdk.org Fri Apr 26 17:39:52 2024 From: aph at openjdk.org (Andrew Haley) Date: Fri, 26 Apr 2024 17:39:52 GMT Subject: RFR: 8331098: [Aarch64] Fix crash in Arrays.equals() intrinsic with -CCP [v2] In-Reply-To: References: <_HzINQ0atD5BmBbIZ6A4A5y1wNvwsvrBxAiaz2Mk9rY=.43cde0ae-1179-4708-afa1-fda64039d722@github.com> <43G3lM1SyM9s-2uC4qO2gDRwPNoms82BK4NIocUTWvQ=.de9e2aa0-b277-426e-971c-33d7684fb1ae@github.com> <5o7rt_EGNySfdgjVjZ9ny4DxQ0fctvYUD_rUkJWEYA8=.b26c137a-1c6c-4ff9-8a5a-1e185cdcfba9@github.com> Message-ID: On Fri, 26 Apr 2024 13:18:46 GMT, Roman Kennke wrote: > This is not about an optimization, but about a correctness issue. The loop(s) have been written under the assumption that they can read full words, which is true if we start at a word boundary. However, if we don't, then we can attempt an (unaligned) read beyond the array, and if that memory is outside of the heap and unmapped, then we would crash. Note that this currently only happens when running with -UseCompressedClassPointers which almost nobody does. We encountered it with Lilliput, which changes array layout in a similar way. OK, I see. Fair enough. 
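The over-read Roman describes can be illustrated with a little arithmetic. The sketch below is a model under explicit assumptions, not the intrinsic itself: objects are assumed to end on an 8-byte boundary, and the comparison loop is assumed to read whole 8-byte words starting at the first payload byte. `align_up` and `bytes_overread` are hypothetical helpers, and the offsets used in the usage note are illustrative values only.

```cpp
#include <cassert>
#include <cstdint>

const std::int64_t kWordSize = 8;  // assumed word size in bytes

// Round value up to the next multiple of alignment.
inline std::int64_t align_up(std::int64_t value, std::int64_t alignment) {
  return (value + alignment - 1) / alignment * alignment;
}

// Number of bytes a word-at-a-time loop would read past the end of the
// object, given the payload's byte offset within the object and its length.
inline std::int64_t bytes_overread(std::int64_t payload_offset,
                                   std::int64_t length_in_bytes) {
  const std::int64_t object_end = align_up(payload_offset + length_in_bytes, kWordSize);
  const std::int64_t words_read = (length_in_bytes + kWordSize - 1) / kWordSize;
  const std::int64_t read_end   = payload_offset + words_read * kWordSize;
  return read_end > object_end ? read_end - object_end : 0;
}
```

With a word-aligned payload offset such as 16, the result is always 0 in this model. With a payload starting at offset 20 and 12 bytes of data, the second word read covers bytes 28 to 36 while the object ends at 32, so 4 bytes are read past the object; if that memory is unmapped, the read faults.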
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18948#discussion_r1581333550 From gziemski at openjdk.org Fri Apr 26 17:46:03 2024 From: gziemski at openjdk.org (Gerard Ziemski) Date: Fri, 26 Apr 2024 17:46:03 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v49] In-Reply-To: References: <1cKD_eCdTb8AmNQwA9T4GFK0xu_CjJeABePgatn8xSY=.ec58f99d-bcd6-4e92-87a4-d1e49d33f4af@github.com> Message-ID: On Thu, 25 Apr 2024 10:43:22 GMT, Johan Sjölen wrote: >> src/hotspot/share/nmt/memTracker.cpp line 71: >> >>> 69: if (!MallocTracker::initialize(level) || >>> 70: !VirtualMemoryTracker::initialize(level) || >>> 71: !MemoryFileTracker::Instance::initialize(level) || >> >> Is there a way to hide the `instance` so that we could do: >> >> `!MemoryFileTracker::initialize(level)` >> >> not >> >> `!MemoryFileTracker::Instance::initialize(level)` >> >> just like the other calls here? The instance is not needed here and just an implementation detail. > > We could invert the relationship such that the outer class is the `AllStatic` class and the inner class is the allocatable class. I'll look into it at a later stage as it's all a big renaming. > > Personally, I don't mind the `::Instance` nomenclature to indicate that "this is the global instance that we're accessing". As long as we keep away from static, global singletons that we can't make many instances like VirtualMemoryTracker and MallocTracker are written, I'm a happy goose. I agree with you actually, but the `Instance` jumps out at me and makes me wonder why we decided to use it, compared to the others, that are happy to be static classes. We should have it all done the same way. If you'd like to use `Instance`, then `VirtualMemoryTracker` and `MallocTracker` should use one as well (at some point later).
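For readers following the naming debate, the inverted layout mentioned in the quoted reply might look roughly like this. All names here (`FileTrackerFacade`, `Tracker`, `global`) are hypothetical and the state is reduced to a single integer; it is a sketch of the idea, not the NMT code. The outer class is never instantiated and forwards to one hidden global object, while the nested class stays allocatable so that additional trackers can still be created.

```cpp
#include <cassert>

// Hypothetical names throughout; this is NOT the NMT implementation.
class FileTrackerFacade {
 public:
  // The allocatable tracker: any number of these can exist.
  class Tracker {
    int _level = -1;

   public:
    bool initialize(int level) {
      _level = level;
      return level >= 0;  // negative levels are treated as failure in this sketch
    }
    int level() const { return _level; }
  };

  FileTrackerFacade() = delete;  // all-static: never instantiated

  // Callers write FileTrackerFacade::initialize(level), with no ::Instance.
  static bool initialize(int level) { return global().initialize(level); }
  static int level() { return global().level(); }

 private:
  // The single global tracker, hidden from callers.
  static Tracker& global() {
    static Tracker tracker;
    return tracker;
  }
};
```

The call site then reads like the other trackers' `initialize(level)` calls, at the cost of the rename described in the reply.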
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1581338808 From cjplummer at openjdk.org Fri Apr 26 20:03:50 2024 From: cjplummer at openjdk.org (Chris Plummer) Date: Fri, 26 Apr 2024 20:03:50 GMT Subject: RFR: 8330969: scalability issue with loaded JVMTI agent [v2] In-Reply-To: References: Message-ID: On Fri, 26 Apr 2024 07:45:50 GMT, Serguei Spitsyn wrote: >> This is a fix of the following JVMTI scalability issue. A closed benchmark with millions of virtual threads shows 3X-4X overhead when a JVMTI agent has been loaded. For instance, this is observable when an app is executed under control of the Oracle Studio `collect` utility. >> For performance analysis, experiments and numbers, please, see the comment below this description. >> >> The fix is to replace the global counter `_VTMS_transition_count` with the mark bit `_VTMS_transition_mark` in each `JavaThread`'. >> >> Testing: >> - Tested with mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: fixed minor issues: renamed function, corrected comment, removed typo in assert src/hotspot/share/prims/jvmtiThreadState.cpp line 366: > 364: attempts--; > 365: } > 366: DEBUG_ONLY(if (attempts == 0) break;) Previously `_VTMS_transition_count` considered all threads at the same time. Now you are iterating through the threads and looking at a flag in each one. Is it guaranteed that once the `_VTMS_transition_mark` flag has been verified not to be set in a thread it won't get set while still iterating in the threads loop? src/hotspot/share/prims/jvmtiThreadState.cpp line 433: > 431: // Avoid using MonitorLocker on performance critical path, use > 432: // two-level synchronization with lock-free operations on counters. > 433: assert(!thread->VTMS_transition_mark(), "sanity check"); The "counters" comment needs to be updated. 
src/hotspot/share/prims/jvmtiThreadState.cpp line 456: > 454: // Slow path: undo unsuccessful optimistic counter incrementation. > 455: // It can cause an extra waiting cycle for VTMS transition disablers. > 456: thread->set_VTMS_transition_mark(false); The "optimistic counter incrementation" comment needs updating. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18937#discussion_r1581460754 PR Review Comment: https://git.openjdk.org/jdk/pull/18937#discussion_r1581463641 PR Review Comment: https://git.openjdk.org/jdk/pull/18937#discussion_r1581462878 From kvn at openjdk.org Fri Apr 26 21:20:55 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 26 Apr 2024 21:20:55 GMT Subject: RFR: 8331087: Move immutable nmethod data from CodeCache Message-ID: Move immutable nmethod's data from CodeCache to C heap. It includes `dependencies, nul_chk_table, handler_table, scopes_pcs, scopes_data, speculations, jvmci_data`. It amounts for about 30% (optimized VM) of space in CodeCache. Use HotSpot's `os::malloc()` to allocate memory in C heap for immutable nmethod's data. Bail out compilation if allocation failed. Shuffle fields order and change some fields size from 4 to 2 bytes to avoid nmethod's header size increase. 
Tested tier1-5, stress,xcomp ------------- Commit messages: - 8331087: Move immutable nmethod data from CodeCache Changes: https://git.openjdk.org/jdk/pull/18984/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18984&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8331087 Stats: 290 lines in 7 files changed: 149 ins; 31 del; 110 mod Patch: https://git.openjdk.org/jdk/pull/18984.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18984/head:pull/18984 PR: https://git.openjdk.org/jdk/pull/18984 From kvn at openjdk.org Fri Apr 26 21:47:58 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 26 Apr 2024 21:47:58 GMT Subject: RFR: 8331087: Move immutable nmethod data from CodeCache In-Reply-To: References: Message-ID: On Fri, 26 Apr 2024 21:16:03 GMT, Vladimir Kozlov wrote: > Move immutable nmethod's data from CodeCache to C heap. It includes `dependencies, nul_chk_table, handler_table, scopes_pcs, scopes_data, speculations, jvmci_data`. It amounts for about 30% (optimized VM) of space in CodeCache. > > Use HotSpot's `os::malloc()` to allocate memory in C heap for immutable nmethod's data. Bail out compilation if allocation failed. > > Shuffle fields order and change some fields size from 4 to 2 bytes to avoid nmethod's header size increase. > > Tested tier1-5, stress,xcomp src/hotspot/share/code/nmethod.cpp line 117: > 115: result = static_cast(thing); \ > 116: assert(static_cast(result) == thing, "failed: %d != %d", static_cast(result), thing); > 117: I replaced `checked_cast<>()` with this macro because of next issues: - The existing assert points to `utilities/checkedCast.hpp` file where this method is located and not where failed cast. It does not help when it is used several times in one method (for example, in `nmethod()` constructors). 
- The existing assert does not print values src/hotspot/share/code/nmethod.cpp line 1324: > 1322: > 1323: // native wrapper does not have read-only data but we need unique not null address > 1324: _immutable_data = data_end(); I can't use nullptr because the VM expects a non-null address when it checks, for example, `dependencies_begin()`, even though the sizes are 0. I used `data_end()` instead of nullptr in other places too. src/hotspot/share/code/nmethod.hpp line 583: > 581: int dependencies_size () const { return int( dependencies_end () - dependencies_begin ()); } > 582: int handler_table_size () const { return int( handler_table_end() - handler_table_begin()); } > 583: int nul_chk_table_size () const { return int( nul_chk_table_end() - nul_chk_table_begin()); } Shifted by one space to align the code. test/hotspot/jtreg/compiler/c1/TestLinearScanOrderMain.java line 29: > 27: * @compile TestLinearScanOrder.jasm > 28: * @run main/othervm -Xcomp -XX:+TieredCompilation -XX:TieredStopAtLevel=1 > 29: * -XX:+IgnoreUnrecognizedVMOptions -XX:NMethodSizeLimit=655360 This test caught one `checked_cast<>` issue during development, but only on aarch64. On x64 the test bailed out of compilation before that because the default `NMethodSizeLimit` was not big enough ((64*K)*wordSize = 524288).
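The two complaints about the narrowing check in the review comments above (the failure is reported at the helper's location rather than the caller's, and the values are not printed) can be illustrated with a small standalone sketch. The names `narrow_or_report` and `NARROW_AT_CALLER` are hypothetical and are not HotSpot's `checked_cast` or the macro in the patch; the sketch also reports through a flag instead of asserting, purely so that it can be exercised without aborting.

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical helper: narrows `value` to T and, if the value does not
// round-trip, prints the caller's file and line together with both values.
template <typename T>
T narrow_or_report(int value, const char* file, int line, bool* ok = nullptr) {
  T result = static_cast<T>(value);
  bool fits = static_cast<int>(result) == value;
  if (!fits) {
    std::fprintf(stderr, "%s:%d: narrowing failed: %d != %d\n",
                 file, line, static_cast<int>(result), value);
  }
  if (ok != nullptr) *ok = fits;
  return result;
}

// The macro is expanded at the call site, so __FILE__ and __LINE__ name the
// caller rather than the header that defines the helper.
#define NARROW_AT_CALLER(T, value, ok_ptr) \
  narrow_or_report<T>((value), __FILE__, __LINE__, (ok_ptr))
```

A call such as `NARROW_AT_CALLER(int16_t, offset, &ok)` then points at the line containing the cast, which is what matters when several casts sit in one constructor.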
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18984#discussion_r1581561933 PR Review Comment: https://git.openjdk.org/jdk/pull/18984#discussion_r1581565762 PR Review Comment: https://git.openjdk.org/jdk/pull/18984#discussion_r1581558637 PR Review Comment: https://git.openjdk.org/jdk/pull/18984#discussion_r1581557756 From amenkov at openjdk.org Fri Apr 26 23:06:23 2024 From: amenkov at openjdk.org (Alex Menkov) Date: Fri, 26 Apr 2024 23:06:23 GMT Subject: RFR: 8330852: All callers of JvmtiEnvBase::get_threadOop_and_JavaThread should pass current thread explicitly Message-ID: Some cleanup related to JvmtiEnvBase::get_threadOop_and_JavaThread method Testing: tier1-6 ------------- Commit messages: - fix Changes: https://git.openjdk.org/jdk/pull/18986/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18986&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8330852 Stats: 42 lines in 3 files changed: 3 ins; 10 del; 29 mod Patch: https://git.openjdk.org/jdk/pull/18986.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18986/head:pull/18986 PR: https://git.openjdk.org/jdk/pull/18986 From cjplummer at openjdk.org Fri Apr 26 23:41:06 2024 From: cjplummer at openjdk.org (Chris Plummer) Date: Fri, 26 Apr 2024 23:41:06 GMT Subject: RFR: 8330852: All callers of JvmtiEnvBase::get_threadOop_and_JavaThread should pass current thread explicitly In-Reply-To: References: Message-ID: On Fri, 26 Apr 2024 22:59:43 GMT, Alex Menkov wrote: > Some cleanup related to JvmtiEnvBase::get_threadOop_and_JavaThread method > > Testing: tier1-6 src/hotspot/share/prims/jvmtiEnvBase.cpp line 1976: > 1974: oop thread_obj = nullptr; > 1975: > 1976: jvmtiError err = JvmtiEnvBase::get_threadOop_and_JavaThread(tlh.list(), target, current, &java_thread, &thread_obj); I think a good cleanup would be to also replace `current` with `current_thread`, although I'm not sure how common each are. I see 3 `current` references in this webrev. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18986#discussion_r1581610079 From amenkov at openjdk.org Sat Apr 27 00:04:05 2024 From: amenkov at openjdk.org (Alex Menkov) Date: Sat, 27 Apr 2024 00:04:05 GMT Subject: RFR: 8330852: All callers of JvmtiEnvBase::get_threadOop_and_JavaThread should pass current thread explicitly In-Reply-To: References: Message-ID: On Fri, 26 Apr 2024 23:36:41 GMT, Chris Plummer wrote: >> Some cleanup related to JvmtiEnvBase::get_threadOop_and_JavaThread method >> >> Testing: tier1-6 > > src/hotspot/share/prims/jvmtiEnvBase.cpp line 1976: > >> 1974: oop thread_obj = nullptr; >> 1975: >> 1976: jvmtiError err = JvmtiEnvBase::get_threadOop_and_JavaThread(tlh.list(), target, current, &java_thread, &thread_obj); > > I think a good cleanup would be to also replace `current` with `current_thread`, although I'm not sure how common each are. I see 3 `current` references in this webrev. Looks like in JVMTI `current_thread` is more common (and `current` is usually used in runtime :) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18986#discussion_r1581628063 From dlong at openjdk.org Sat Apr 27 00:05:12 2024 From: dlong at openjdk.org (Dean Long) Date: Sat, 27 Apr 2024 00:05:12 GMT Subject: RFR: 8331087: Move immutable nmethod data from CodeCache In-Reply-To: References: Message-ID: <5wDjPMjtjkFLL6V4UtPQMeLLlIZWyPGmoHf20NvHLRc=.649977ed-329d-4355-8188-71c8d5c7c14f@github.com> On Fri, 26 Apr 2024 21:36:50 GMT, Vladimir Kozlov wrote: >> Move immutable nmethod's data from CodeCache to C heap. It includes `dependencies, nul_chk_table, handler_table, scopes_pcs, scopes_data, speculations, jvmci_data`. It amounts for about 30% (optimized VM) of space in CodeCache. >> >> Use HotSpot's `os::malloc()` to allocate memory in C heap for immutable nmethod's data. Bail out compilation if allocation failed. >> >> Shuffle fields order and change some fields size from 4 to 2 bytes to avoid nmethod's header size increase. 
>> >> Tested tier1-5, stress,xcomp >> >> Our performance testing does not show difference. >> >> Example of updated `-XX:+PrintNMethodStatistics` output is in JBS comment. > > src/hotspot/share/code/nmethod.cpp line 117: > >> 115: result = static_cast(thing); \ >> 116: assert(static_cast(result) == thing, "failed: %d != %d", static_cast(result), thing); >> 117: > > I replaced `checked_cast<>()` with this macro because of next issues: > - The existing assert points to `utilities/checkedCast.hpp` file where this method is located and not where failed cast. It does not help when it is used several times in one method (for example, in `nmethod()` constructors). > - The existing assert does not print values I thought @kimbarrett had a draft PR to address the error reporting issue, but I can't seem to find it. To solve the general problem, I think we need a version of vmassert() that takes `char* file, int lineno` as arguments, and a macro wrapper for checked_cast() that passes `__FILE__` and `__LINE__` from the caller. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18984#discussion_r1581628786 From kvn at openjdk.org Sat Apr 27 00:45:10 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Sat, 27 Apr 2024 00:45:10 GMT Subject: RFR: 8331087: Move immutable nmethod data from CodeCache In-Reply-To: <5wDjPMjtjkFLL6V4UtPQMeLLlIZWyPGmoHf20NvHLRc=.649977ed-329d-4355-8188-71c8d5c7c14f@github.com> References: <5wDjPMjtjkFLL6V4UtPQMeLLlIZWyPGmoHf20NvHLRc=.649977ed-329d-4355-8188-71c8d5c7c14f@github.com> Message-ID: On Sat, 27 Apr 2024 00:02:16 GMT, Dean Long wrote: >> src/hotspot/share/code/nmethod.cpp line 117: >> >>> 115: result = static_cast(thing); \ >>> 116: assert(static_cast(result) == thing, "failed: %d != %d", static_cast(result), thing); >>> 117: >> >> I replaced `checked_cast<>()` with this macro because of next issues: >> - The existing assert points to `utilities/checkedCast.hpp` file where this method is located and not where failed cast.
It does not help when it is used several times in one method (for example, in `nmethod()` constructors). >> - The existing assert does not print values > > I thought @kimbarrett had a draft PR to address the error reporting issue, but I can't seem to find it. To solve the general problem, I think we need a version of vmassert() that takes `char* file, int lineno` as arguments, and a macro wrapper for checked_cast() that passes `__FILE__` and `__LINE__` from the caller. Yes, it would be a perfect separate RFE. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18984#discussion_r1581641784 From dlong at openjdk.org Sat Apr 27 00:51:14 2024 From: dlong at openjdk.org (Dean Long) Date: Sat, 27 Apr 2024 00:51:14 GMT Subject: RFR: 8331087: Move immutable nmethod data from CodeCache In-Reply-To: References: Message-ID: On Fri, 26 Apr 2024 21:16:03 GMT, Vladimir Kozlov wrote: > Move immutable nmethod's data from CodeCache to C heap. It includes `dependencies, nul_chk_table, handler_table, scopes_pcs, scopes_data, speculations, jvmci_data`. It amounts for about 30% (optimized VM) of space in CodeCache. > > Use HotSpot's `os::malloc()` to allocate memory in C heap for immutable nmethod's data. Bail out compilation if allocation failed. > > Shuffle fields order and change some fields size from 4 to 2 bytes to avoid nmethod's header size increase. > > Tested tier1-5, stress,xcomp > > Our performance testing does not show difference. > > Example of updated `-XX:+PrintNMethodStatistics` output is in JBS comment. src/hotspot/share/code/nmethod.cpp line 1332: > 1330: #if INCLUDE_JVMCI > 1331: _speculations_offset = _scopes_data_offset; > 1332: _jvmci_data_offset = _speculations_offset; Why not use 0 for all these?
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18984#discussion_r1581642931 From dlong at openjdk.org Sat Apr 27 00:56:04 2024 From: dlong at openjdk.org (Dean Long) Date: Sat, 27 Apr 2024 00:56:04 GMT Subject: RFR: 8331087: Move immutable nmethod data from CodeCache In-Reply-To: References: Message-ID: On Fri, 26 Apr 2024 21:16:03 GMT, Vladimir Kozlov wrote: > Move immutable nmethod's data from CodeCache to C heap. It includes `dependencies, nul_chk_table, handler_table, scopes_pcs, scopes_data, speculations, jvmci_data`. It amounts for about 30% (optimized VM) of space in CodeCache. > > Use HotSpot's `os::malloc()` to allocate memory in C heap for immutable nmethod's data. Bail out compilation if allocation failed. > > Shuffle fields order and change some fields size from 4 to 2 bytes to avoid nmethod's header size increase. > > Tested tier1-5, stress,xcomp > > Our performance testing does not show difference. > > Example of updated `-XX:+PrintNMethodStatistics` output is in JBS comment. src/hotspot/share/code/nmethod.cpp line 1484: > 1482: // Calculate positive offset as distance between the start of stubs section > 1483: // (which is also the end of instructions section) and the start of the handler. > 1484: CHECKED_CAST(_unwind_handler_offset, int16_t, (_stub_offset - code_offset() - offsets->value(CodeOffsets::UnwindHandler))); Suggestion: int unwind_handler_offset = code_offset() + offsets->value(CodeOffsets::UnwindHandler); CHECKED_CAST(_unwind_handler_offset, int16_t, _stub_offset - unwind_handler_offset); ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18984#discussion_r1581644356 From kvn at openjdk.org Sat Apr 27 01:15:05 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Sat, 27 Apr 2024 01:15:05 GMT Subject: RFR: 8331087: Move immutable nmethod data from CodeCache In-Reply-To: References: Message-ID: On Sat, 27 Apr 2024 00:48:49 GMT, Dean Long wrote: >> Move immutable nmethod's data from CodeCache to C heap. 
It includes `dependencies, nul_chk_table, handler_table, scopes_pcs, scopes_data, speculations, jvmci_data`. It amounts for about 30% (optimized VM) of space in CodeCache. >> >> Use HotSpot's `os::malloc()` to allocate memory in C heap for immutable nmethod's data. Bail out compilation if allocation failed. >> >> Shuffle fields order and change some fields size from 4 to 2 bytes to avoid nmethod's header size increase. >> >> Tested tier1-5, stress,xcomp >> >> Our performance testing does not show difference. >> >> Example of updated `-XX:+PrintNMethodStatistics` output is in JBS comment. > > src/hotspot/share/code/nmethod.cpp line 1332: > >> 1330: #if INCLUDE_JVMCI >> 1331: _speculations_offset = _scopes_data_offset; >> 1332: _jvmci_data_offset = _speculations_offset; > > Why not use 0 for all these? Right. Before all these offsets were assigned to _dependencies_offset which was not 0. But now we can use 0 for all of them since "_dependencies_offset" is 0. > src/hotspot/share/code/nmethod.cpp line 1484: > >> 1482: // Calculate positive offset as distance between the start of stubs section >> 1483: // (which is also the end of instructions section) and the start of the handler. >> 1484: CHECKED_CAST(_unwind_handler_offset, int16_t, (_stub_offset - code_offset() - offsets->value(CodeOffsets::UnwindHandler))); > > Suggestion: > > int unwind_handler_offset = code_offset() + offsets->value(CodeOffsets::UnwindHandler); > CHECKED_CAST(_unwind_handler_offset, int16_t, _stub_offset - unwind_handler_offset); Will do. > src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/code/NMethod.java line 528: > >> 526: private int getScopesDataOffset() { return (int) scopesDataOffsetField .getValue(addr); } >> 527: private int getScopesPCsOffset() { return (int) scopesPCsOffsetField .getValue(addr); } >> 528: private int getDependenciesOffset() { return (int) 0; } > > Suggestion: > > > No longer used. Will remove. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18984#discussion_r1581667051 PR Review Comment: https://git.openjdk.org/jdk/pull/18984#discussion_r1581668080 PR Review Comment: https://git.openjdk.org/jdk/pull/18984#discussion_r1581679317 From dlong at openjdk.org Sat Apr 27 01:15:05 2024 From: dlong at openjdk.org (Dean Long) Date: Sat, 27 Apr 2024 01:15:05 GMT Subject: RFR: 8331087: Move immutable nmethod data from CodeCache In-Reply-To: References: Message-ID: On Fri, 26 Apr 2024 21:16:03 GMT, Vladimir Kozlov wrote: > Move immutable nmethod's data from CodeCache to C heap. It includes `dependencies, nul_chk_table, handler_table, scopes_pcs, scopes_data, speculations, jvmci_data`. It amounts for about 30% (optimized VM) of space in CodeCache. > > Use HotSpot's `os::malloc()` to allocate memory in C heap for immutable nmethod's data. Bail out compilation if allocation failed. > > Shuffle fields order and change some fields size from 4 to 2 bytes to avoid nmethod's header size increase. > > Tested tier1-5, stress,xcomp > > Our performance testing does not show difference. > > Example of updated `-XX:+PrintNMethodStatistics` output is in JBS comment. src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/code/NMethod.java line 528: > 526: private int getScopesDataOffset() { return (int) scopesDataOffsetField .getValue(addr); } > 527: private int getScopesPCsOffset() { return (int) scopesPCsOffsetField .getValue(addr); } > 528: private int getDependenciesOffset() { return (int) 0; } Suggestion: No longer used. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18984#discussion_r1581671710 From kvn at openjdk.org Sat Apr 27 01:15:03 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Sat, 27 Apr 2024 01:15:03 GMT Subject: RFR: 8331087: Move immutable nmethod data from CodeCache In-Reply-To: References: Message-ID: On Fri, 26 Apr 2024 21:16:03 GMT, Vladimir Kozlov wrote: > Move immutable nmethod's data from CodeCache to C heap. 
It includes `dependencies, nul_chk_table, handler_table, scopes_pcs, scopes_data, speculations, jvmci_data`. It amounts for about 30% (optimized VM) of space in CodeCache. > > Use HotSpot's `os::malloc()` to allocate memory in C heap for immutable nmethod's data. Bail out compilation if allocation failed. > > Shuffle fields order and change some fields size from 4 to 2 bytes to avoid nmethod's header size increase. > > Tested tier1-5, stress,xcomp > > Our performance testing does not show difference. > > Example of updated `-XX:+PrintNMethodStatistics` output is in JBS comment. Thank you, @dean-long, for review. I will collect (hopefully) more comments for next update before testing and pushing it. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18984#issuecomment-2080291257 From duke at openjdk.org Sat Apr 27 02:34:21 2024 From: duke at openjdk.org (Lei Zaakjyu) Date: Sat, 27 Apr 2024 02:34:21 GMT Subject: RFR: 8330694: Rename 'HeapRegion' to 'G1HeapRegion' [v4] In-Reply-To: <3IdWn9VGEERd8v9RcH2E_LzjVo0L8nMfi5jGWmhgVuM=.6b5b3be4-bfbd-4376-9580-48d78d75665c@github.com> References: <3IdWn9VGEERd8v9RcH2E_LzjVo0L8nMfi5jGWmhgVuM=.6b5b3be4-bfbd-4376-9580-48d78d75665c@github.com> Message-ID: > follow up 8267941 Lei Zaakjyu has updated the pull request incrementally with one additional commit since the last revision: fix indentation ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18871/files - new: https://git.openjdk.org/jdk/pull/18871/files/f02334fd..a76a71de Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18871&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18871&range=02-03 Stats: 34 lines in 8 files changed: 0 ins; 2 del; 32 mod Patch: https://git.openjdk.org/jdk/pull/18871.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18871/head:pull/18871 PR: https://git.openjdk.org/jdk/pull/18871 From stuefe at openjdk.org Sat Apr 27 05:10:04 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Sat, 27 Apr 2024 05:10:04 
GMT Subject: RFR: 8330275: Crash in XMark::follow_array [v3] In-Reply-To: References: Message-ID: On Fri, 26 Apr 2024 14:20:49 GMT, Ashutosh Mehra wrote: > I agree from the point of view of backporting, point-fix is all we need in this PR. > > @tstuefe As for the other platforms (riscv and ppc), looking at their code they seem to be broken in the same way as aarch64 but then the problem only happens if the user runs with > 1TB heap size with more than 48 addressable bits. Again, in the spirit of "do not touch if it is not broken", I am fine if we restrict the change to just aarch64. > > @tstuefe @stefank please let me know if you agree with just doing the point-fix to aarch64. Absolutely. We can do any platform testing on other platforms and cleanups in subsequent RFEs. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18941#issuecomment-2080365924 From stuefe at openjdk.org Sat Apr 27 05:59:08 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Sat, 27 Apr 2024 05:59:08 GMT Subject: RFR: 8324776: runtime/os/TestTransparentHugePageUsage.java fails with The usage of THP is not enough In-Reply-To: References: Message-ID: On Tue, 16 Apr 2024 08:57:48 GMT, Liming Liu wrote: > This PR removes the testcase introduced in JDK-8315923, as we could not find a reliable way to evaluate the usage of THP. We have tried the following methods: > > 1. traverse /proc/self/smaps rather than looking up the first map covered by the heap, as we found there can be multiple sections in /proc/self/smaps for the heap; (https://github.com/limingliu-ampere/jdk/commit/c5b0c4cdf9fa42988faa9fee6ee004ebb599d40a) > 2. take the mode of de-fragment and the enabling of khugepaged into account rather than just THP mode, as THP may not be available immediately when the de-fragment mode is neither "always" nor "madvise", or khugepaged does not collapse pages; (https://github.com/limingliu-ampere/jdk/commit/9c70e9384325b44e074a9e8973846343b27fd2cc) > 3.
call madvise with MADV_HUGEPAGE unconditionally rather than calling it only when THP mode is not "always", and adjust the sizes of young and old generations to ensure the parameters are aligned with THP; (https://github.com/limingliu-ampere/jdk/commit/de9607ff64cc526bca9968b72a7065888c2f944d) > 4. check the changes of system-wide counters like thp_* in /proc/vmstat before and after pretouch via gtest. (https://github.com/limingliu-ampere/jdk/commit/bc83e19a682156ee7d09bf939c2b18f3d8c79e22) > > But none of them helps. The amount of THP stays at zero on Oracle CI, although the THP mode is "always", the de-fragment mode is "madvise" and khugepaged is enabled. Furthermore, none of the thp counters changed around pretouch. However, we tried the same kernel (5.15-UEK) as Oracle CI on our machine, and found that these methods do help. Thus, we decided to remove this testcase. I am fine with removing the test case. There is a point of diminishing returns, and you did your due diligence here. ------------- Marked as reviewed by stuefe (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18792#pullrequestreview-2026357511 From gziemski at openjdk.org Sat Apr 27 13:16:14 2024 From: gziemski at openjdk.org (Gerard Ziemski) Date: Sat, 27 Apr 2024 13:16:14 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v49] In-Reply-To: References: <1cKD_eCdTb8AmNQwA9T4GFK0xu_CjJeABePgatn8xSY=.ec58f99d-bcd6-4e92-87a4-d1e49d33f4af@github.com> Message-ID: <7dLzx1ziOv1Qo2vfr8hAh9JRxas2TBvp6Zjvw206KRA=.59863075-5473-4de3-8d08-6f89817e4f8c@github.com> On Thu, 25 Apr 2024 10:34:50 GMT, Johan Sjölen wrote: >> src/hotspot/share/nmt/memTracker.hpp line 172: >> >>> 170: static inline MemoryFileTracker::MemoryFile* register_device(const char* descriptive_name) { >>> 171: assert_post_init(); >>> 172: if (!enabled()) return nullptr; >> >> Could we push `assert_post_init()` into `enabled()` ? > > That's a discussion that should take place in its own PR.
I see 16 instances of the same pattern: `assert_post_init(); if (!enabled()) return` in `memTracker.hpp`, so it's local and isolated to the `MemTracker` class. I didn't think this would be controversial or a big enough deal to warrant its own PR? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1581829130 From dnsimon at openjdk.org Sat Apr 27 20:51:14 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Sat, 27 Apr 2024 20:51:14 GMT Subject: RFR: 8331087: Move immutable nmethod data from CodeCache In-Reply-To: References: Message-ID: On Fri, 26 Apr 2024 21:16:03 GMT, Vladimir Kozlov wrote: > Move immutable nmethod's data from CodeCache to C heap. It includes `dependencies, nul_chk_table, handler_table, scopes_pcs, scopes_data, speculations, jvmci_data`. It amounts for about 30% (optimized VM) of space in CodeCache. > > Use HotSpot's `os::malloc()` to allocate memory in C heap for immutable nmethod's data. Bail out compilation if allocation failed. > > Shuffle fields order and change some fields size from 4 to 2 bytes to avoid nmethod's header size increase. > > Tested tier1-5, stress,xcomp > > Our performance testing does not show difference. > > Example of updated `-XX:+PrintNMethodStatistics` output is in JBS comment. src/hotspot/share/code/nmethod.hpp line 476: > 474: passed, > 475: code_cache_full, > 476: out_of_memory Maybe `out_of_c_heap_memory` would be clearer? Or `out_of_immutable_data_memory` if immutable data may not always be malloc'ed.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18984#discussion_r1581904919 From dnsimon at openjdk.org Sat Apr 27 21:00:05 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Sat, 27 Apr 2024 21:00:05 GMT Subject: RFR: 8331087: Move immutable nmethod data from CodeCache In-Reply-To: References: Message-ID: <-EcNmtS6moX1Bx6ZDQfe46MtUGvk-qEiLTeX0O5qOyE=.2c839c6a-dc79-4851-8b32-230bfc66661f@github.com> On Fri, 26 Apr 2024 21:16:03 GMT, Vladimir Kozlov wrote: > Move immutable nmethod's data from CodeCache to C heap. It includes `dependencies, nul_chk_table, handler_table, scopes_pcs, scopes_data, speculations, jvmci_data`. It amounts for about 30% (optimized VM) of space in CodeCache. > > Use HotSpot's `os::malloc()` to allocate memory in C heap for immutable nmethod's data. Bail out compilation if allocation failed. > > Shuffle fields order and change some fields size from 4 to 2 bytes to avoid nmethod's header size increase. > > Tested tier1-5, stress,xcomp > > Our performance testing does not show difference. > > Example of updated `-XX:+PrintNMethodStatistics` output is in JBS comment. src/hotspot/share/jvmci/jvmciRuntime.cpp line 2178: > 2176: nmethod_mirror_name, > 2177: failed_speculations); > 2178: nmethod::ResultStatus result_status; Please propagate the new `out_of_memory` result throughout JVMCI (e.g. in `JVMCI::CodeInstallResult` enum and `HotSpotVMConfig.getCodeInstallResultDescription` method). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18984#discussion_r1581906300 From kvn at openjdk.org Sun Apr 28 02:28:10 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Sun, 28 Apr 2024 02:28:10 GMT Subject: RFR: 8331087: Move immutable nmethod data from CodeCache In-Reply-To: References: Message-ID: On Sat, 27 Apr 2024 20:48:38 GMT, Doug Simon wrote: >> Move immutable nmethod's data from CodeCache to C heap. It includes `dependencies, nul_chk_table, handler_table, scopes_pcs, scopes_data, speculations, jvmci_data`. 
It amounts for about 30% (optimized VM) of space in CodeCache. >> >> Use HotSpot's `os::malloc()` to allocate memory in C heap for immutable nmethod's data. Bail out compilation if allocation failed. >> >> Shuffle fields order and change some fields size from 4 to 2 bytes to avoid nmethod's header size increase. >> >> Tested tier1-5, stress,xcomp >> >> Our performance testing does not show difference. >> >> Example of updated `-XX:+PrintNMethodStatistics` output is in JBS comment. > > src/hotspot/share/code/nmethod.hpp line 476: > >> 474: passed, >> 475: code_cache_full, >> 476: out_of_memory > > Maybe `out_of_c_heap_memory` would be clearer? Or `out_of_immutable_data_memory` if immutable data may not always be malloc'ed. May be `no_space_for_immutable_data`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18984#discussion_r1581998799 From kvn at openjdk.org Sun Apr 28 02:36:09 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Sun, 28 Apr 2024 02:36:09 GMT Subject: RFR: 8331087: Move immutable nmethod data from CodeCache In-Reply-To: References: Message-ID: <6wgRfo3fvRqgQGGZ-j_NzKm0g8KGFxqC8ZLB1-IHPsw=.b7edb4ff-e87c-48cf-b8c9-f7753f586f9f@github.com> On Fri, 26 Apr 2024 21:16:03 GMT, Vladimir Kozlov wrote: > Move immutable nmethod's data from CodeCache to C heap. It includes `dependencies, nul_chk_table, handler_table, scopes_pcs, scopes_data, speculations, jvmci_data`. It amounts for about 30% (optimized VM) of space in CodeCache. > > Use HotSpot's `os::malloc()` to allocate memory in C heap for immutable nmethod's data. Bail out compilation if allocation failed. > > Shuffle fields order and change some fields size from 4 to 2 bytes to avoid nmethod's header size increase. > > Tested tier1-5, stress,xcomp > > Our performance testing does not show difference. > > Example of updated `-XX:+PrintNMethodStatistics` output is in JBS comment. 
@dean-long and @dougxc I am thinking may be I should not bailout when `malloc` (or other space reservation in a future) failed to allocate memory for immutable data. But instead increase nmethod size and put immutable data there (as before). Then we bailout only when CodeCache is full as before and we don't need `out_of_memory` failure reason. May be only record that in logs (when they are enabled). What do you think? ------------- PR Comment: https://git.openjdk.org/jdk/pull/18984#issuecomment-2081297482 From dlong at openjdk.org Sun Apr 28 07:05:05 2024 From: dlong at openjdk.org (Dean Long) Date: Sun, 28 Apr 2024 07:05:05 GMT Subject: RFR: 8331087: Move immutable nmethod data from CodeCache In-Reply-To: References: Message-ID: On Fri, 26 Apr 2024 21:16:03 GMT, Vladimir Kozlov wrote: > Move immutable nmethod's data from CodeCache to C heap. It includes `dependencies, nul_chk_table, handler_table, scopes_pcs, scopes_data, speculations, jvmci_data`. It amounts for about 30% (optimized VM) of space in CodeCache. > > Use HotSpot's `os::malloc()` to allocate memory in C heap for immutable nmethod's data. Bail out compilation if allocation failed. > > Shuffle fields order and change some fields size from 4 to 2 bytes to avoid nmethod's header size increase. > > Tested tier1-5, stress,xcomp > > Our performance testing does not show difference. > > Example of updated `-XX:+PrintNMethodStatistics` output is in JBS comment. It only makes sense if the immutable data heap is not also used for other critical resources. If malloc or metaspace were used as the immutable data heap, normally failures in those heaps are fatal, because other critical resources (monitors, classes, etc) are allocated from there, so any failure means the JVM is about to die. There's no reason to find a fall-back method to allocate a new nmethod in that case. 
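To make the fallback idea debated above concrete, here is a rough standalone model of "try the separate heap first, otherwise keep the immutable data inside the method's own blob". It is only a sketch under stated assumptions: `MethodBlob`, `store_immutable`, `release` and the `Allocator` function-pointer type are hypothetical names, a `std::vector` stands in for space in the code cache, and nothing here reflects how `nmethod` is actually laid out. As noted in the reply above, such a fallback only pays off when the separate heap is not also the one whose exhaustion is fatal for the VM anyway.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <vector>

// Hypothetical model of a compiled-method blob; not HotSpot's nmethod.
struct MethodBlob {
  std::vector<unsigned char> body;     // stands in for space inside the code cache
  unsigned char* heap_data = nullptr;  // set when the data lives out of line
  size_t inline_offset = 0;            // meaningful only when heap_data is null
  size_t immutable_size = 0;

  bool stored_inline() const { return heap_data == nullptr; }
  const unsigned char* immutable_begin() const {
    return heap_data != nullptr ? heap_data : body.data() + inline_offset;
  }
};

// Allocator hook so that an allocation failure can be simulated.
// It must hand out memory that std::free can release.
using Allocator = void* (*)(size_t);

// Try the separate heap first; on failure, grow the blob and keep the
// immutable data inline, which is where it lived before the change.
void store_immutable(MethodBlob& blob, const unsigned char* data, size_t size,
                     Allocator alloc) {
  blob.immutable_size = size;
  void* p = alloc(size);
  if (p != nullptr) {
    std::memcpy(p, data, size);
    blob.heap_data = static_cast<unsigned char*>(p);
    return;
  }
  blob.inline_offset = blob.body.size();
  blob.body.insert(blob.body.end(), data, data + size);
}

// Free the out-of-line copy, if any; inline data goes away with the blob.
void release(MethodBlob& blob) {
  std::free(blob.heap_data);
  blob.heap_data = nullptr;
}
```

In this model the readers only ever go through `immutable_begin()`, so they do not need to know which of the two placements was used.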
------------- PR Comment: https://git.openjdk.org/jdk/pull/18984#issuecomment-2081364009 From duke at openjdk.org Sun Apr 28 08:32:16 2024 From: duke at openjdk.org (jjscl8888) Date: Sun, 28 Apr 2024 08:32:16 GMT Subject: RFR: 8319548: Unexpected internal name for Filler array klass causes error in VisualVM In-Reply-To: References: Message-ID: On Tue, 19 Dec 2023 10:08:14 GMT, Thomas Schatzl wrote: > Hi all, > > please review this change that changes the filler array class name (again) after user feedback. > > In particular, the previous name `Ljdk/internal/vm/FillerArray;` confuses some tools (https://github.com/oracle/visualvm/issues/523). I.e. it's not an array, but still variable sized. > This change adds the `[` array bracket, and renames the element name to not have `Array` inside to not try to pretend that the element is some other kind of array. > > Testing: tier1-6 > > Thanks, > Thomas I observed a phenomenon on our application. When there is no traffic on a certain instance, the number of old generation objects suddenly increases at a certain moment. After dumping the object instances, I found a large number of jdk.internal.vm.FillerArray objects, occupying more than 10G of memory. Have you ever encountered this? ![image](https://github.com/openjdk/jdk/assets/32790117/c515dc81-ce21-4fdd-b132-c1723bbadc73) ------------- PR Comment: https://git.openjdk.org/jdk/pull/17155#issuecomment-2081389482 From dnsimon at openjdk.org Sun Apr 28 10:35:05 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Sun, 28 Apr 2024 10:35:05 GMT Subject: RFR: 8331087: Move immutable nmethod data from CodeCache In-Reply-To: References: Message-ID: On Sun, 28 Apr 2024 07:02:40 GMT, Dean Long wrote: >> Move immutable nmethod's data from CodeCache to C heap. It includes `dependencies, nul_chk_table, handler_table, scopes_pcs, scopes_data, speculations, jvmci_data`. It amounts for about 30% (optimized VM) of space in CodeCache. 
>> >> Use HotSpot's `os::malloc()` to allocate memory in C heap for immutable nmethod's data. Bail out compilation if allocation failed. >> >> Shuffle fields order and change some fields size from 4 to 2 bytes to avoid nmethod's header size increase. >> >> Tested tier1-5, stress,xcomp >> >> Our performance testing does not show difference. >> >> Example of updated `-XX:+PrintNMethodStatistics` output is in JBS comment. > > It only makes sense if the immutable data heap is not also used for other critical resources. If malloc or metaspace were used as the immutable data heap, normally failures in those heaps are fatal, because other critical resources (monitors, classes, etc) are allocated from there, so any failure means the JVM is about to die. There's no reason to find a fall-back method to allocate a new nmethod in that case. Just to be clear @dean-long , you're saying failure to allocate immutable data in the C heap should result in a fatal error? Makes sense to me as the VM must indeed be very close to crashing anyway in that case. It also, obviates the need for propagating `out_of_memory_error` to JVMCI code. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18984#issuecomment-2081427477 From kvn at openjdk.org Sun Apr 28 23:37:22 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Sun, 28 Apr 2024 23:37:22 GMT Subject: RFR: 8331087: Move immutable nmethod data from CodeCache [v2] In-Reply-To: References: Message-ID: > Move immutable nmethod's data from CodeCache to C heap. It includes `dependencies, nul_chk_table, handler_table, scopes_pcs, scopes_data, speculations, jvmci_data`. It amounts for about 30% (optimized VM) of space in CodeCache. > > Use HotSpot's `os::malloc()` to allocate memory in C heap for immutable nmethod's data. Bail out compilation if allocation failed. > > Shuffle fields order and change some fields size from 4 to 2 bytes to avoid nmethod's header size increase. 
> > Tested tier1-5, stress,xcomp > > Our performance testing does not show difference. > > Example of updated `-XX:+PrintNMethodStatistics` output is in JBS comment. Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: Address comments. Moved jvmci_data back to mutable data section. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18984/files - new: https://git.openjdk.org/jdk/pull/18984/files/6b1f69d9..1824f46c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18984&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18984&range=00-01 Stats: 98 lines in 5 files changed: 39 ins; 36 del; 23 mod Patch: https://git.openjdk.org/jdk/pull/18984.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18984/head:pull/18984 PR: https://git.openjdk.org/jdk/pull/18984 From kvn at openjdk.org Sun Apr 28 23:45:08 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Sun, 28 Apr 2024 23:45:08 GMT Subject: RFR: 8331087: Move immutable nmethod data from CodeCache [v2] In-Reply-To: References: Message-ID: On Sun, 28 Apr 2024 23:37:22 GMT, Vladimir Kozlov wrote: >> Move immutable nmethod's data from CodeCache to C heap. It includes `dependencies, nul_chk_table, handler_table, scopes_pcs, scopes_data, speculations, jvmci_data`. It amounts for about 30% (optimized VM) of space in CodeCache. >> >> Use HotSpot's `os::malloc()` to allocate memory in C heap for immutable nmethod's data. Bail out compilation if allocation failed. >> >> Shuffle fields order and change some fields size from 4 to 2 bytes to avoid nmethod's header size increase. >> >> Tested tier1-5, stress,xcomp >> >> Our performance testing does not show difference. >> >> Example of updated `-XX:+PrintNMethodStatistics` output is in JBS comment. > > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Address comments. Moved jvmci_data back to mutable data section. Update: 1. 
Addressed @dean-long first comments.
2. Based on discussion with Doug and Tom (see comments in [JDK-8331087](https://bugs.openjdk.org/browse/JDK-8331087)), moved `jvmci_data` back to nmethod's mutable data section.
3. Replaced my allocation failure handling code with call to `vm_exit_out_of_memory()`. I verified (with a `UseNewCode` hack) that out of memory is correctly reported in fastdebug and product VMs:

#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 64 bytes. Error detail: nmethod: no space for immutable data
# An error report file with more information is saved as:
# /scratch/kvn/jdk_git/hs_err_pid4086275.log

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18984#issuecomment-2081701059

From duke at openjdk.org Mon Apr 29 02:58:17 2024
From: duke at openjdk.org (kuaiwei)
Date: Mon, 29 Apr 2024 02:58:17 GMT
Subject: RFR: 8325821: [REDO] use "dmb.ishst+dmb.ishld" for release barrier [v7]
In-Reply-To:
References:
Message-ID:

On Tue, 16 Apr 2024 14:06:14 GMT, Andrew Haley wrote:

>> kuaiwei has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Fix arm build error
>
> Argh, I found it. It happens because C2 calls `masm->offset()` from `PhaseOutput::fill_buffer()` after every node is emitted. So that trick isn't going to work.
>
> It was worth a try, but given that C2 expects offset() to be correct after every node, I think we're stuck. Maybe the last idea you had is the best possible without C2 tinkering.

@theRealAph Could you help review this PR? Thanks.
------------- PR Comment: https://git.openjdk.org/jdk/pull/18467#issuecomment-2081807917 From rehn at openjdk.org Mon Apr 29 06:33:05 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Mon, 29 Apr 2024 06:33:05 GMT Subject: RFR: 8326306: RISC-V: Re-structure MASM calls and jumps [v2] In-Reply-To: <0gMQgeYKyAzms64-hBIrltqUSfetu3Kczwr7IwLmF18=.8f583ac0-afff-4f1b-985f-a688cd898ae3@github.com> References: <1UZeWIQJIEYbPetxWPlhQffyAy4gWXvNiV79i4_3pMQ=.86fb3068-940b-49ea-a2ea-b84a865d4cca@github.com> <0gMQgeYKyAzms64-hBIrltqUSfetu3Kczwr7IwLmF18=.8f583ac0-afff-4f1b-985f-a688cd898ae3@github.com> Message-ID: On Fri, 26 Apr 2024 14:06:10 GMT, Fei Yang wrote: >> Old call(): >> >> int32_t offset = 0; >> mv(temp, dest, offset); // =>li(); >> jalr(x1, temp, offset); >> >> >> To keep the sites the same (for non-code-cache calls) >> New rt_call(): >> >> movptr(tmp, target.target(), offset); >> Assembler::jalr(x1, tmp, offset); >> >> >> Same here means absolute calls, no reloc required. >> So I have tried to keep the calls the same. As you say we can optimize this by using reloc + la(). > > Hi, Let me try to understand what you mean. Are we going to remove the `relocate` for non-code-cache call at [1] and further improve the `movptr` at [2] making use of `la`? So no need for `call` then as they could be replaced with `rt_call` then? This sounds interesting to me :- ) > > [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp#L5031 > [2] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp#L5033 I have not looked at it in detail. As mnemonic for **call** is _auipc_ + _jalr_, in hotspot `la()` + `jalr()`. So sites using call() for non-code-cache was changed to rt_call(), which gets us the same result as the old call(). Hence this patch 'tries' to keep the generate assembly the same. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18942#discussion_r1582591937 From thartmann at openjdk.org Mon Apr 29 06:36:04 2024 From: thartmann at openjdk.org (Tobias Hartmann) Date: Mon, 29 Apr 2024 06:36:04 GMT Subject: RFR: 8331087: Move immutable nmethod data from CodeCache [v2] In-Reply-To: References: Message-ID: On Sun, 28 Apr 2024 23:37:22 GMT, Vladimir Kozlov wrote: >> Move immutable nmethod's data from CodeCache to C heap. It includes `dependencies, nul_chk_table, handler_table, scopes_pcs, scopes_data, speculations`. It amounts for about 30% (optimized VM) of space in CodeCache. >> >> Use HotSpot's `os::malloc()` to allocate memory in C heap for immutable nmethod's data. Call `vm_exit_out_of_memory()` if allocation failed. >> >> Shuffle fields order and change some fields size from 4 to 2 bytes to avoid nmethod's header size increase. >> >> Tested tier1-5, stress,xcomp >> >> Our performance testing does not show difference. >> >> Example of updated `-XX:+PrintNMethodStatistics` output is in JBS comment. > > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Address comments. Moved jvmci_data back to mutable data section. Looks good to me. Did you measure any impact on performance (potentially due to improved code density)? What's left for [JDK-7072317](https://bugs.openjdk.org/browse/JDK-7072317)? I wonder if the `CHECKED_CAST` changes shouldn't go into a separate RFE. ------------- Marked as reviewed by thartmann (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18984#pullrequestreview-2027660174 From tschatzl at openjdk.org Mon Apr 29 07:50:11 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 29 Apr 2024 07:50:11 GMT Subject: RFR: 8319548: Unexpected internal name for Filler array klass causes error in VisualVM In-Reply-To: References: Message-ID: On Sun, 28 Apr 2024 08:25:34 GMT, jjscl8888 wrote: > I observed a phenomenon on our application. 
> When there is no traffic on a certain instance, the number of old generation objects suddenly increases at a certain moment. After dumping the object instances, I found a large number of jdk.internal.vm.FillerArray objects, occupying more than 10G of memory. Have you ever encountered this?

You probably did the heap dump right after G1 managed to free lots of memory - these `FillerArray` elements represent unused memory within regions. Previously one would have seen a huge amount of int-arrays (`[I`) staying around. Normally G1 would then incrementally reduce these `FillerArray`s in subsequent mixed collections (after marking etc) by evacuating the live objects between these filler objects, making these regions completely empty.

From the given output of `jmap` and not knowing what else is going on with the GC this behavior seems normal.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/17155#issuecomment-2082081516

From dlong at openjdk.org Mon Apr 29 07:51:09 2024
From: dlong at openjdk.org (Dean Long)
Date: Mon, 29 Apr 2024 07:51:09 GMT
Subject: RFR: 8331087: Move immutable nmethod data from CodeCache
In-Reply-To:
References:
Message-ID:

On Sun, 28 Apr 2024 07:02:40 GMT, Dean Long wrote:

>> Move immutable nmethod's data from CodeCache to C heap. It includes `dependencies, nul_chk_table, handler_table, scopes_pcs, scopes_data, speculations`. It amounts for about 30% (optimized VM) of space in CodeCache.
>>
>> Use HotSpot's `os::malloc()` to allocate memory in C heap for immutable nmethod's data. Call `vm_exit_out_of_memory()` if allocation failed.
>>
>> Shuffle fields order and change some fields size from 4 to 2 bytes to avoid nmethod's header size increase.
>>
>> Tested tier1-5, stress,xcomp
>>
>> Our performance testing does not show difference.
>>
>> Example of updated `-XX:+PrintNMethodStatistics` output is in JBS comment.
>
> It only makes sense if the immutable data heap is not also used for other critical resources.
If malloc or metaspace were used as the immutable data heap, normally failures in those heaps are fatal, because other critical resources (monitors, classes, etc) are allocated from there, so any failure means the JVM is about to die. There's no reason to find a fall-back method to allocate a new nmethod in that case. > Just to be clear @dean-long , you're saying failure to allocate immutable data in the C heap should result in a fatal error? Makes sense to me as the VM must indeed be very close to crashing anyway in that case. It also, obviates the need for propagating `out_of_memory_error` to JVMCI code. I hadn't thought it through that far, actually. I was only pointing out that the proposed fall-back: > increase nmethod size and put immutable data there (as before). isn't worth the trouble. But making the C heap failure fatal immediately is reasonable, especially if it simplifies JVMCI error reporting. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18984#issuecomment-2082083104 From dlong at openjdk.org Mon Apr 29 07:59:06 2024 From: dlong at openjdk.org (Dean Long) Date: Mon, 29 Apr 2024 07:59:06 GMT Subject: RFR: 8331087: Move immutable nmethod data from CodeCache [v2] In-Reply-To: References: Message-ID: On Sun, 28 Apr 2024 23:37:22 GMT, Vladimir Kozlov wrote: >> Move immutable nmethod's data from CodeCache to C heap. It includes `dependencies, nul_chk_table, handler_table, scopes_pcs, scopes_data, speculations`. It amounts for about 30% (optimized VM) of space in CodeCache. >> >> Use HotSpot's `os::malloc()` to allocate memory in C heap for immutable nmethod's data. Call `vm_exit_out_of_memory()` if allocation failed. >> >> Shuffle fields order and change some fields size from 4 to 2 bytes to avoid nmethod's header size increase. >> >> Tested tier1-5, stress,xcomp >> >> Our performance testing does not show difference. >> >> Example of updated `-XX:+PrintNMethodStatistics` output is in JBS comment. 
> > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Address comments. Moved jvmci_data back to mutable data section. Marked as reviewed by dlong (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18984#pullrequestreview-2027786061 From tschatzl at openjdk.org Mon Apr 29 08:17:10 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 29 Apr 2024 08:17:10 GMT Subject: RFR: 8330694: Rename 'HeapRegion' to 'G1HeapRegion' [v4] In-Reply-To: References: <3IdWn9VGEERd8v9RcH2E_LzjVo0L8nMfi5jGWmhgVuM=.6b5b3be4-bfbd-4376-9580-48d78d75665c@github.com> Message-ID: On Sat, 27 Apr 2024 02:34:21 GMT, Lei Zaakjyu wrote: >> follow up 8267941 > > Lei Zaakjyu has updated the pull request incrementally with one additional commit since the last revision: > > fix indentation mach5 higher tier SA tests are fine. What are your plans for the remaining SA renames (would highly recommend to add) and the G1HeapRegion related helper classes? ------------- PR Comment: https://git.openjdk.org/jdk/pull/18871#issuecomment-2082124530 From shade at openjdk.org Mon Apr 29 08:18:51 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 29 Apr 2024 08:18:51 GMT Subject: RFR: 8328934: Assert that ABS input and output are legal [v6] In-Reply-To: References: Message-ID: <9r6p7oHNH_9hg_jzOxFh2lKDDS8a1hwTRIIDjfCWOeU=.a30a43cb-426b-4d66-93d6-bb55ab2a8445@github.com> > This should protect us from future accidents around `abs` misuse. We have fixed a few separately. I plan to use this as the litmus test in update releases to detect missing backports for actual fixes. I am running more tests to see if we have any other sightings in current codebase, but this can be reviewed for sanity meanwhile. 
> > Additional testing: > - [x] MacOS AArch64 server fastdebug build passes > - [ ] Linux x86_64 server fastdebug, `all` > - [ ] Linux x86_64 server fastdebug, 100K Fuzzer tests > - [ ] Linux x86_64 server fastdebug, Maven CTW > - [ ] Linux AArch64 server fastdebug, `all` Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision: - Merge branch 'master' into JDK-8328934-abs-legal - Also tests - Drop the other check; dodge UB - More straightforward - Richer error reporting - Only assert integral type arguments - Need explicit include as well - Fix ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18751/files - new: https://git.openjdk.org/jdk/pull/18751/files/3f6d76f3..bc3bfe81 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18751&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18751&range=04-05 Stats: 74590 lines in 2141 files changed: 33127 ins; 33156 del; 8307 mod Patch: https://git.openjdk.org/jdk/pull/18751.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18751/head:pull/18751 PR: https://git.openjdk.org/jdk/pull/18751 From rkennke at openjdk.org Mon Apr 29 08:41:17 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 29 Apr 2024 08:41:17 GMT Subject: RFR: 8305898: Alternative self-forwarding mechanism [v4] In-Reply-To: References: Message-ID: On Thu, 8 Feb 2024 14:37:20 GMT, Roman Kennke wrote: >> Currently, the Serial, Parallel and G1 GCs store a pointer to self into object headers to indicate promotion failure. This is problematic for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) because it would (temporarily) over-write the crucial class information, which we need for heap parsing. 
I would like to propose an alternative: use the currently unused 3rd header bit (previously biased-locking bit) to indicate that an object is 'self-forwarded'. That preserves the crucial class information in the upper bits of the header until the full header gets restored. >> >> This is a trimmed-down/simplified version of the original proposal #13779: >> - It doesn't use/introduce any flags and avoids the associated branching. >> - It doesn't (need to) deal with displaced headers. (Current code would preserve header if necessary, Lilliput code would not use displaced headers and set the 3rd bit directly in existing header.) >> >> Testing: >> - [x] hotspot_gc >> - [x] tier1 >> - [x] tier2 > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > More consistent use of markWord::is_forwarded() Will re-open when ready in Lilliput ------------- PR Comment: https://git.openjdk.org/jdk/pull/17755#issuecomment-2082165356 From rkennke at openjdk.org Mon Apr 29 08:41:18 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 29 Apr 2024 08:41:18 GMT Subject: Withdrawn: 8305898: Alternative self-forwarding mechanism In-Reply-To: References: Message-ID: On Wed, 7 Feb 2024 18:33:00 GMT, Roman Kennke wrote: > Currently, the Serial, Parallel and G1 GCs store a pointer to self into object headers to indicate promotion failure. This is problematic for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) because it would (temporarily) over-write the crucial class information, which we need for heap parsing. I would like to propose an alternative: use the currently unused 3rd header bit (previously biased-locking bit) to indicate that an object is 'self-forwarded'. That preserves the crucial class information in the upper bits of the header until the full header gets restored. 
>
> This is a trimmed-down/simplified version of the original proposal #13779:
> - It doesn't use/introduce any flags and avoids the associated branching.
> - It doesn't (need to) deal with displaced headers. (Current code would preserve header if necessary, Lilliput code would not use displaced headers and set the 3rd bit directly in existing header.)
>
> Testing:
> - [x] hotspot_gc
> - [x] tier1
> - [x] tier2

This pull request has been closed without being integrated.

-------------

PR: https://git.openjdk.org/jdk/pull/17755

From jsjolen at openjdk.org Mon Apr 29 08:46:16 2024
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Mon, 29 Apr 2024 08:46:16 GMT
Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v5]
In-Reply-To:
References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com>
Message-ID:

On Sat, 13 Apr 2024 05:38:11 GMT, Thomas Stuefe wrote:

>> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision:
>>
>> mtCode and mtMetaspace were missed from System Dump map
>
> Just a thought: one (manual) test I would do would be that several JVMs run with the same conditions (I would do at least one with -Xmx=Xms and AlwaysPreTouch) accumulate the same NMT numbers, current, and peak. Just to make sure we use the same flags before and after.

Hi @tstuefe, are you OK with the changes as they are now?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18745#issuecomment-2082173662

From jsjolen at openjdk.org Mon Apr 29 08:51:14 2024
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Mon, 29 Apr 2024 08:51:14 GMT
Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v46]
In-Reply-To:
References:
Message-ID:

On Tue, 23 Apr 2024 12:51:38 GMT, Afshin Zafari wrote:

>> Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision:
>>
>> - Remove faulty condition after removing merging
>> - Add failing test case
>
> src/hotspot/share/nmt/nmtMemoryFileTracker.cpp line 48:
>
>> 46: NativeCallStackStorage::StackIndex sidx = _stack_storage.push(stack);
>> 47: DeviceSpace::Metadata metadata(sidx, flag);
>> 48: DeviceSpace::SummaryDiff diff = device->_tree.reserve_mapping(offset, size, metadata);
>
> What if `size == 0`?

Added test, tree should handle it.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1582722773

From dnsimon at openjdk.org Mon Apr 29 08:53:05 2024
From: dnsimon at openjdk.org (Doug Simon)
Date: Mon, 29 Apr 2024 08:53:05 GMT
Subject: RFR: 8331087: Move immutable nmethod data from CodeCache [v2]
In-Reply-To:
References:
Message-ID: <2DurRx2sxaKxnkJTKVqsFU4jL6Crc05p8HPnFVPvDPw=.71bff5c9-9a40-4898-a83e-ee2dc769fb51@github.com>

On Sun, 28 Apr 2024 23:37:22 GMT, Vladimir Kozlov wrote:

>> Move immutable nmethod's data from CodeCache to C heap. It includes `dependencies, nul_chk_table, handler_table, scopes_pcs, scopes_data, speculations`. It amounts for about 30% (optimized VM) of space in CodeCache.
>>
>> Use HotSpot's `os::malloc()` to allocate memory in C heap for immutable nmethod's data. Call `vm_exit_out_of_memory()` if allocation failed.
>>
>> Shuffle fields order and change some fields size from 4 to 2 bytes to avoid nmethod's header size increase.
>>
>> Tested tier1-5, stress,xcomp
>>
>> Our performance testing does not show difference.
>>
>> Example of updated `-XX:+PrintNMethodStatistics` output is in JBS comment.
>
> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision:
>
> Address comments. Moved jvmci_data back to mutable data section.

Marked as reviewed by dnsimon (Reviewer).

JVMCI changes now look good to me.

-------------

PR Review: https://git.openjdk.org/jdk/pull/18984#pullrequestreview-2027884309
PR Comment: https://git.openjdk.org/jdk/pull/18984#issuecomment-2082185011

From jsjolen at openjdk.org Mon Apr 29 08:56:18 2024
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Mon, 29 Apr 2024 08:56:18 GMT
Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v46]
In-Reply-To:
References:
Message-ID:

On Tue, 23 Apr 2024 12:45:18 GMT, Afshin Zafari wrote:

>> Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision:
>>
>> - Remove faulty condition after removing merging
>> - Add failing test case
>
> src/hotspot/share/nmt/nmtMemoryFileTracker.cpp line 43:
>
>> 41: }
>> 42:
>> 43: void MemoryFileTracker::allocate_memory(MemoryFile* device, size_t offset,
>
> check/assert `device == nullptr`.

This kind of check is in `MemTracker`.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1582727559

From jsjolen at openjdk.org Mon Apr 29 08:56:18 2024
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Mon, 29 Apr 2024 08:56:18 GMT
Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v46]
In-Reply-To:
References:
Message-ID: <2UMNj1LkcFJOj5bIOi8wJuscaXrGIHzPvlVTIpI-bw4=.38340e91-571b-4cff-8ffa-e32d602395a8@github.com>

On Fri, 26 Apr 2024 11:41:57 GMT, Thomas Stuefe wrote:

>> I've been using `size_t` so far to indicate that we are within some file with some offset.
>> I'm not sure that `address` is ever the right choice for `VMATree` as it is a `uchar*`, indicating that it's a directly dereferencable pointer. It's not a huge deal whether we choose `size_t`, `uintptr_t` or `address` for our internal representation IMHO, as long as the external interface (`MemTracker`) correctly indicates what kind of address is expected.
>>
>> @tstuefe, @gerard-ziemski. This discussion is easily lost in the sea of comments, so pinging you directly here.
>
> How about making your own index type? Something that clearly distinguishes it from sizes. Can be a simple typedef.
>
> I think address would be wrong. But size_t is also feeling off. I know we use size_t in other places as index or offset, but it still throws me off, I think of size_t as a size, not an offset.

I'm fine with `typedef`:ing `size_t`, but I'd like a naming suggestion from you if that's alright. Naming isn't my strong suit and I'd prefer only doing the rename once :).

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1582725543

From jsjolen at openjdk.org Mon Apr 29 09:20:41 2024
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Mon, 29 Apr 2024 09:20:41 GMT
Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v51]
In-Reply-To:
References:
Message-ID:

> Hi,
>
> This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix.
>
> ## `MemoryFileTracker`
>
> The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is:
>
> ```c++
>     static MemoryFile* make_device(const char* descriptive_name);
>     static void free_device(MemoryFile* device);
>
>     static void allocate_memory(MemoryFile* device, size_t offset, size_t size,
>                                 MEMFLAGS flag, const NativeCallStack& stack);
>     static void free_memory(MemoryFile* device, size_t offset, size_t size);
> ```
>
> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does:
>
> ```c++
>   void ZNMT::reserve(zaddress_unsafe start, size_t size) {
>     MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap);
>   }
>   void ZNMT::commit(zoffset offset, size_t size) {
>     MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC);
>   }
>   void ZNMT::uncommit(zoffset offset, size_t size) {
>     MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size);
>   }
>
>   void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) {
>     // NMT doesn't track mappings at the moment.
>   }
>   void ZNMT::unmap(zaddress_unsafe addr, size_t size) {
>     // NMT doesn't track mappings at the moment.
>   }
> ```
>
> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory.
>
> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are:
>
> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it.
> Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this...

Johan Sjölen has updated the pull request incrementally with three additional commits since the last revision:

- Use diff.flag[i] directly
- Add a test that states that reserving or committing an empty region should result in no change in the tree
- Copyright

-------------

Changes:
- all: https://git.openjdk.org/jdk/pull/18289/files
- new: https://git.openjdk.org/jdk/pull/18289/files/dc9741ec..cbe00a31

Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=50
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=49-50

Stats: 14 lines in 3 files changed: 9 ins; 2 del; 3 mod
Patch: https://git.openjdk.org/jdk/pull/18289.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289

PR: https://git.openjdk.org/jdk/pull/18289

From ayang at openjdk.org Mon Apr 29 10:10:14 2024
From: ayang at openjdk.org (Albert Mingkun Yang)
Date: Mon, 29 Apr 2024 10:10:14 GMT
Subject: RFR: 8331285: Deprecate and obsolete OldSize
Message-ID:

Simple deprecating a jvm flag.

-------------

Commit messages:
- old-size

Changes: https://git.openjdk.org/jdk/pull/18994/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18994&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8331285
Stats: 2 lines in 2 files changed: 1 ins; 0 del; 1 mod
Patch: https://git.openjdk.org/jdk/pull/18994.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/18994/head:pull/18994

PR: https://git.openjdk.org/jdk/pull/18994

From dnsimon at openjdk.org Mon Apr 29 12:06:04 2024
From: dnsimon at openjdk.org (Doug Simon)
Date: Mon, 29 Apr 2024 12:06:04 GMT
Subject: RFR: 8330817: jdk/internal/vm/Continuation/OSRTest.java times out on libgraal
In-Reply-To:
References:
Message-ID: <92UmeQrydinvBlGWujmrQxG9HN5FBWgU1eoCbB3OHWQ=.3b032777-5250-48a7-9887-3f671e07bcd0@github.com>

On Mon, 22 Apr 2024 21:55:38 GMT, Patricio Chilano Mateo wrote:

> Small test fix to prevent inlining of foo/fooBigFrame. Tested with Graal repo and verified timeout doesn't happen anymore.
>
> Thanks,
> Patricio

Marked as reviewed by dnsimon (Reviewer).

-------------

PR Review: https://git.openjdk.org/jdk/pull/18905#pullrequestreview-2028321830

From jsjolen at openjdk.org Mon Apr 29 12:07:42 2024
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Mon, 29 Apr 2024 12:07:42 GMT
Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v52]
In-Reply-To:
References:
Message-ID:

> Hi,
>
> This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix.
>
> ## `MemoryFileTracker`
>
> The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is:
>
> ```c++
>     static MemoryFile* make_device(const char* descriptive_name);
>     static void free_device(MemoryFile* device);
>
>     static void allocate_memory(MemoryFile* device, size_t offset, size_t size,
>                                 MEMFLAGS flag, const NativeCallStack& stack);
>     static void free_memory(MemoryFile* device, size_t offset, size_t size);
> ```
>
> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does:
>
> ```c++
>   void ZNMT::reserve(zaddress_unsafe start, size_t size) {
>     MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap);
>   }
>   void ZNMT::commit(zoffset offset, size_t size) {
>     MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC);
>   }
>   void ZNMT::uncommit(zoffset offset, size_t size) {
>     MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size);
>   }
>
>   void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) {
>     // NMT doesn't track mappings at the moment.
>   }
>   void ZNMT::unmap(zaddress_unsafe addr, size_t size) {
>     // NMT doesn't track mappings at the moment.
>   }
> ```
>
> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory.
>
> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are:
>
> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it.
Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this... Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: Missing return in upsert causes duplicate keys ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18289/files - new: https://git.openjdk.org/jdk/pull/18289/files/cbe00a31..34307e11 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=51 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=50-51 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18289.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289 PR: https://git.openjdk.org/jdk/pull/18289 From jsjolen at openjdk.org Mon Apr 29 12:20:28 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Mon, 29 Apr 2024 12:20:28 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v53] In-Reply-To: References: Message-ID: <5x6_t9y98yKLvtiRHsNt5-UC4vJrcIjWidXh-Mm07Mk=.909c0b78-8283-4096-b85a-aa78db9ed8d3@github.com> > Hi, > > This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix.
> > ## `MemoryFileTracker` > > The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: > > ```c++ > static MemoryFile* make_device(const char* descriptive_name); > static void free_device(MemoryFile* device); > > static void allocate_memory(MemoryFile* device, size_t offset, size_t size, > MEMFLAGS flag, const NativeCallStack& stack); > static void free_memory(MemoryFile* device, size_t offset, size_t size); > > > It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: > > ```c++ > void ZNMT::reserve(zaddress_unsafe start, size_t size) { > MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); > } > void ZNMT::commit(zoffset offset, size_t size) { > MemTracker::allocate_memory_in(ZNMT::_device, static_cast(offset), size, mtJavaHeap, CALLER_PC); > } > void ZNMT::uncommit(zoffset offset, size_t size) { > MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); > } > > void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { > // NMT doesn't track mappings at the moment. > } > void ZNMT::unmap(zaddress_unsafe addr, size_t size) { > // NMT doesn't track mappings at the moment. > } > > > As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. > > This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: > > 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. 
Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this... Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: Explicitly handle 0-sized mappings as no-ops ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18289/files - new: https://git.openjdk.org/jdk/pull/18289/files/34307e11..be2f03a9 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=52 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=51-52 Stats: 4 lines in 1 file changed: 4 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18289.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289 PR: https://git.openjdk.org/jdk/pull/18289 From jsjolen at openjdk.org Mon Apr 29 12:32:39 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Mon, 29 Apr 2024 12:32:39 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v54] In-Reply-To: References: Message-ID: > Hi, > > This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix.
> > ## `MemoryFileTracker` > > The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: > > ```c++ > static MemoryFile* make_device(const char* descriptive_name); > static void free_device(MemoryFile* device); > > static void allocate_memory(MemoryFile* device, size_t offset, size_t size, > MEMFLAGS flag, const NativeCallStack& stack); > static void free_memory(MemoryFile* device, size_t offset, size_t size); > > > It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: > > ```c++ > void ZNMT::reserve(zaddress_unsafe start, size_t size) { > MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); > } > void ZNMT::commit(zoffset offset, size_t size) { > MemTracker::allocate_memory_in(ZNMT::_device, static_cast(offset), size, mtJavaHeap, CALLER_PC); > } > void ZNMT::uncommit(zoffset offset, size_t size) { > MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); > } > > void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { > // NMT doesn't track mappings at the moment. > } > void ZNMT::unmap(zaddress_unsafe addr, size_t size) { > // NMT doesn't track mappings at the moment. > } > > > As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. > > This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: > > 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. 
Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this... Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: assert device != nullptr in MemoryFileTracker::instance ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18289/files - new: https://git.openjdk.org/jdk/pull/18289/files/be2f03a9..ed01d703 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=53 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=52-53 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18289.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289 PR: https://git.openjdk.org/jdk/pull/18289 From jsjolen at openjdk.org Mon Apr 29 12:32:39 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Mon, 29 Apr 2024 12:32:39 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v46] In-Reply-To: References: Message-ID: On Thu, 25 Apr 2024 10:10:42 GMT, Johan Sjölen wrote: >> src/hotspot/share/nmt/memTracker.hpp line 184: >> >>> 182: } >>> 183: >>> 184: static inline void allocate_memory_in(MemoryFileTracker::MemoryFile* device, size_t offset, size_t size, >> >> invalid args: `nullptr` and `size == 0`. > > We should add tests for `size == 0`, but in general I don't think that's a case that we should disallow. This allows for more generic code where the caller doesn't have to special-case the size being 0. Checking for `nullptr` is a good idea in these outer functions. I added a test for `size == 0` and I handle it explicitly as a no-op.
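The behaviour described in this reply (a zero-sized request treated as a no-op, and a null device rejected at the outer entry point) can be sketched roughly as follows. This is a minimal, self-contained illustration and not the HotSpot code: `ToyMemoryFile`, `toy_allocate_memory_in` and `toy_committed_bytes` are hypothetical names, the `MEMFLAGS` and call-stack arguments are left out, and overlapping ranges are deliberately not merged.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Hypothetical stand-in for a tracked memory file; not the HotSpot type.
struct ToyMemoryFile {
  const char* name;
  std::map<std::size_t, std::size_t> committed;  // offset -> size of a committed range
};

// Record a committed range in the given file.
// A zero-sized request is an explicit no-op, so callers need no special case.
// A null device is treated as a programming error at this outer entry point.
inline void toy_allocate_memory_in(ToyMemoryFile* device, std::size_t offset, std::size_t size) {
  assert(device != nullptr && "device must not be null");
  if (size == 0) {
    return;  // nothing to track
  }
  device->committed[offset] = size;  // assumption: ranges never overlap in this sketch
}

// Sum of all committed bytes recorded for the file.
inline std::size_t toy_committed_bytes(const ToyMemoryFile& device) {
  std::size_t total = 0;
  for (const auto& range : device.committed) {
    total += range.second;
  }
  return total;
}
```

With this shape, a caller can pass a computed size straight through without first checking whether it is zero, which is the "more generic code" argument made in the reply.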
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1583007104 From jsjolen at openjdk.org Mon Apr 29 12:32:40 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Mon, 29 Apr 2024 12:32:40 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v46] In-Reply-To: References: Message-ID: On Tue, 23 Apr 2024 12:55:33 GMT, Afshin Zafari wrote: >> Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision: >> >> - Remove faulty condition after removing merging >> - Add failing test case > > src/hotspot/share/nmt/nmtMemoryFileTracker.cpp line 65: > >> 63: } >> 64: >> 65: void MemoryFileTracker::print_report_on(const MemoryFile* device, outputStream* stream, size_t scale) { > > check for invalid arguments: `nullptr` and `0`. We let a `scale` of 0 slide through; it is not up to our interface to decide whether that is valid. We do nullptr checks in the `MemoryFileTracker::Instance` case, as that is the 'public' API. > src/hotspot/share/nmt/nmtMemoryFileTracker.cpp line 121: > >> 119: } >> 120: >> 121: void MemoryFileTracker::Instance::allocate_memory(MemoryFile* device, size_t offset, > > check invalid args.
Done in MemTracker ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1583011006 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1583011535 From jsjolen at openjdk.org Mon Apr 29 12:32:40 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Mon, 29 Apr 2024 12:32:40 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v5] In-Reply-To: References: <-XAziSwGMo20pUAnbdRW1JUk_0ZB-80RVfAHr0iuewE=.bff8f2f7-01e2-46eb-bd4b-1b16fccc6aa1@github.com> <3al4DjsRcIX_qJZNbTGqBDIAOj4bU5l8xpYPHQE8cNM=.7cc0bdfe-c9c8-46ce-ad42-397c61b5a603@github.com> <0G_oRg-MB6aRKXpHJ4ca8lIQ72ZhsA2WBujtJ8BQaD0=.bbbc53c2-cb49-4051-998e-e9e48e4ea516@github.com> Message-ID: On Sun, 21 Apr 2024 10:15:21 GMT, Johan Sjölen wrote: >> Could the comparator be a functor then? Something with a static compare function? > > Sure, I've got a commit changing it out now. Done. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1583012374 From aboldtch at openjdk.org Mon Apr 29 13:40:29 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Mon, 29 Apr 2024 13:40:29 GMT Subject: RFR: 8330253: Skip verify_consistent_lock_order when deoptimizing from monitorenter bytecode. [v7] In-Reply-To: References: Message-ID: > The verification added in [JDK-8329757](https://bugs.openjdk.org/browse/JDK-8329757) will not work when deoptimization occurs on a monitorenter bytecode. The locking may be in a transitional state. This patch will skip the verification when this occurs. > > Currently have only seen this reproduce with JVMTI when deoptimization occurs while a Java thread is waiting on a contended monitor. However this could potentially be triggered from a VM entry slow path, so simply checking `current_pending_monitor` could be flaky as well. So instead simply avoid verification. > > Tested Tier 1-8 + Stress testing reproducers.
Axel Boldt-Christmas has updated the pull request incrementally with two additional commits since the last revision: - Revert verification - Revert "repro for JDK-8330253" This reverts commit 10d70ea18c5bad0627b9eaae88fa8c96e436cb3b. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18782/files - new: https://git.openjdk.org/jdk/pull/18782/files/578a8322..be0a6883 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18782&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18782&range=05-06 Stats: 149 lines in 4 files changed: 0 ins; 149 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18782.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18782/head:pull/18782 PR: https://git.openjdk.org/jdk/pull/18782 From aboldtch at openjdk.org Mon Apr 29 13:52:23 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Mon, 29 Apr 2024 13:52:23 GMT Subject: RFR: 8330253: Remove verify_consistent_lock_order [v8] In-Reply-To: References: Message-ID: > The verification added in [JDK-8329757](https://bugs.openjdk.org/browse/JDK-8329757) will not work when deoptimization occurs on a monitorenter bytecode. The locking may be in a transitional state. Because the correct solution is still not obvious and this test is currently only causing false positives, remove the verification for now. > > Redo this verification in [JDK-8331307](https://bugs.openjdk.org/browse/JDK-8331307). Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 11 additional commits since the last revision: - Merge tag 'jdk-23+20' into JDK-8330164 Added tag jdk-23+20 for changeset 87e864bf - Revert verification - Revert "repro for JDK-8330253" This reverts commit 10d70ea18c5bad0627b9eaae88fa8c96e436cb3b. 
- Fix condition - Use raw bytecode read for previous bytecode - repro for JDK-8330253 - Whitespace - Spelling and typos - Handle previous bc being monitorenter - Remove implicit conditions - ... and 1 more: https://git.openjdk.org/jdk/compare/d73b29e7...d913ecbe ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18782/files - new: https://git.openjdk.org/jdk/pull/18782/files/be0a6883..d913ecbe Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18782&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18782&range=06-07 Stats: 58533 lines in 773 files changed: 29910 ins; 25491 del; 3132 mod Patch: https://git.openjdk.org/jdk/pull/18782.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18782/head:pull/18782 PR: https://git.openjdk.org/jdk/pull/18782 From aboldtch at openjdk.org Mon Apr 29 13:52:23 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Mon, 29 Apr 2024 13:52:23 GMT Subject: RFR: 8330253: Remove verify_consistent_lock_order [v7] In-Reply-To: References: Message-ID: On Mon, 29 Apr 2024 13:40:29 GMT, Axel Boldt-Christmas wrote: >> The verification added in [JDK-8329757](https://bugs.openjdk.org/browse/JDK-8329757) will not work when deoptimization occurs on a monitorenter bytecode. The locking may be in a transitional state. Because the correct solution is still not obvious and this test is currently only causing false positives, remove the verification for now. >> >> Redo this verification in [JDK-8331307](https://bugs.openjdk.org/browse/JDK-8331307). > > Axel Boldt-Christmas has updated the pull request incrementally with two additional commits since the last revision: > > - Revert verification > - Revert "repro for JDK-8330253" > > This reverts commit 10d70ea18c5bad0627b9eaae88fa8c96e436cb3b. There were some additional issues with 10d70ea that has to be resolved before this could go in. And multiple engineers in this area agree that the bytecode filter is probably not the way to solve this. 
[JDK-8331307](https://bugs.openjdk.org/browse/JDK-8331307) has been created to redo `verify_consistent_lock_order`. Because this issue is creating false positives in testing I propose that we first remove the verification so it can be redone. Reverted the test and removed the verification logic. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18782#issuecomment-2082798633 From pchilanomate at openjdk.org Mon Apr 29 14:14:10 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Mon, 29 Apr 2024 14:14:10 GMT Subject: RFR: 8330253: Remove verify_consistent_lock_order [v8] In-Reply-To: References: Message-ID: On Mon, 29 Apr 2024 13:52:23 GMT, Axel Boldt-Christmas wrote: >> The verification added in [JDK-8329757](https://bugs.openjdk.org/browse/JDK-8329757) will not work when deoptimization occurs on a monitorenter bytecode. The locking may be in a transitional state. Because the correct solution is still not obvious and this test is currently only causing false positives, remove the verification for now. >> >> Redo this verification in [JDK-8331307](https://bugs.openjdk.org/browse/JDK-8331307). > > Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 11 additional commits since the last revision: > > - Merge tag 'jdk-23+20' into JDK-8330164 > > Added tag jdk-23+20 for changeset 87e864bf > - Revert verification > - Revert "repro for JDK-8330253" > > This reverts commit 10d70ea18c5bad0627b9eaae88fa8c96e436cb3b. > - Fix condition > - Use raw bytecode read for previous bytecode > - repro for JDK-8330253 > - Whitespace > - Spelling and typos > - Handle previous bc being monitorenter > - Remove implicit conditions > - ... and 1 more: https://git.openjdk.org/jdk/compare/d20db530...d913ecbe Marked as reviewed by pchilanomate (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/18782#pullrequestreview-2028638432 From dnsimon at openjdk.org Mon Apr 29 14:37:15 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Mon, 29 Apr 2024 14:37:15 GMT Subject: RFR: 8330253: Remove verify_consistent_lock_order [v8] In-Reply-To: References: Message-ID: On Mon, 29 Apr 2024 13:52:23 GMT, Axel Boldt-Christmas wrote: >> The verification added in [JDK-8329757](https://bugs.openjdk.org/browse/JDK-8329757) will not work when deoptimization occurs on a monitorenter bytecode. The locking may be in a transitional state. Because the correct solution is still not obvious and this test is currently only causing false positives, remove the verification for now. >> >> Redo this verification in [JDK-8331307](https://bugs.openjdk.org/browse/JDK-8331307). > > Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 11 additional commits since the last revision: > > - Merge tag 'jdk-23+20' into JDK-8330164 > > Added tag jdk-23+20 for changeset 87e864bf > - Revert verification > - Revert "repro for JDK-8330253" > > This reverts commit 10d70ea18c5bad0627b9eaae88fa8c96e436cb3b. > - Fix condition > - Use raw bytecode read for previous bytecode > - repro for JDK-8330253 > - Whitespace > - Spelling and typos > - Handle previous bc being monitorenter > - Remove implicit conditions > - ... and 1 more: https://git.openjdk.org/jdk/compare/d6848d7d...d913ecbe Marked as reviewed by dnsimon (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/18782#pullrequestreview-2028698523 From kvn at openjdk.org Mon Apr 29 14:54:06 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 29 Apr 2024 14:54:06 GMT Subject: RFR: 8331087: Move immutable nmethod data from CodeCache [v2] In-Reply-To: References: Message-ID: On Mon, 29 Apr 2024 06:33:09 GMT, Tobias Hartmann wrote: > Looks good to me. Did you measure any impact on performance (potentially due to improved code density)? Thank you, @TobiHartmann, for review. > What's left for [JDK-7072317](https://bugs.openjdk.org/browse/JDK-7072317)? Make Relocation info (10% of nmethod size) immutable by moving all encoded pointers (external words and others, which we need to patch in Leyden when loading cached code) from it into separate mutable section. > I wonder if the `CHECKED_CAST` changes shouldn't go into a separate RFE. No, I want clear information which cast failed instead of trying to reproduce very intermittent failure like this: [JDK-8331253](https://bugs.openjdk.org/browse/JDK-8331253). When you have several `checked_cast` in one method it is impossible to find which failed without this macro. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18984#issuecomment-2082953366 From duke at openjdk.org Mon Apr 29 15:17:11 2024 From: duke at openjdk.org (Liming Liu) Date: Mon, 29 Apr 2024 15:17:11 GMT Subject: Integrated: 8324776: runtime/os/TestTransparentHugePageUsage.java fails with The usage of THP is not enough In-Reply-To: References: Message-ID: On Tue, 16 Apr 2024 08:57:48 GMT, Liming Liu wrote: > This PR remove the testcase introduced in JDK-8315923, as we could not find a reliable way to evaluate the usage of THP. We have tried the following methods: > > 1. 
traverse /proc/self/smaps rather than looking up the first map covered by the heap, as we found there can be multiple sections in /proc/self/smaps for the heap; (https://github.com/limingliu-ampere/jdk/commit/c5b0c4cdf9fa42988faa9fee6ee004ebb599d40a) > 2. take the defrag mode and the enabling of khugepaged into account rather than just THP mode, as THP may not be available immediately when the defrag mode is neither "always" nor "madvise", or khugepaged does not collapse pages; (https://github.com/limingliu-ampere/jdk/commit/9c70e9384325b44e074a9e8973846343b27fd2cc) > 3. call madvise with MADV_HUGEPAGE unconditionally rather than calling it only when THP mode is not "always", and adjust the sizes of young and old generations to ensure the parameters are aligned with THP; (https://github.com/limingliu-ampere/jdk/commit/de9607ff64cc526bca9968b72a7065888c2f944d) > 4. check the changes of system-wide counters like thp_* in /proc/vmstat before and after pretouch via gtest. (https://github.com/limingliu-ampere/jdk/commit/bc83e19a682156ee7d09bf939c2b18f3d8c79e22) > > But none of them helps. The amount of THP stays at zero on Oracle CI, although the THP mode is "always", the defrag mode is "madvise" and khugepaged is enabled. Furthermore, none of the thp counters changed around pretouch. However, we tried the same kernel (5.15-UEK) as Oracle CI on our machine, and found that these methods do help. Thus, we decided to remove this testcase. This pull request has now been integrated.
Changeset: 8b8fb642 Author: Liming Liu Committer: Thomas Stuefe URL: https://git.openjdk.org/jdk/commit/8b8fb6427e3cbc16b818ddcbd6a971f3d2370f94 Stats: 106 lines in 2 files changed: 0 ins; 106 del; 0 mod 8324776: runtime/os/TestTransparentHugePageUsage.java fails with The usage of THP is not enough Reviewed-by: stuefe ------------- PR: https://git.openjdk.org/jdk/pull/18792 From kvn at openjdk.org Mon Apr 29 16:02:17 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 29 Apr 2024 16:02:17 GMT Subject: Integrated: 8331087: Move immutable nmethod data from CodeCache In-Reply-To: References: Message-ID: On Fri, 26 Apr 2024 21:16:03 GMT, Vladimir Kozlov wrote: > Move immutable nmethod's data from CodeCache to C heap. It includes `dependencies, nul_chk_table, handler_table, scopes_pcs, scopes_data, speculations`. It amounts for about 30% (optimized VM) of space in CodeCache. > > Use HotSpot's `os::malloc()` to allocate memory in C heap for immutable nmethod's data. Call `vm_exit_out_of_memory()` if allocation failed. > > Shuffle fields order and change some fields size from 4 to 2 bytes to avoid nmethod's header size increase. > > Tested tier1-5, stress,xcomp > > Our performance testing does not show difference. > > Example of updated `-XX:+PrintNMethodStatistics` output is in JBS comment. This pull request has now been integrated. 
Changeset: bdcc2400 Author: Vladimir Kozlov URL: https://git.openjdk.org/jdk/commit/bdcc2400db63e604d76f9b5bd3c876271743f69f Stats: 311 lines in 5 files changed: 163 ins; 42 del; 106 mod 8331087: Move immutable nmethod data from CodeCache Reviewed-by: thartmann, dlong, dnsimon ------------- PR: https://git.openjdk.org/jdk/pull/18984 From kvn at openjdk.org Mon Apr 29 16:02:17 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 29 Apr 2024 16:02:17 GMT Subject: RFR: 8331087: Move immutable nmethod data from CodeCache [v2] In-Reply-To: References: Message-ID: On Sun, 28 Apr 2024 23:37:22 GMT, Vladimir Kozlov wrote: >> Move immutable nmethod's data from CodeCache to C heap. It includes `dependencies, nul_chk_table, handler_table, scopes_pcs, scopes_data, speculations`. It amounts for about 30% (optimized VM) of space in CodeCache. >> >> Use HotSpot's `os::malloc()` to allocate memory in C heap for immutable nmethod's data. Call `vm_exit_out_of_memory()` if allocation failed. >> >> Shuffle fields order and change some fields size from 4 to 2 bytes to avoid nmethod's header size increase. >> >> Tested tier1-5, stress,xcomp >> >> Our performance testing does not show difference. >> >> Example of updated `-XX:+PrintNMethodStatistics` output is in JBS comment. > > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Address comments. Moved jvmci_data back to mutable data section. Thank you, Dean, Doug and Tobias for reviews. 
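As a rough illustration of the idea in this change (keep the immutable sections in one C-heap block and refer to them through small offsets, exiting if the allocation fails), here is a minimal sketch. It is not the nmethod implementation: `ToyMethod`, `toy_make_method`, `toy_free_method` and `toy_exit_out_of_memory` are hypothetical names, plain `std::malloc` stands in for `os::malloc()`, the exit helper stands in for `vm_exit_out_of_memory()` (whose real signature differs), and the choice of which fields are 2 bytes wide is an assumption made only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for a fatal out-of-memory exit.
[[noreturn]] inline void toy_exit_out_of_memory(std::size_t bytes) {
  std::fprintf(stderr, "out of memory allocating %zu bytes of immutable data\n", bytes);
  std::exit(1);
}

// Toy header: the mutable part would live in the code cache, while the
// immutable sections live in a single block on the C heap.
struct ToyMethod {
  std::uint8_t* immutable_data = nullptr;  // C-heap block holding all immutable sections
  std::uint16_t scopes_offset = 0;         // start of the second section inside the block
  std::uint16_t immutable_size = 0;        // total size of the block in bytes
};

// Copy two immutable sections into one C-heap block.
inline ToyMethod toy_make_method(const std::uint8_t* deps, std::size_t deps_size,
                                 const std::uint8_t* scopes, std::size_t scopes_size) {
  const std::size_t total = deps_size + scopes_size;
  assert(total <= UINT16_MAX && "sections must fit the 2-byte fields of this sketch");
  ToyMethod m;
  if (total == 0) {
    return m;  // nothing to allocate
  }
  m.immutable_data = static_cast<std::uint8_t*>(std::malloc(total));
  if (m.immutable_data == nullptr) {
    toy_exit_out_of_memory(total);
  }
  if (deps_size > 0) {
    std::memcpy(m.immutable_data, deps, deps_size);
  }
  if (scopes_size > 0) {
    std::memcpy(m.immutable_data + deps_size, scopes, scopes_size);
  }
  m.scopes_offset = static_cast<std::uint16_t>(deps_size);
  m.immutable_size = static_cast<std::uint16_t>(total);
  return m;
}

// Release the C-heap block and reset the header fields.
inline void toy_free_method(ToyMethod& m) {
  std::free(m.immutable_data);
  m.immutable_data = nullptr;
  m.scopes_offset = 0;
  m.immutable_size = 0;
}
```

The narrow offsets are what keep the header from growing when a pointer to the external block is added; the real change makes that trade-off across many more fields than this sketch shows.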
------------- PR Comment: https://git.openjdk.org/jdk/pull/18984#issuecomment-2083102730 From jkern at openjdk.org Mon Apr 29 16:20:14 2024 From: jkern at openjdk.org (Joachim Kern) Date: Mon, 29 Apr 2024 16:20:14 GMT Subject: RFR: 8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: References: <-XeYeJ0OEmauTYsEoSXxzRmQXSKMOLw87GSpqDnEmug=.5cb7e71f-fea6-4a84-8260-5f515d3d3810@github.com> <18WjPZeDIWkxGIB0BJgyDg5VipCtY4EOlWmIGPWZGCw=.b50cf4a9-61a4-421e-97eb-3dbac94c14f9@github.com> <_xcaF7UUDHA11loD89Dz871vAQgRqMzCdPkahFDfKv8=.a2c6dcbe-5942-4fb7-9d8b-4239ea048e56@github.com> <76P7uKTuqo7IKYr5yBP4Vx1SS0AcEXC_6vDAU6LfIzo=.d939556f-6fab-4009-820b-821376bfdb7c@github.com> <6aR5nvKhz28A1CkxtaAD9CwTjILBjwZrrRwP3988oEc=.72203104-2ae5-40ff-bd87-168b684446e6@ github.com> Message-ID: On Thu, 18 Apr 2024 04:26:21 GMT, Kim Barrett wrote: >> I opened https://bugs.openjdk.org/browse/JDK-8330539 so we don't lose track of this, but we can keep the discussion/voting here. > > For the impatient, I suggest adopting mechanism 2, i.e. unconditionally > include in globalDefinitions_gcc.hpp. > > We can't include in shared code, and there is a use in shared code > (in the relatively recently added JavaThread::pretouch_stack). > > When I questioned whether we needed to include at all, I referred > to a Linux man page I'd found on the internet (the same page mdoerr linked > to), which says (in part) > > "By default, modern compilers automatically translate all uses of alloca() > into the built-in ..." > > Apparently I should have kept digging, because it seems that page is > old/incorrect. A seemingly more recent Linux man page describes a different > way of handling it that is closer to what we're seeing, but still not quite > correct. > > glibc's includes if __USE_MISC is defined. > One of the ways __USE_MISC can become defined is if _GNU_SOURCE is defined, > and we define that for both gcc and clang toolchains. > > We include in globalDefinitions_gcc.hpp. 
So when building with gcc, > globalDefinitions.hpp implicitly includes . > > The glibc definition of alloca is > > #ifdef __GNUC__ > # define alloca(size) __builtin_alloca (size) > #endif /* GCC. */ > > So that explains why we don't need any explicit include of when > building with gcc. I expect there's something similar going on with Visual > Studio and Xcode/clang. But apparently not with Open XLC clang. On AIX, `stdlib.h` would also define `alloca` if `__STRICT_ANSI__` were not set:
#if !defined(__xlC__) || defined(__ibmxl__) || defined(__cplusplus)
#if defined(__IBMCPP__) && !defined(__ibmxl__)
extern "builtin" char *__alloca (size_t);
# define alloca __alloca
#elif defined(__GNUC__) && !defined(__STRICT_ANSI__)
#undef alloca
#define alloca(size) __builtin_alloca (size)
#endif
A small, plain test program that does not use all of the flags we use in the JDK build does not set `__STRICT_ANSI__`, and then `alloca` is defined correctly. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1583360569 From aboldtch at openjdk.org Mon Apr 29 17:17:14 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Mon, 29 Apr 2024 17:17:14 GMT Subject: RFR: 8330253: Remove verify_consistent_lock_order [v8] In-Reply-To: References: Message-ID: On Mon, 29 Apr 2024 13:52:23 GMT, Axel Boldt-Christmas wrote: >> The verification added in [JDK-8329757](https://bugs.openjdk.org/browse/JDK-8329757) will not work when deoptimization occurs on a monitorenter bytecode. The locking may be in a transitional state. Because the correct solution is still not obvious and this test is currently only causing false positives, remove the verification for now. >> >> Redo this verification in [JDK-8331307](https://bugs.openjdk.org/browse/JDK-8331307). >> >> Removal tested GHA and tier 1-3. > > Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase.
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 11 additional commits since the last revision: > > - Merge tag 'jdk-23+20' into JDK-8330164 > > Added tag jdk-23+20 for changeset 87e864bf > - Revert verification > - Revert "repro for JDK-8330253" > > This reverts commit 10d70ea18c5bad0627b9eaae88fa8c96e436cb3b. > - Fix condition > - Use raw bytecode read for previous bytecode > - repro for JDK-8330253 > - Whitespace > - Spelling and typos > - Handle previous bc being monitorenter > - Remove implicit conditions > - ... and 1 more: https://git.openjdk.org/jdk/compare/1298de47...d913ecbe Thanks for the reviews. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18782#issuecomment-2083247933 From aboldtch at openjdk.org Mon Apr 29 17:17:14 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Mon, 29 Apr 2024 17:17:14 GMT Subject: Integrated: 8330253: Remove verify_consistent_lock_order In-Reply-To: References: Message-ID: On Mon, 15 Apr 2024 09:15:44 GMT, Axel Boldt-Christmas wrote: > The verification added in [JDK-8329757](https://bugs.openjdk.org/browse/JDK-8329757) will not work when deoptimization occurs on a monitorenter bytecode. The locking may be in a transitional state. Because the correct solution is still not obvious and this test is currently only causing false positives, remove the verification for now. > > Redo this verification in [JDK-8331307](https://bugs.openjdk.org/browse/JDK-8331307). > > Removal tested GHA and tier 1-3. This pull request has now been integrated. 
Changeset: 9b423a85 Author: Axel Boldt-Christmas URL: https://git.openjdk.org/jdk/commit/9b423a8509d6bf8a76297d74aaaea40613f5f2ae Stats: 70 lines in 3 files changed: 0 ins; 70 del; 0 mod 8330253: Remove verify_consistent_lock_order Co-authored-by: Patricio Chilano Mateo Reviewed-by: dcubed, pchilanomate, dnsimon ------------- PR: https://git.openjdk.org/jdk/pull/18782 From shade at openjdk.org Mon Apr 29 17:40:05 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 29 Apr 2024 17:40:05 GMT Subject: RFR: 8328934: Assert that ABS input and output are legal [v3] In-Reply-To: References: <17gipHM6B5g7uDlXUwE1lpgXSPKbkOeZAPd60uiEzgY=.c29f487e-1871-493e-9555-faea3c995068@github.com> Message-ID: On Fri, 12 Apr 2024 22:08:42 GMT, Andrew Haley wrote: >>> > Caught some failures, which made me think we want richer diagnostics around this. With new version, we print stuff like: >>> > ``` >>> > # Internal Error (/Users/shipilev/Work/shipilev-jdk/src/hotspot/share/opto/loopnode.cpp:2965), pid=32195, tid=27139 >>> > # Error: ABS: argument should not allow overflow >>> > ``` >>> >>> LOL, don't say you weren't warned! ;-) >>> >>> ``` >>> T res = (x < 0 && x != std::numeric_limits::min()) ? -x : x; >>> ``` >> >> I mean, we catch the proper error in some tests: https://bugs.openjdk.org/browse/JDK-8330158 >> Do we really need to do this `x != std::numeric_limits::min()` dance here? > >> > ``` >> > T res = (x < 0 && x != std::numeric_limits::min()) ? -x : x; >> > ``` >> >> I mean, we catch the proper error in some tests: https://bugs.openjdk.org/browse/JDK-8330158 Do we really need to do this `x != std::numeric_limits::min()` dance here? > > I think so. Several of us have worked on eliminating undefined behaviour in HotSpot and we've made good progress. I think it would be sad for new UB to be pushed now, especially in a case like this when it wouldn't be accidental. UB is just something we have to deal with, because C++. 
:-( All tests, including aggressive compiler tests, are passing. I see no test failures on new asserts anymore. There are a few Fuzzer test failures due to #19001, which I don't think hide any new assert triggers. @theRealAph @dean-long -- still good with this? If so, I'll integrate. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18751#issuecomment-2083289046 From szaldana at openjdk.org Mon Apr 29 18:12:12 2024 From: szaldana at openjdk.org (Sonia Zaldana Calles) Date: Mon, 29 Apr 2024 18:12:12 GMT Subject: RFR: 8326085: Remove unnecessary UpcallContext constructor Message-ID: Hi all, This PR removes the explicit constructor to UpcallContext (hotspot/share/prims/upcallLinker.cpp) that was added as workaround for [8286891](https://bugs.openjdk.org/browse/JDK-8286891). The minimum required version of XLC has since been bumped in [8325880](https://bugs.openjdk.org/browse/JDK-8325880), so we can remove this. Thanks, Sonia ------------- Commit messages: - 8326085: Remove unnecessary UpcallContext constructor Changes: https://git.openjdk.org/jdk/pull/18982/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18982&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8326085 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18982.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18982/head:pull/18982 PR: https://git.openjdk.org/jdk/pull/18982 From gziemski at openjdk.org Mon Apr 29 20:52:12 2024 From: gziemski at openjdk.org (Gerard Ziemski) Date: Mon, 29 Apr 2024 20:52:12 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v54] In-Reply-To: References: Message-ID: On Mon, 29 Apr 2024 12:32:39 GMT, Johan Sj?len wrote: >> Hi, >> >> This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. 
This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix. >> >> ## `MemoryFileTracker` >> >> The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: >> >> ```c++ >> static MemoryFile* make_device(const char* descriptive_name); >> static void free_device(MemoryFile* device); >> >> static void allocate_memory(MemoryFile* device, size_t offset, size_t size, >> MEMFLAGS flag, const NativeCallStack& stack); >> static void free_memory(MemoryFile* device, size_t offset, size_t size); >> >> >> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: >> >> ```c++ >> void ZNMT::reserve(zaddress_unsafe start, size_t size) { >> MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); >> } >> void ZNMT::commit(zoffset offset, size_t size) { >> MemTracker::allocate_memory_in(ZNMT::_device, static_cast(offset), size, mtJavaHeap, CALLER_PC); >> } >> void ZNMT::uncommit(zoffset offset, size_t size) { >> MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); >> } >> >> void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { >> // NMT doesn't track mappings at the moment. >> } >> void ZNMT::unmap(zaddress_unsafe addr, size_t size) { >> // NMT doesn't track mappings at the moment. >> } >> >> >> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. 
>> >> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: >> >> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. Our Treap-based approach in this patch gives a performance bo... > > Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: > > assert device != nullptr in MemoryFileTracker::instance Still working on it and have more to go over, but here is the feedback for now. src/hotspot/share/nmt/nmtMemoryFileTracker.cpp line 53: > 51: summary->reserve_memory(diff.flag[i].reserve); > 52: } > 53: } I'm probably missing something here: Why do we need to mark the reservation for all NMT flag types, when the API takes a single flag as argument here? If I were later to iterate over all flags, all summaries would show the same reservation? Isn't that simply wrong? src/hotspot/share/nmt/nmtNativeCallStackStorage.hpp line 52: > 50: }; > 51: NativeCallStack* put(const NativeCallStack& value) { > 52: int bucket = value.calculate_hash() % nr_buckets; `calculate_hash()` is: for (int i = 0; i < NMT_TrackingStackDepth; i++) { hash += (uintptr_t)_stack[i]; } Wouldn't XOR serve us better here than plain "+"? src/hotspot/share/nmt/nmtNativeCallStackStorage.hpp line 67: > 65: // 4096 buckets ensures that probability of collision is 50% at approximately 64 > 66: // different call stacks. > 67: static const constexpr int nr_buckets = 4096; Shouldn't that optimally be a prime number, e.g. 4099? (ideally a Mersenne prime, but there is one at 127 and the next one is 8191) src/hotspot/share/nmt/vmatree.cpp line 29: > 27: #include "utilities/growableArray.hpp" > 28: > 29: VMATree::SummaryDiff VMATree::register_mapping(size_t A, size_t B, StateType state, Does `VMATree` stand for "Virtual Memory Allocation Tree"?
We already have some long names in NMT, e.g. `NativeCallStackStorage`, `MemoryFileTracker`, `MallocSiteHashtableEntry`, `MemSummaryDiffReporter`. Can we then name it `VirtualMemAllocationTree`, `VirtualMemAllocTree`, or `VirtMemAllocTree`? ------------- PR Review: https://git.openjdk.org/jdk/pull/18289#pullrequestreview-2028962210 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1583719980 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1583356517 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1583385203 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1583635036 From kbarrett at openjdk.org Mon Apr 29 22:57:04 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 29 Apr 2024 22:57:04 GMT Subject: RFR: 8330694: Rename 'HeapRegion' to 'G1HeapRegion' [v4] In-Reply-To: References: <3IdWn9VGEERd8v9RcH2E_LzjVo0L8nMfi5jGWmhgVuM=.6b5b3be4-bfbd-4376-9580-48d78d75665c@github.com> Message-ID: On Mon, 29 Apr 2024 08:14:03 GMT, Thomas Schatzl wrote: > mach5 higher tier SA tests are fine. What are your plans for the remaining SA renames (would highly recommend to add) and the G1HeapRegion related helper classes? I suggest the related helper classes be done in further followups, rather than making this change even larger. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18871#issuecomment-2083822517 From kbarrett at openjdk.org Mon Apr 29 23:01:05 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 29 Apr 2024 23:01:05 GMT Subject: RFR: 8331193: Return references when possible in GrowableArray [v3] In-Reply-To: References: Message-ID: On Fri, 26 Apr 2024 14:46:08 GMT, Johan Sjölen wrote: >> Hi, >> >> This PR introduces the possibility of using references more often when using GrowableArray, whereas previously this was only possible when using the `at()` method. This lets us avoid copying and redundant method calls and makes the API more streamlined.
After the patch, we can use `at_grow` the same way as `at`. The same goes for `top`, `first`, and `last`. >> >> >> Some example code: >> ```c++ >> // Before this patch this worked: >> GrowableArray<int> arr(8,8,-1); // Pre-fill with 8 -1s >> int& x = arr.at(7); >> if (x == -1) { >> x = 2; >> } >> assert(arr.at(7) == 2, "this holds"); >> // but this was forbidden >> int& x = arr.at_grow(9, -1); // Compilation error! at_grow returns E, not E& >> // so we had to do >> int x = arr.at_grow(9, -1); >> if (x == -1) { >> arr.at_put(9, 2); >> } >> ``` >> >> Thanks. > > Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: > > Small mistakes - FIXED! src/hotspot/share/utilities/growableArray.hpp line 153: > 151: E* adr_at(int i) const { > 152: assert(0 <= i && i < _len, "illegal index %d for length %d", i, _len); > 153: return &_data[i]; (GitHub won't let me put a comment on the `adr_at` signature.) I think there should similarly be const and non-const adr_at, returning pointers to const and non-const respectively. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18975#discussion_r1583880213 From dlong at openjdk.org Mon Apr 29 23:15:09 2024 From: dlong at openjdk.org (Dean Long) Date: Mon, 29 Apr 2024 23:15:09 GMT Subject: RFR: 8328934: Assert that ABS input and output are legal [v6] In-Reply-To: <9r6p7oHNH_9hg_jzOxFh2lKDDS8a1hwTRIIDjfCWOeU=.a30a43cb-426b-4d66-93d6-bb55ab2a8445@github.com> References: <9r6p7oHNH_9hg_jzOxFh2lKDDS8a1hwTRIIDjfCWOeU=.a30a43cb-426b-4d66-93d6-bb55ab2a8445@github.com> Message-ID: <9b7DiQJkRkt7iaW9IcfU1TKDltvcSI9PIg6giH4aI5Y=.9fea2344-c13c-4a9a-aab8-eb94eacae1c7@github.com> On Mon, 29 Apr 2024 08:18:51 GMT, Aleksey Shipilev wrote: >> This should protect us from future accidents around `abs` misuse. We have fixed a few separately. I plan to use this as the litmus test in update releases to detect missing backports for actual fixes.
I am running more tests to see if we have any other sightings in current codebase, but this can be reviewed for sanity meanwhile. >> >> Additional testing: >> - [x] MacOS AArch64 server fastdebug build passes >> - [x] Linux x86_64 server fastdebug, `all` >> - [x] Linux x86_64 server fastdebug, 100K Fuzzer tests >> - [x] Linux x86_64 server fastdebug, Maven CTW > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision: > > - Merge branch 'master' into JDK-8328934-abs-legal > - Also tests > - Drop the other check; dodge UB > - More straightforward > - Richer error reporting > - Only assert integral type arguments > - Need explicit include as well > - Fix Yes, still good. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18751#issuecomment-2083839172 From dlong at openjdk.org Mon Apr 29 23:52:04 2024 From: dlong at openjdk.org (Dean Long) Date: Mon, 29 Apr 2024 23:52:04 GMT Subject: RFR: 8330817: jdk/internal/vm/Continuation/OSRTest.java times out on libgraal In-Reply-To: References: Message-ID: On Mon, 22 Apr 2024 21:55:38 GMT, Patricio Chilano Mateo wrote: > Small test fix to prevent inlining of foo/fooBigFrame. Tested with Graal repo and verified timeout doesn't happen anymore. > > Thanks, > Patricio Makes sense. ------------- Marked as reviewed by dlong (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18905#pullrequestreview-2029913585 From sspitsyn at openjdk.org Tue Apr 30 01:38:05 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 30 Apr 2024 01:38:05 GMT Subject: RFR: 8330969: scalability issue with loaded JVMTI agent [v2] In-Reply-To: References: Message-ID: <0o3TzcU6vozIPUYiF7iG9ZZ2t1HFNrJMsn2eivviKJU=.be506b1b-bec9-45ab-bd19-1101be512be8@github.com> On Fri, 26 Apr 2024 19:38:40 GMT, Chris Plummer wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> review: fixed minor issues: renamed function, corrected comment, removed typo in assert > > src/hotspot/share/prims/jvmtiThreadState.cpp line 433: > >> 431: // Avoid using MonitorLocker on performance critical path, use >> 432: // two-level synchronization with lock-free operations on counters. >> 433: assert(!thread->VTMS_transition_mark(), "sanity check"); > > The "counters" comment needs to be updated. Nice catch, thanks. Fixed now. > src/hotspot/share/prims/jvmtiThreadState.cpp line 456: > >> 454: // Slow path: undo unsuccessful optimistic counter incrementation. >> 455: // It can cause an extra waiting cycle for VTMS transition disablers. >> 456: thread->set_VTMS_transition_mark(false); > > The "optimistic counter incrementation" comment needs updating. Nice catch, thanks. Fixed now. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18937#discussion_r1584018624 PR Review Comment: https://git.openjdk.org/jdk/pull/18937#discussion_r1584018405 From duke at openjdk.org Tue Apr 30 01:38:15 2024 From: duke at openjdk.org (jjscl8888) Date: Tue, 30 Apr 2024 01:38:15 GMT Subject: RFR: 8319548: Unexpected internal name for Filler array klass causes error in VisualVM In-Reply-To: References: Message-ID: On Mon, 29 Apr 2024 07:47:35 GMT, Thomas Schatzl wrote: > > I observed a phenomenon on our application. 
When there is no traffic on a certain instance, the number of old generation objects suddenly increases at a certain moment. After dumping the object instances, I found a large number of jdk.internal.vm.FillerArray objects, occupying more than 10G of memory. Have you ever encountered this? > > You probably did the heap dump right after G1 managed to free lots of memory - these `FillerArray` elements represent unused memory within regions. > > Previously one would have seen a huge amount of int-arrays (`[I`) staying around. > > Normally G1 would then incrementally reduce these `FillerArray`s in subsequent mixed collections (after marking etc) by evacuating the live objects between these filler objects, making these regions completely empty. > > From the given output of `jmap` and not knowing what else is going on with the GC this behavior seems normal. ![image](https://github.com/openjdk/jdk/assets/32790117/55cef27f-06f0-435c-b60f-3f5e48a5c5b4) Thank you for your clarification. if the instance in question had no traffic but you observed a sudden increase in the old generation size at 2:35 in the graph, and subsequent garbage collections (GCs) did not reduce the size of the old generation back to its original value ------------- PR Comment: https://git.openjdk.org/jdk/pull/17155#issuecomment-2084098768 From sspitsyn at openjdk.org Tue Apr 30 01:52:09 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 30 Apr 2024 01:52:09 GMT Subject: RFR: 8330969: scalability issue with loaded JVMTI agent [v2] In-Reply-To: References: Message-ID: On Fri, 26 Apr 2024 19:34:55 GMT, Chris Plummer wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> review: fixed minor issues: renamed function, corrected comment, removed typo in assert > > src/hotspot/share/prims/jvmtiThreadState.cpp line 366: > >> 364: attempts--; >> 365: } >> 366: DEBUG_ONLY(if (attempts == 0) break;) > > Previously 
`_VTMS_transition_count` considered all threads at the same time. Now you are iterating through the threads and looking at a flag in each one. Is it guaranteed that once the `_VTMS_transition_mark` flag has been verified not to be set in a thread it won't get set while still iterating in the threads loop? Thank you for the comment. It is thinking in a right direction. Each `JavaThread` set the `VTM_transition_mark` only once and then checks for disable counters: - `_VTMS_transition_disable_for_all_count` - `java_lang_Thread::VTMS_transition_disable_count(vth())` If any of the disable counters is not zero then each `JavaThread` clears the optimistically set mark and continues under protection of the `JvmtiVTMSTransition_lock`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18937#discussion_r1584025182 From sspitsyn at openjdk.org Tue Apr 30 01:56:13 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 30 Apr 2024 01:56:13 GMT Subject: RFR: 8330969: scalability issue with loaded JVMTI agent [v3] In-Reply-To: References: Message-ID: > This is a fix of the following JVMTI scalability issue. A closed benchmark with millions of virtual threads shows 3X-4X overhead when a JVMTI agent has been loaded. For instance, this is observable when an app is executed under control of the Oracle Studio `collect` utility. > For performance analysis, experiments and numbers, please, see the comment below this description. > > The fix is to replace the global counter `_VTMS_transition_count` with the mark bit `_VTMS_transition_mark` in each `JavaThread`'. 
> > Testing: > - Tested with mach5 tiers 1-6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: correct comments related to VTMS transition counters ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18937/files - new: https://git.openjdk.org/jdk/pull/18937/files/03bcfecb..173840b5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18937&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18937&range=01-02 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/18937.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18937/head:pull/18937 PR: https://git.openjdk.org/jdk/pull/18937 From jkratochvil at openjdk.org Tue Apr 30 02:06:28 2024 From: jkratochvil at openjdk.org (Jan Kratochvil) Date: Tue, 30 Apr 2024 02:06:28 GMT Subject: RFR: 8331352: error: template-id not allowed for constructor/destructor in C++20 Message-ID: When compiling trunk (819f3d6fc70ff6fe54ac5f9033c17c3dd4326aa5 2024-04-29) by gcc-14.0.1-0.15.fc40.x86_64 there are many errors: In file included from src/hotspot/share/memory/allocation.hpp:30, from src/hotspot/share/ci/ciBaseObject.hpp:29, from src/hotspot/share/ci/ciMetadata.hpp:28, from src/hotspot/share/ci/ciType.hpp:28, from src/hotspot/share/ci/ciKlass.hpp:28, from src/hotspot/share/ci/ciArrayKlass.hpp:28, from src/hotspot/share/ci/ciArray.hpp:28, from src/hotspot/share/ci/compilerInterface.hpp:28, from src/hotspot/share/compiler/abstractCompiler.hpp:28, from src/hotspot/share/compiler/abstractCompiler.cpp:25: src/hotspot/share/utilities/linkedlist.hpp:85:15: error: template-id not allowed for constructor in C++20 [-Werror=template-id-cdtor] 85 | NONCOPYABLE(LinkedList); | ^~~~~~~~~~~~~ src/hotspot/share/utilities/globalDefinitions.hpp:87:26: note: in definition of macro ?NONCOPYABLE? 
87 | #define NONCOPYABLE(C) C(C const&) = delete; C& operator=(C const&) = delete /* next token must be ; */ | ^ src/hotspot/share/utilities/linkedlist.hpp:85:15: note: remove the ‘< >’ 85 | NONCOPYABLE(LinkedList); | ^~~~~~~~~~~~~ src/hotspot/share/utilities/globalDefinitions.hpp:87:26: note: in definition of macro ‘NONCOPYABLE’ 87 | #define NONCOPYABLE(C) C(C const&) = delete; C& operator=(C const&) = delete /* next token must be ; */ | ^ In file included from src/hotspot/share/gc/z/zGranuleMap.inline.hpp:30, from src/hotspot/share/gc/z/zForwardingTable.inline.hpp:32, from src/hotspot/share/gc/z/zHeap.inline.hpp:30, from src/hotspot/share/gc/z/zGeneration.inline.hpp:30, from src/hotspot/share/gc/z/zBarrier.inline.hpp:30, from src/hotspot/share/gc/z/zBarrierSet.inline.hpp:31, from src/hotspot/share/gc/shared/barrierSetConfig.inline.hpp:44, from src/hotspot/share/oops/access.inline.hpp:31, from src/hotspot/share/memory/iterator.inline.hpp:32, from src/hotspot/share/oops/oop.inline.hpp:31, from src/hotspot/share/compiler/abstractDisassembler.cpp:32: src/hotspot/share/gc/z/zArray.inline.hpp:99:21: error: template-id not allowed for destructor in C++20 [-Werror=template-id-cdtor] 99 | ZActivatedArray::~ZActivatedArray() { | ^ src/hotspot/share/gc/z/zArray.inline.hpp:99:21: note: remove the ‘< >’ In file included from src/hotspot/share/opto/bytecodeInfo.cpp:38: src/hotspot/share/utilities/events.hpp:102:18: error: template-id not allowed for constructor in C++20 [-Werror=template-id-cdtor] 102 | EventLogBase(const char* name, const char* handle, int length = LogEventsBufferEntries): | ^ src/hotspot/share/utilities/events.hpp:102:18: note: remove the ‘< >’
In file included from src/hotspot/share/classfile/metadataOnStackMark.hpp:29, from src/hotspot/share/classfile/classLoaderDataGraph.cpp:30: src/hotspot/share/utilities/chunkedList.hpp:47:20: error: template-id not allowed for constructor in C++20 [-Werror=template-id-cdtor] 47 | ChunkedList() : _top(_values), _next_used(nullptr), _next_free(nullptr) {} | ^ src/hotspot/share/utilities/chunkedList.hpp:47:20: note: remove the ?< >? ------------- Commit messages: - 8331352: error: template-id not allowed for constructor/destructor in C++20 Changes: https://git.openjdk.org/jdk/pull/19009/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=19009&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8331352 Stats: 4 lines in 4 files changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/19009.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/19009/head:pull/19009 PR: https://git.openjdk.org/jdk/pull/19009 From sspitsyn at openjdk.org Tue Apr 30 02:08:04 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 30 Apr 2024 02:08:04 GMT Subject: RFR: 8330852: All callers of JvmtiEnvBase::get_threadOop_and_JavaThread should pass current thread explicitly In-Reply-To: References: Message-ID: On Sat, 27 Apr 2024 00:01:16 GMT, Alex Menkov wrote: >> src/hotspot/share/prims/jvmtiEnvBase.cpp line 1976: >> >>> 1974: oop thread_obj = nullptr; >>> 1975: >>> 1976: jvmtiError err = JvmtiEnvBase::get_threadOop_and_JavaThread(tlh.list(), target, current, &java_thread, &thread_obj); >> >> I think a good cleanup would be to also replace `current` with `current_thread`, although I'm not sure how common each are. I see 3 `current` references in this webrev. > > Looks like in JVMTI `current_thread` is more common (and `current` is usually used in runtime :) The plan is to unify this with the approach used by the Runtime team. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18986#discussion_r1584032225 From sspitsyn at openjdk.org Tue Apr 30 02:08:03 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 30 Apr 2024 02:08:03 GMT Subject: RFR: 8330852: All callers of JvmtiEnvBase::get_threadOop_and_JavaThread should pass current thread explicitly In-Reply-To: References: Message-ID: On Fri, 26 Apr 2024 22:59:43 GMT, Alex Menkov wrote: > Some cleanup related to JvmtiEnvBase::get_threadOop_and_JavaThread method > > Testing: tier1-6 This looks good in general. Thank you for taking care of it. The Runtime team suggested using the name `current` for the current thread. The plan is to unify the JVMTI code as well. So, I'd suggest renaming all occurrences of `current_thread` that you touched to `current`. It will be a good step in the right direction. I have been doing this for some time already. ------------- PR Review: https://git.openjdk.org/jdk/pull/18986#pullrequestreview-2030037342 From jinguojie.jgj at alibaba-inc.com Tue Apr 30 03:24:03 2024 From: jinguojie.jgj at alibaba-inc.com (Jin Guojie) Date: Tue, 30 Apr 2024 11:24:03 +0800 Subject: Reply: Aarch64: CPU_Model support for Neoverse N1/N2/V1/V2 In-Reply-To: References: Message-ID: <18565b39-db0a-4d49-a5e5-fa52c5fa8e85.jinguojie.jgj@alibaba-inc.com> Hi Andrew, On 2024/4/18 Andrew Haley wrote: > On 4/18/24 03:29, Jin Guojie wrote: >> We wrote a patch to improve the definition of CPU models for Arm Neoverse. > Sure. My immediate reaction is that having separate categories for the Neoverse > CPUs is getting to be rather cumbersome. Clearly they have a lot in common, > and it would be nicer to be able to say things like > "if CPU is Arm.Neoverse" or "is Arm.Neoverse.V2" > but right now I can't think of a nice way to do that. Maybe a nested class hierarchy?
>> Could you please create an issue in the JDK Bug System (JBS), > I will, but let's have some ideas about what the result should be.
We have reworked the code style of the Neoverse CPU model definition. To achieve higher compiler compatibility, we used simple checks in vm_version_aarch64.hpp. We also analyzed vm_version_x86.hpp and did not find the "nested class hierarchy" syntax you mentioned. x86 determines the CPU type by defining a set of is_xxx() functions, which is the style we use in the patch below. The main program (vm_version_aarch64.cpp) looks more concise now. Looking forward to your suggestions. Thanks.
-- Jin Guojie (Alibaba, hotspot developer)
diff --git a/src/hotspot/cpu/aarch64/vm_version_aarch64.cpp b/src/hotspot/cpu/aarch64/vm_version_aarch64.cpp
index 18f310c746c..5fc2b5cee2d 100644
--- a/src/hotspot/cpu/aarch64/vm_version_aarch64.cpp
+++ b/src/hotspot/cpu/aarch64/vm_version_aarch64.cpp
@@ -212,13 +212,7 @@ void VM_Version::initialize() {
     }
   }
 
-  // Neoverse
-  // N1: 0xd0c
-  // N2: 0xd49
-  // V1: 0xd40
-  // V2: 0xd4f
-  if (_cpu == CPU_ARM && (model_is(0xd0c) || model_is(0xd49) ||
-                          model_is(0xd40) || model_is(0xd4f))) {
+  if (is_neoverse_family()) {
     if (FLAG_IS_DEFAULT(UseSIMDForMemoryOps)) {
       FLAG_SET_DEFAULT(UseSIMDForMemoryOps, true);
     }
@@ -247,10 +241,7 @@ void VM_Version::initialize() {
     FLAG_SET_DEFAULT(UseCRC32, false);
   }
 
-  // Neoverse
-  // V1: 0xd40
-  // V2: 0xd4f
-  if (_cpu == CPU_ARM && (model_is(0xd40) || model_is(0xd4f))) {
+  if (is_neoverse_v_series()) {
     if (FLAG_IS_DEFAULT(UseCryptoPmullForCRC32)) {
       FLAG_SET_DEFAULT(UseCryptoPmullForCRC32, true);
     }
diff --git a/src/hotspot/cpu/aarch64/vm_version_aarch64.hpp b/src/hotspot/cpu/aarch64/vm_version_aarch64.hpp
index f6cac72804f..323b7e8e151 100644
--- a/src/hotspot/cpu/aarch64/vm_version_aarch64.hpp
+++ b/src/hotspot/cpu/aarch64/vm_version_aarch64.hpp
@@ -114,6 +114,13 @@ enum Ampere_CPU_Model {
   CPU_MODEL_AMPERE_1B = 0xac5 /* AMPERE_1B core Implements ARMv8.7 with CSSC, MTE, SM3/SM4 extensions */
 };
 
+enum Neoverse_CPU_Model {
+  CPU_MODEL_NEOVERSE_N1 = 0xd0c,
+  CPU_MODEL_NEOVERSE_N2 = 0xd49,
+  CPU_MODEL_NEOVERSE_V1 = 0xd40,
+  CPU_MODEL_NEOVERSE_V2 = 0xd4f,
+};
+
 #define CPU_FEATURE_FLAGS(decl) \
   decl(FP, fp, 0) \
   decl(ASIMD, asimd, 1) \
@@ -156,6 +163,22 @@ enum Ampere_CPU_Model {
     return _model == cpu_model || _model2 == cpu_model;
   }
 
+  static bool is_neoverse_family() {
+    return _cpu == CPU_ARM
+      && (model_is(CPU_MODEL_NEOVERSE_N1) || model_is(CPU_MODEL_NEOVERSE_N2) ||
+          model_is(CPU_MODEL_NEOVERSE_V1) || model_is(CPU_MODEL_NEOVERSE_V2));
+  }
+
+  static bool is_neoverse_n_series() {
+    return is_neoverse_family() &&
+           (model_is(CPU_MODEL_NEOVERSE_N1) || model_is(CPU_MODEL_NEOVERSE_N2));
+  }
+
+  static bool is_neoverse_v_series() {
+    return is_neoverse_family() &&
+           (model_is(CPU_MODEL_NEOVERSE_V1) || model_is(CPU_MODEL_NEOVERSE_V2));
+  }
+
   static bool is_zva_enabled() { return 0 <= _zva_length; }
   static int zva_length() {
     assert(is_zva_enabled(), "ZVA not available");
From dholmes at openjdk.org Tue Apr 30 03:42:04 2024 From: dholmes at openjdk.org (David Holmes) Date: Tue, 30 Apr 2024 03:42:04 GMT Subject: RFR: 8331352: error: template-id not allowed for constructor/destructor in C++20 In-Reply-To: References: Message-ID: On Tue, 30 Apr 2024 02:01:01 GMT, Jan Kratochvil wrote: > When compiling trunk (819f3d6fc70ff6fe54ac5f9033c17c3dd4326aa5 2024-04-29) by gcc-14.0.1-0.15.fc40.x86_64 there are many errors: > > In file included from src/hotspot/share/memory/allocation.hpp:30, > from src/hotspot/share/ci/ciBaseObject.hpp:29, > from src/hotspot/share/ci/ciMetadata.hpp:28, > from src/hotspot/share/ci/ciType.hpp:28, > from src/hotspot/share/ci/ciKlass.hpp:28, > from src/hotspot/share/ci/ciArrayKlass.hpp:28, > from src/hotspot/share/ci/ciArray.hpp:28, > from src/hotspot/share/ci/compilerInterface.hpp:28, > from src/hotspot/share/compiler/abstractCompiler.hpp:28, > from src/hotspot/share/compiler/abstractCompiler.cpp:25: >
src/hotspot/share/utilities/linkedlist.hpp:85:15: error: template-id not allowed for constructor in C++20 [-Werror=template-id-cdtor] > 85 | NONCOPYABLE(LinkedList); > | ^~~~~~~~~~~~~ > src/hotspot/share/utilities/globalDefinitions.hpp:87:26: note: in definition of macro ?NONCOPYABLE? > 87 | #define NONCOPYABLE(C) C(C const&) = delete; C& operator=(C const&) = delete /* next token must be ; */ > | ^ > src/hotspot/share/utilities/linkedlist.hpp:85:15: note: remove the ?< >? > 85 | NONCOPYABLE(LinkedList); > | ^~~~~~~~~~~~~ > src/hotspot/share/utilities/globalDefinitions.hpp:87:26: note: in definition of macro ?NONCOPYABLE? > 87 | #define NONCOPYABLE(C) C(C const&) = delete; C& operator=(C const&) = delete /* next token must be ; */ > | ^ > > In file included from src/hotspot/share/gc/z/zGranuleMap.inline.hpp:30, > from src/hotspot/share/gc/z/zForwardingTable.inline.hpp:32, > from src/hotspot/share/gc/z/zHeap.inline.hpp:30, > from src/hotspot/share/gc/z/zGeneration.inline.hpp:30, > from src/hotspot/share/gc/z/zBarrier.inline.hpp:30, > from src/hotspot/share/gc/z/zBarrierSet.inline.hpp:31, > from src/hotspot/share/gc/shared/barrierSetConfig.inline.hpp:44, > from src/hotspot/share/oops/access.inline.hpp:31, > from src/hotspot/share/memory/iterator.inline.hpp:32, > from src/hotspot/share/oops/oop.inline.hpp:31, > from src/hotspot/share/compiler/abstractDisassembler.cpp:32: > src/hotspot/share/gc/z/zArray.inline.hpp:99:21: error: template-id not allowed f... Hotspot does not support C++20 at this time. I don't know if these changes are harmless wrt. C++17 or may cause an issue. 
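On the C++17 question raised above, here is a minimal sketch of the kind of change GCC's note asks for (removing the template argument list from the constructor or destructor name). `DemoList`, `DemoArray` and `demo_live_after_scope` are hypothetical stand-ins, not the HotSpot classes named in the errors, and the exact edits made in the PR are not reproduced here. Inside a class template, the injected-class-name already refers to the current specialization, so the shorter spelling should be accepted under C++11/14/17 as well as C++20:

```c++
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for a HotSpot-style container; not the real LinkedList.
template <typename E>
class DemoList {
  E _head;
 public:
  // Inside the class body, the injected-class-name `DemoList` already means
  // `DemoList<E>`, so no template argument list is written on the
  // constructor, the destructor, or the deleted copy operations.
  explicit DemoList(E head) : _head(head) {}
  ~DemoList() {}

  // Spelled the way a NONCOPYABLE-style macro would expand when it is given
  // the plain class name rather than a template-id.
  DemoList(DemoList const&) = delete;
  DemoList& operator=(DemoList const&) = delete;

  E head() const { return _head; }
};

// Hypothetical class with an out-of-class destructor definition.
template <typename T>
class DemoArray {
  int* _live_count;
 public:
  explicit DemoArray(int* live_count) : _live_count(live_count) { ++*_live_count; }
  ~DemoArray();
};

// The qualifier keeps its template arguments; the destructor name does not.
template <typename T>
DemoArray<T>::~DemoArray() {
  --*_live_count;
}

// Copying is still disabled with the shorter spelling.
static_assert(!std::is_copy_constructible<DemoList<int>>::value, "copy construction stays disabled");
static_assert(!std::is_copy_assignable<DemoList<int>>::value, "copy assignment stays disabled");

// Helper (hypothetical) that checks the destructor runs for each instance.
inline int demo_live_after_scope() {
  int live = 0;
  {
    DemoArray<long> a(&live);
    DemoArray<long> b(&live);
    // Two instances are alive at this point.
  }
  return live;  // Both destructors have decremented the counter.
}
```

Building this sketch once with `-std=c++17` and once with `-std=c++20` is a quick way to check that the spelling itself is not tied to the newer standard; it says nothing about other effects of the actual patch.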
------------- PR Comment: https://git.openjdk.org/jdk/pull/19009#issuecomment-2084318221 From kbarrett at openjdk.org Tue Apr 30 04:03:04 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 30 Apr 2024 04:03:04 GMT Subject: RFR: 8331352: error: template-id not allowed for constructor/destructor in C++20 In-Reply-To: References: Message-ID: On Tue, 30 Apr 2024 02:01:01 GMT, Jan Kratochvil wrote: > When compiling trunk (819f3d6fc70ff6fe54ac5f9033c17c3dd4326aa5 2024-04-29) by gcc-14.0.1-0.15.fc40.x86_64 there are many errors: > > In file included from src/hotspot/share/memory/allocation.hpp:30, > from src/hotspot/share/ci/ciBaseObject.hpp:29, > from src/hotspot/share/ci/ciMetadata.hpp:28, > from src/hotspot/share/ci/ciType.hpp:28, > from src/hotspot/share/ci/ciKlass.hpp:28, > from src/hotspot/share/ci/ciArrayKlass.hpp:28, > from src/hotspot/share/ci/ciArray.hpp:28, > from src/hotspot/share/ci/compilerInterface.hpp:28, > from src/hotspot/share/compiler/abstractCompiler.hpp:28, > from src/hotspot/share/compiler/abstractCompiler.cpp:25: > src/hotspot/share/utilities/linkedlist.hpp:85:15: error: template-id not allowed for constructor in C++20 [-Werror=template-id-cdtor] > 85 | NONCOPYABLE(LinkedList); > | ^~~~~~~~~~~~~ > src/hotspot/share/utilities/globalDefinitions.hpp:87:26: note: in definition of macro ?NONCOPYABLE? > 87 | #define NONCOPYABLE(C) C(C const&) = delete; C& operator=(C const&) = delete /* next token must be ; */ > | ^ > src/hotspot/share/utilities/linkedlist.hpp:85:15: note: remove the ?< >? > 85 | NONCOPYABLE(LinkedList); > | ^~~~~~~~~~~~~ > src/hotspot/share/utilities/globalDefinitions.hpp:87:26: note: in definition of macro ?NONCOPYABLE? 
> 87 | #define NONCOPYABLE(C) C(C const&) = delete; C& operator=(C const&) = delete /* next token must be ; */ > | ^ > > In file included from src/hotspot/share/gc/z/zGranuleMap.inline.hpp:30, > from src/hotspot/share/gc/z/zForwardingTable.inline.hpp:32, > from src/hotspot/share/gc/z/zHeap.inline.hpp:30, > from src/hotspot/share/gc/z/zGeneration.inline.hpp:30, > from src/hotspot/share/gc/z/zBarrier.inline.hpp:30, > from src/hotspot/share/gc/z/zBarrierSet.inline.hpp:31, > from src/hotspot/share/gc/shared/barrierSetConfig.inline.hpp:44, > from src/hotspot/share/oops/access.inline.hpp:31, > from src/hotspot/share/memory/iterator.inline.hpp:32, > from src/hotspot/share/oops/oop.inline.hpp:31, > from src/hotspot/share/compiler/abstractDisassembler.cpp:32: > src/hotspot/share/gc/z/zArray.inline.hpp:99:21: error: template-id not allowed f... The C++20 change comes from DR 2237 https://cplusplus.github.io/CWG/issues/2237.html That resolution says: "Note that this resolution is a change for C++20, NOT a defect report against C++17 and earlier versions." So it seems like a gcc bug that this error is being issues when not using C++20. OTOH, the changes involved seem beneficial to me, similarly to those for JDK-8328997. Looks good. ------------- Marked as reviewed by kbarrett (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/19009#pullrequestreview-2030156549 From jkratochvil at openjdk.org Tue Apr 30 04:10:04 2024 From: jkratochvil at openjdk.org (Jan Kratochvil) Date: Tue, 30 Apr 2024 04:10:04 GMT Subject: RFR: 8331352: error: template-id not allowed for constructor/destructor in C++20 In-Reply-To: References: Message-ID: On Tue, 30 Apr 2024 02:01:01 GMT, Jan Kratochvil wrote: > When compiling trunk (819f3d6fc70ff6fe54ac5f9033c17c3dd4326aa5 2024-04-29) by gcc-14.0.1-0.15.fc40.x86_64 there are many errors: > > In file included from src/hotspot/share/memory/allocation.hpp:30, > from src/hotspot/share/ci/ciBaseObject.hpp:29, > from src/hotspot/share/ci/ciMetadata.hpp:28, > from src/hotspot/share/ci/ciType.hpp:28, > from src/hotspot/share/ci/ciKlass.hpp:28, > from src/hotspot/share/ci/ciArrayKlass.hpp:28, > from src/hotspot/share/ci/ciArray.hpp:28, > from src/hotspot/share/ci/compilerInterface.hpp:28, > from src/hotspot/share/compiler/abstractCompiler.hpp:28, > from src/hotspot/share/compiler/abstractCompiler.cpp:25: > src/hotspot/share/utilities/linkedlist.hpp:85:15: error: template-id not allowed for constructor in C++20 [-Werror=template-id-cdtor] > 85 | NONCOPYABLE(LinkedList); > | ^~~~~~~~~~~~~ > src/hotspot/share/utilities/globalDefinitions.hpp:87:26: note: in definition of macro ‘NONCOPYABLE’ > 87 | #define NONCOPYABLE(C) C(C const&) = delete; C& operator=(C const&) = delete /* next token must be ; */ > | ^ > src/hotspot/share/utilities/linkedlist.hpp:85:15: note: remove the ‘< >’ > 85 | NONCOPYABLE(LinkedList); > | ^~~~~~~~~~~~~ > src/hotspot/share/utilities/globalDefinitions.hpp:87:26: note: in definition of macro ‘NONCOPYABLE’
> 87 | #define NONCOPYABLE(C) C(C const&) = delete; C& operator=(C const&) = delete /* next token must be ; */ > | ^ > > In file included from src/hotspot/share/gc/z/zGranuleMap.inline.hpp:30, > from src/hotspot/share/gc/z/zForwardingTable.inline.hpp:32, > from src/hotspot/share/gc/z/zHeap.inline.hpp:30, > from src/hotspot/share/gc/z/zGeneration.inline.hpp:30, > from src/hotspot/share/gc/z/zBarrier.inline.hpp:30, > from src/hotspot/share/gc/z/zBarrierSet.inline.hpp:31, > from src/hotspot/share/gc/shared/barrierSetConfig.inline.hpp:44, > from src/hotspot/share/oops/access.inline.hpp:31, > from src/hotspot/share/memory/iterator.inline.hpp:32, > from src/hotspot/share/oops/oop.inline.hpp:31, > from src/hotspot/share/compiler/abstractDisassembler.cpp:32: > src/hotspot/share/gc/z/zArray.inline.hpp:99:21: error: template-id not allowed f... https://lists.fedoraproject.org/archives/list/devel at lists.fedoraproject.org/message/BWHDTXFYHQKR5BZH7QUGX7RQGVUK6DN4/ ------------- PR Comment: https://git.openjdk.org/jdk/pull/19009#issuecomment-2084337364 From duke at openjdk.org Tue Apr 30 05:10:34 2024 From: duke at openjdk.org (Liming Liu) Date: Tue, 30 Apr 2024 05:10:34 GMT Subject: RFR: 8324781: runtime/Thread/TestAlwaysPreTouchStacks.java failed with Expected a higher ratio between stack committed and reserved [v8] In-Reply-To: References: Message-ID: > The testcase failed on Oracle CI since JDK-8315923. The root cause is that Oracle CI runs Linux-5.4.17-UEK where the value of MADV_POPULATE_WRITE (23) is used as MADV_DONTEXEC which is not supported by upstream. This PR solves the testcase failure by checking versions of kernels first, and checking the availability of MADV_POPULATE_WRITE when they are not older than 5.14. Liming Liu has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains eight commits: - Merge branch 'master' into 8324781 - Mention the number in comments - Use kernel_version - Mis-removed the two lines - Parse kernel versions alone - Disable UseMadvPopulateWrite when not supported - Exit early in os::pd_pretouch_memory and generate a warning when user turns on UseMadvPopulteWrite when not supported - Check kernel versions before check the support of the advice ------------- Changes: https://git.openjdk.org/jdk/pull/18592/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18592&range=07 Stats: 64 lines in 2 files changed: 40 ins; 20 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/18592.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18592/head:pull/18592 PR: https://git.openjdk.org/jdk/pull/18592 From dholmes at openjdk.org Tue Apr 30 05:57:05 2024 From: dholmes at openjdk.org (David Holmes) Date: Tue, 30 Apr 2024 05:57:05 GMT Subject: RFR: 8331208: Memory stress test that checks OutOfMemoryError stack trace fails In-Reply-To: References: Message-ID: On Tue, 23 Apr 2024 21:11:53 GMT, Doug Simon wrote: > This pull request mitigates failures in memory stress tests that check the stack trace of an `OutOfMemoryError` for certain expected entries. > > The stack trace of an OOME will [not be allocated once all preallocated OOMEs are used up](https://github.com/openjdk/jdk/blob/3d5eeac3a38ece4a23ea6da2dfe5939d64e81cea/src/hotspot/share/memory/universe.cpp#L722). If the only heap allocations performed in stressful conditions are those of the stress test, then the [4 preallocated OOMEs](https://github.com/openjdk/jdk/blob/f1d0e715b67e2ca47b525069d8153abbb33f75b9/src/hotspot/share/runtime/globals.hpp#L800) would be sufficient. However, it's possible for VM internal allocations to also occur during stressful conditions, especially in `-Xcomp` mode. 
For example, [CompileBroker::compile_method](https://github.com/openjdk/jdk/blob/3d5eeac3a38ece4a23ea6da2dfe5939d64e81cea/src/hotspot/share/compiler/compileBroker.cpp#L1399) will try to resolve the string constants in the constant pool of the method about to be compiled. This can fail as shown here: > > V [jvm.dll+0x62c23a] Exceptions::_throw+0x11a (exceptions.cpp:168) > V [jvm.dll+0x62d85b] Exceptions::_throw_oop+0xab (exceptions.cpp:140) > V [jvm.dll+0xbbce78] MemAllocator::Allocation::check_out_of_memory+0x208 (memAllocator.cpp:138) > V [jvm.dll+0xbbcac8] MemAllocator::allocate+0x158 (memAllocator.cpp:377) > V [jvm.dll+0x79bd05] InstanceKlass::allocate_instance+0x95 (instanceKlass.cpp:1509) > V [jvm.dll+0x7ddeed] java_lang_String::basic_create+0x9d (javaClasses.cpp:273) > V [jvm.dll+0x7e43c0] java_lang_String::create_from_unicode+0x60 (javaClasses.cpp:291) > V [jvm.dll+0xdb91a5] StringTable::do_intern+0xb5 (stringTable.cpp:379) > V [jvm.dll+0xdba9f2] StringTable::intern+0x1b2 (stringTable.cpp:368) > V [jvm.dll+0xdbaaa6] StringTable::intern+0x86 (stringTable.cpp:328) > V [jvm.dll+0x51c8b1] ConstantPool::string_at_impl+0x1d1 (constantPool.cpp:1251) > V [jvm.dll+0x51b95b] ConstantPool::resolve_string_constants_impl+0xeb (constantPool.cpp:800) > V [jvm.dll+0x4f2f8d] CompileBroker::compile_method+0x31d (compileBroker.cpp:1395) > V [jvm.dll+0x4f3474] CompileBroker::compile_method+0xc4 (compileBroker.cpp:1348) > > These internal allocations can occur before the allocations of the test and thus use up the pre-allocated OOMEs. As a result, the OOMEs triggered by the stress test may end up throwing the [default, shared OOME instance](https://github.com/openjdk/jdk/blob/3d5eeac3a38ece4a23ea6da2dfe5939d64e81cea/src/hotspot/... So you are generalising (and seemingly simplifying) the notion of a "retryable allocation" so that internally an OOME can be ignored for a range of reasons. 
It seems a rather elaborate response to the test failure (especially when generating a stacktrace under OOM conditions could itself fail anyway), but I can see the general utility of expanding things this way. I really dislike the name `SandboxedOOMEMark` though - sorry - suggestions: `InternalOOMEMark`, `ScopedOOMEMark`, `ConfinedOOMEMark` ? My main concerns relate to me not understanding the details of the existing retryable allocation, so some of the new code seems a little odd. Comments below. Thanks src/hotspot/share/gc/shared/memAllocator.cpp line 127: > 125: const char* message = _overhead_limit_exceeded ? "GC overhead limit exceeded" : "Java heap space"; > 126: // -XX:+HeapDumpOnOutOfMemoryError and -XX:OnOutOfMemoryError support > 127: report_java_out_of_memory(message); Not obvious we now need this to be unconditional. src/hotspot/share/gc/shared/memAllocator.hpp line 139: > 137: _outer = false; > 138: _thread = nullptr; > 139: } It isn't obvious to me how this part is intended to be used. I see it ties back to the retryable allocation "activate" mode, but I'm unclear what that means as well. 
src/hotspot/share/oops/klass.cpp line 876: > 874: void Klass::check_array_allocation_length(int length, int max_length, TRAPS) { > 875: if (length > max_length) { > 876: report_java_out_of_memory("Requested array size exceeds VM limit"); Again not obvious this should now be unconditional ------------- PR Review: https://git.openjdk.org/jdk/pull/18925#pullrequestreview-2030233468 PR Review Comment: https://git.openjdk.org/jdk/pull/18925#discussion_r1584154877 PR Review Comment: https://git.openjdk.org/jdk/pull/18925#discussion_r1584160406 PR Review Comment: https://git.openjdk.org/jdk/pull/18925#discussion_r1584164575 From stuefe at openjdk.org Tue Apr 30 06:03:14 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 30 Apr 2024 06:03:14 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v13] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: <1BNwYTHgU-eHN44HHfYcnfw3XY_BS43XDnqcgDfNPQo=.afd63b8b-4927-4f89-85b4-35e9794acedd@github.com> On Fri, 19 Apr 2024 09:49:33 GMT, Afshin Zafari wrote: >> `MEMFLAGS flag` is used to hold/show the type of the memory regions in NMT. Each call of NMT API requires a search through the list of memory regions. >> The Hotspot code reserves/commits/uncommits memory regions and later calls explicitly NMT API with a specific memory type (e.g., `mtGC`, `mtJavaHeap`) for that region. Therefore, there are two search in the list of regions per reserve/commit/uncommit operations, one for the operation and another for setting the type of the region. >> When the memory type is passed in during reserve/commit/uncommit operations, NMT can use it and avoid the extra search for setting the memory type. >> >> Tests: tiers1-5 passed on linux-x64, macosx-aarch64 and windows-x64 for debug and non-debug builds. 
> > Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: > > removed extra blank line. Okay, had another look. Good work. Mostly nits now, but a number of mine and @stefank 's remarks were not addressed. You wrote "Fixed" though. Are you sure you committed your last changes? I still find it regrettable that we are getting rid of default arguments. This makes many of these APIs rather unwieldy. The patch is massive, which will be a heck of a pain for backporters because patches following this patch will be less likely to be clean backports. In a perfect world, one would not call the "raw" os::xxx functions so much and rather use something like ReservedSpace, which then can carry information about the mapping, e.g. the flag. But I never liked ReservedSpace; it feels like old C++ code and often shows surprising behavior. I would love it if someone were to improve that. For example, rewrite initialization to make it more conform to modern C++, and maybe to make it mostly immutable (e.g. Do we really need something like clear_members? Most of the places using clear_members looked to me like they should rely on automatic destructors instead). ReservedSpace is also copied by value, without having a clear semantic of ownership of the underlying mapping. src/hotspot/share/memory/virtualspace.cpp line 613: > 611: } > 612: } > 613: } stray ------------- Changes requested by stuefe (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18745#pullrequestreview-2030234166 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1584156869 From stuefe at openjdk.org Tue Apr 30 06:03:16 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 30 Apr 2024 06:03:16 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v7] In-Reply-To: <7TW9a7Vmnz0nIKq83rYx_VN13PXM9_9nD5iSMzGDfNw=.127fd0ff-ee60-40cf-9994-9a1e81bb5b27@github.com> References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> <7TW9a7Vmnz0nIKq83rYx_VN13PXM9_9nD5iSMzGDfNw=.127fd0ff-ee60-40cf-9994-9a1e81bb5b27@github.com> Message-ID: On Mon, 15 Apr 2024 16:11:13 GMT, Afshin Zafari wrote: >> `MEMFLAGS flag` is used to hold/show the type of the memory regions in NMT. Each call of NMT API requires a search through the list of memory regions. >> The Hotspot code reserves/commits/uncommits memory regions and later calls explicitly NMT API with a specific memory type (e.g., `mtGC`, `mtJavaHeap`) for that region. Therefore, there are two search in the list of regions per reserve/commit/uncommit operations, one for the operation and another for setting the type of the region. >> When the memory type is passed in during reserve/commit/uncommit operations, NMT can use it and avoid the extra search for setting the memory type. >> >> Tests: tiers1-5 passed on linux-x64, macosx-aarch64 and windows-x64 for debug and non-debug builds. > > Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: > > alignment in coding style changed. 
src/hotspot/share/cds/metaspaceShared.cpp line 1322: > 1320: os::vm_page_size(), mtClassShared, (char*)base_address); > 1321: class_space_rs = ReservedSpace(class_space_size, class_space_alignment, > 1322: os::vm_page_size(), mtClass, (char*)ccs_base); Note that here, we place two spaces atop a region that has been previously mapped with mtClass (see e.g. src/hotspot/cpu/aarch64/compressedKlass_aarch64.cpp). I assume this is not a problem? src/hotspot/share/memory/virtualspace.cpp line 57: > 55: } > 56: > 57: ReservedSpace::ReservedSpace(size_t size, size_t preferred_page_size, MEMFLAGS flag) : _fd_for_heap(-1), _nmt_flag(flag) { Small nit: Mixture of styles. As much as I dislike it, current style is to initialize things via dedicated initialize methods. I'd rather stay consistent. That said, I would be more than happy for someone to give these classes a once-over and convert them to the more usual style - using initializer lists. Then, we can also make members const that should be const, e.g. _nmt_flag. Not in this PR though. src/hotspot/share/memory/virtualspace.cpp line 623: > 621: } > 622: // _nmt_flag is used internally by initialize_compressed_heap > 623: _nmt_flag = mtJavaHeap; Nit, we use a mixture of directly accessing _nmt_flag and accessing it via getter. HotSpot seems to prefer getters/setters. Can we use setters here? src/hotspot/share/memory/virtualspace.hpp line 46: > 44: int _fd_for_heap; > 45: bool _executable; > 46: MEMFLAGS _nmt_flag; See my remark below. This member, and probably others (e.g. page size and size) could and should probably be const. Food for follow up PRs.
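As a rough illustration of the style these remarks point toward (initializer lists, const members, const accessors), here is a small sketch. `Reservation`, `MemTag` and `tag_of` are hypothetical stand-ins; this is not the HotSpot `ReservedSpace` class or the `MEMFLAGS` enum, and it ignores everything the real class does besides holding these values.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a memory-type tag; not the HotSpot MEMFLAGS enum.
enum class MemTag { None, Class, JavaHeap };

// Hypothetical, much simplified reservation descriptor.
class Reservation {
  // Members that never change after construction can be const when they are
  // set in the constructor's initializer list rather than in an initialize() method.
  const std::size_t _size;
  const std::size_t _page_size;
  const MemTag      _tag;

 public:
  Reservation(std::size_t size, std::size_t page_size, MemTag tag)
    : _size(size), _page_size(page_size), _tag(tag) {}

  // const accessors: callable through a const reference or pointer.
  std::size_t size() const      { return _size; }
  std::size_t page_size() const { return _page_size; }
  MemTag      tag() const       { return _tag; }
};

// Compiles only because tag() is a const member function.
inline MemTag tag_of(const Reservation& r) { return r.tag(); }
```

One trade-off of const members is that the class is no longer copy-assignable, which matters for a type that is currently passed around and reassigned by value.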
src/hotspot/share/memory/virtualspace.hpp line 71: > 69: public: > 70: > 71: MEMFLAGS nmt_flag() { return _nmt_flag; } const method ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1584163635 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1584153176 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1584158038 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1584154689 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1584150480 From stuefe at openjdk.org Tue Apr 30 06:03:17 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 30 Apr 2024 06:03:17 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v7] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> <7TW9a7Vmnz0nIKq83rYx_VN13PXM9_9nD5iSMzGDfNw=.127fd0ff-ee60-40cf-9994-9a1e81bb5b27@github.com> <4p0uq_t37Fkj9fxqD1QC8TOkgAyyW1PVmTknURCquG4=.22b762b8-dea4-4fe3-a19f-d6a3f26c9f27@github.com> Message-ID: On Thu, 18 Apr 2024 08:39:46 GMT, Afshin Zafari wrote: >> src/hotspot/share/classfile/compactHashtable.cpp line 243: >> >>> 241: quit("Unable to open hashtable dump file", filename); >>> 242: } >>> 243: _base = os::map_memory(_fd, filename, 0, nullptr, _size, mtInternal, true, false); >> >> Isn't this CDS code? Should this be mtClassShared or something else that indicates that this is CDS code? > > Fixed. I don't see the fix.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1584163987 From stuefe at openjdk.org Tue Apr 30 06:03:18 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 30 Apr 2024 06:03:18 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v13] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: <3_SnilNMBOpzwdkyaOW4w4QyMfqIjAlR99N0dTBsksc=.d2c0f5dd-4e16-4fd6-9c04-eb8e6ae395ba@github.com> On Tue, 23 Apr 2024 08:54:24 GMT, Afshin Zafari wrote: >> src/hotspot/share/memory/metaspace/virtualSpaceNode.cpp line 112: >> >>> 110: >>> 111: // Commit... >>> 112: if (os::commit_memory((char*)p, word_size * BytesPerWord, !ExecMem, mtMetaspace) == false) { >> >> Not sure if I suggested something different in my first review, but thinking this over, this is wrong. Please don't hardwire mtMetaspace; take the flag from the ReservedSpace member of VirtualSpaceNode. >> >> The reason is that metaspace can be used for at least two different flags, and may later be expanded for more. > > Done. Where? I still see mtMetaspace. >> src/hotspot/share/memory/metaspace/virtualSpaceNode.cpp line 191: >> >>> 189: >>> 190: // Uncommit... >>> 191: if (os::uncommit_memory((char*)p, word_size * BytesPerWord, !ExecMem, mtMetaspace) == false) { >> >> Same here. > > Done. I still see mtMetaspace. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1584158963 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1584159189 From stuefe at openjdk.org Tue Apr 30 06:03:19 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 30 Apr 2024 06:03:19 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v12] In-Reply-To: <5yX9I1JoQY9gmxbIvTDPxuxQSu37KHG0LzlL7cq-3iQ=.38c06bf3-699b-466c-b934-aefedb37b17b@github.com> References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> <5yX9I1JoQY9gmxbIvTDPxuxQSu37KHG0LzlL7cq-3iQ=.38c06bf3-699b-466c-b934-aefedb37b17b@github.com> Message-ID: On Fri, 19 Apr 2024 09:46:39 GMT, Afshin Zafari wrote: >> src/hotspot/share/memory/virtualspace.cpp line 709: >> >>> 707: assert(max_commit_granularity > 0, "Granularity must be non-zero."); >>> 708: >>> 709: >> >> This blankline should be reverted. > > Done. Still a stray line. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1584158376 From dholmes at openjdk.org Tue Apr 30 06:20:04 2024 From: dholmes at openjdk.org (David Holmes) Date: Tue, 30 Apr 2024 06:20:04 GMT Subject: RFR: 8331285: Deprecate and obsolete OldSize In-Reply-To: References: Message-ID: On Mon, 29 Apr 2024 10:06:38 GMT, Albert Mingkun Yang wrote: > Simple deprecating a jvm flag. Changes requested by dholmes (Reviewer). src/hotspot/share/runtime/arguments.cpp line 507: > 505: { "UseNotificationThread", JDK_Version::jdk(23), JDK_Version::jdk(24), JDK_Version::jdk(25) }, > 506: { "UseEmptySlotsInSupers", JDK_Version::jdk(23), JDK_Version::jdk(24), JDK_Version::jdk(25) }, > 507: { "OldSize", JDK_Version::jdk(23), JDK_Version::jdk(24), JDK_Version::jdk(25) }, This should be at line 504 before PreserveAllAnnotations to keep ordering within release. 
------------- PR Review: https://git.openjdk.org/jdk/pull/18994#pullrequestreview-2030291377 PR Review Comment: https://git.openjdk.org/jdk/pull/18994#discussion_r1584186831 From stuefe at openjdk.org Tue Apr 30 06:23:12 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 30 Apr 2024 06:23:12 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v54] In-Reply-To: References: Message-ID: On Mon, 29 Apr 2024 12:32:39 GMT, Johan Sjölen wrote: >> Hi, >> >> This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix. >> >> ## `MemoryFileTracker` >> >> The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: >> >> ```c++ >> static MemoryFile* make_device(const char* descriptive_name); >> static void free_device(MemoryFile* device); >> >> static void allocate_memory(MemoryFile* device, size_t offset, size_t size, >> MEMFLAGS flag, const NativeCallStack& stack); >> static void free_memory(MemoryFile* device, size_t offset, size_t size); >> ``` >> >> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: >> >> ```c++ >> void ZNMT::reserve(zaddress_unsafe start, size_t size) { >> MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); >> } >> void ZNMT::commit(zoffset offset, size_t size) { >> MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC); >> } >> void ZNMT::uncommit(zoffset offset, size_t size) { >> MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); >> } >> >> void ZNMT::map(zaddress_unsafe addr, size_t
size, zoffset offset) { >> // NMT doesn't track mappings at the moment. >> } >> void ZNMT::unmap(zaddress_unsafe addr, size_t size) { >> // NMT doesn't track mappings at the moment. >> } >> ``` >> >> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. >> >> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: >> >> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. Our Treap-based approach in this patch gives a performance bo... > > Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: > > assert device != nullptr in MemoryFileTracker::instance I will take a closer look later today, and also clean out some of my obsolete comments. GH interface is really not made for complex patches. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18289#issuecomment-2084464424 From stuefe at openjdk.org Tue Apr 30 06:23:13 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 30 Apr 2024 06:23:13 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v46] In-Reply-To: <2UMNj1LkcFJOj5bIOi8wJuscaXrGIHzPvlVTIpI-bw4=.38340e91-571b-4cff-8ffa-e32d602395a8@github.com> References: <2UMNj1LkcFJOj5bIOi8wJuscaXrGIHzPvlVTIpI-bw4=.38340e91-571b-4cff-8ffa-e32d602395a8@github.com> Message-ID: On Mon, 29 Apr 2024 08:51:26 GMT, Johan Sjölen wrote: >> How about making your own index type? Something that clearly distinguishes it from sizes. Can be a simple typedef. >> >> I think address would be wrong.
But size_t is also feeling off. I know we use size_t in other places as index or offset, but it still throws me off, I think of size_t as a size, not an offset. > > I'm fine with `typedef`:ing `size_t`, but I'd like a naming suggestion from you if that's alright. Naming isn't my strong suit and I'd prefer only doing the rename once :). If the type is defined within VMATree scope, it can be anything short and succinct, e.g. `VMATree::position_t`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1584188364 From jinguojie.jgj at alibaba-inc.com Tue Apr 30 06:26:54 2024 From: jinguojie.jgj at alibaba-inc.com (Jin Guojie) Date: Tue, 30 Apr 2024 14:26:54 +0800 Subject: Reply: Aarch64: optimation for doing remainder on AArch64 In-Reply-To: <4d4d046f-cee5-4427-bcc4-3318dd687599.jinguojie.jgj@alibaba-inc.com> References: <4d4d046f-cee5-4427-bcc4-3318dd687599.jinguojie.jgj@alibaba-inc.com> Message-ID: Hi Andrew, Over the past few days we used the microbenchmarks that come with the JDK to run a more comprehensive performance test of this remainder optimization. The following tests passed and show a definite performance improvement. make test TEST="micro:java.lang.IntegerDivMod" make test TEST="micro:java.lang.LongDivMod" * IntegerDivMod.testDivideRemainderUnsigned: baseline (ns/op) 2223, with this patch (ns/op) 1885, improvement 17.93% * IntegerDivMod.testRemainderUnsigned: baseline (ns/op) 2225, with this patch (ns/op) 1885, improvement 18.03% * LongDivMod.testDivideRemainderUnsigned: baseline (ns/op) 2231, with this patch (ns/op) 1894, improvement 17.79% * LongDivMod.testRemainderUnsigned: baseline (ns/op) 2232, with this patch (ns/op) 1891, improvement 18.03% There is also good news. My OCA application has been approved. Could you please create an issue in the JDK Bug System (JBS) before I submit a PR? Thanks. Jin Guojie (Alibaba, hotspot developer).
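The reported percentages appear to be computed relative to the patched time rather than the baseline; that is an inference from the figures, not something the message states. A small sketch of both conventions, using the hypothetical helper names `speedup_percent` and `time_reduction_percent`:

```cpp
#include <cassert>
#include <cmath>

// Gain measured against the patched time: (baseline - patched) / patched.
// This convention reproduces the 17.93% quoted for 2223 -> 1885 ns/op.
inline double speedup_percent(double baseline_ns, double patched_ns) {
  return (baseline_ns - patched_ns) / patched_ns * 100.0;
}

// Time saved measured against the baseline: (baseline - patched) / baseline.
// For the same pair of numbers this gives roughly 15.2%.
inline double time_reduction_percent(double baseline_ns, double patched_ns) {
  return (baseline_ns - patched_ns) / baseline_ns * 100.0;
}
```

Both numbers describe the same measurement; the second is the usual reading of "x% less time per operation".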
On 2024/4/ 23:42 Andrew Haley wrote: >> If you can get a GitHub account and an OpenJDK account we can start to do that. >> The first thing for you to do is clone the OpenJDK repo into your own tree, >> then create a local branch, then create a PR. >> See the section https://openjdk.org/guide/#i-have-a-patch-what-do-i-do > According to this guide, a sponsor needs to first create an issue on JBS before submitting a PR. > Could you please create an issue in the JDK Bug System (JBS)? > I have submitted an OpenJDK account application, but Oracle has not approved it yet. > I will submit this PR after my OCA is signed and the issue in JBS is created. From rehn at openjdk.org Tue Apr 30 06:59:23 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Tue, 30 Apr 2024 06:59:23 GMT Subject: RFR: 8331360: RISCV: u32 _partial_subtype_ctr loaded/stored as 64 Message-ID: Hi, please consider. We should use incrementw() for these. Sanity tested, running t1. Thanks, Robbin ------------- Commit messages: - use incw Changes: https://git.openjdk.org/jdk/pull/19010/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=19010&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8331360 Stats: 9 lines in 2 files changed: 0 ins; 7 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/19010.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/19010/head:pull/19010 PR: https://git.openjdk.org/jdk/pull/19010 From dnsimon at openjdk.org Tue Apr 30 07:07:04 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Tue, 30 Apr 2024 07:07:04 GMT Subject: RFR: 8331208: Memory stress test that checks OutOfMemoryError stack trace fails In-Reply-To: References: Message-ID: On Tue, 30 Apr 2024 05:54:53 GMT, David Holmes wrote: > So you are generalising (and seemingly simplifying) the notion of a "retryable allocation" so that internally an OOME can be ignored for a range of reasons.
It seems a rather elaborate response to the test failure (especially when generating a stacktrace under OOM conditions could itself fail anyway), but I can see the general utility of expanding things this way. I really dislike the name `SandboxedOOMEMark` though - sorry - suggestions: `InternalOOMEMark`, `ScopedOOMEMark`, `ConfinedOOMEMark` ? Aren't sandboxed, scoped and confined kind of all the same concept? I don't mind using a different name but want to better understand the specific objection to "sandboxed" first. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18925#issuecomment-2084528115 From tschatzl at openjdk.org Tue Apr 30 07:14:10 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 30 Apr 2024 07:14:10 GMT Subject: RFR: 8330694: Rename 'HeapRegion' to 'G1HeapRegion' [v4] In-Reply-To: References: <3IdWn9VGEERd8v9RcH2E_LzjVo0L8nMfi5jGWmhgVuM=.6b5b3be4-bfbd-4376-9580-48d78d75665c@github.com> Message-ID: On Mon, 29 Apr 2024 22:54:18 GMT, Kim Barrett wrote: > > mach5 higher tier SA tests are fine. What are your plans for the remaining SA renames (would highly recommend to add) and the G1HeapRegion related helper classes? > > I suggest the related helper classes be done in further followups, not make this change even larger. Fine with me, will file an issue about the helper classes. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18871#issuecomment-2084539234 From fyang at openjdk.org Tue Apr 30 07:33:06 2024 From: fyang at openjdk.org (Fei Yang) Date: Tue, 30 Apr 2024 07:33:06 GMT Subject: RFR: 8331360: RISCV: u32 _partial_subtype_ctr loaded/stored as 64 In-Reply-To: References: Message-ID: On Tue, 30 Apr 2024 06:54:56 GMT, Robbin Ehn wrote: > Hi, please consider. > > We should use incrementw() for these. > > Sanity tested, running t1. > > Thanks, Robbin Good catch. Looks good! 
src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 3347: > 3345: > 3346: #ifndef PRODUCT > 3347: incrementw(ExternalAddress((address)&SharedRuntime::_partial_subtype_ctr)); I just checked the x86 and aarch64 counterpart. Seems that aarch64 bears the same issue [1] as it uses `ldr` & `str` which load and store 64-bit data items like here. [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp#L1565 ------------- Marked as reviewed by fyang (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/19010#pullrequestreview-2030430740 PR Review Comment: https://git.openjdk.org/jdk/pull/19010#discussion_r1584269517 From dnsimon at openjdk.org Tue Apr 30 07:34:08 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Tue, 30 Apr 2024 07:34:08 GMT Subject: RFR: 8331208: Memory stress test that checks OutOfMemoryError stack trace fails In-Reply-To: References: Message-ID: On Tue, 30 Apr 2024 05:33:56 GMT, David Holmes wrote: >> This pull request mitigates failures in memory stress tests that check the stack trace of an `OutOfMemoryError` for certain expected entries. >> >> The stack trace of an OOME will [not be allocated once all preallocated OOMEs are used up](https://github.com/openjdk/jdk/blob/3d5eeac3a38ece4a23ea6da2dfe5939d64e81cea/src/hotspot/share/memory/universe.cpp#L722). If the only heap allocations performed in stressful conditions are those of the stress test, then the [4 preallocated OOMEs](https://github.com/openjdk/jdk/blob/f1d0e715b67e2ca47b525069d8153abbb33f75b9/src/hotspot/share/runtime/globals.hpp#L800) would be sufficient. However, it's possible for VM internal allocations to also occur during stressful conditions, especially in `-Xcomp` mode. For example, [CompileBroker::compile_method](https://github.com/openjdk/jdk/blob/3d5eeac3a38ece4a23ea6da2dfe5939d64e81cea/src/hotspot/share/compiler/compileBroker.cpp#L1399) will try to resolve the string constants in the constant pool of the method about to be compiled. 
This can fail as shown here: >> >> V [jvm.dll+0x62c23a] Exceptions::_throw+0x11a (exceptions.cpp:168) >> V [jvm.dll+0x62d85b] Exceptions::_throw_oop+0xab (exceptions.cpp:140) >> V [jvm.dll+0xbbce78] MemAllocator::Allocation::check_out_of_memory+0x208 (memAllocator.cpp:138) >> V [jvm.dll+0xbbcac8] MemAllocator::allocate+0x158 (memAllocator.cpp:377) >> V [jvm.dll+0x79bd05] InstanceKlass::allocate_instance+0x95 (instanceKlass.cpp:1509) >> V [jvm.dll+0x7ddeed] java_lang_String::basic_create+0x9d (javaClasses.cpp:273) >> V [jvm.dll+0x7e43c0] java_lang_String::create_from_unicode+0x60 (javaClasses.cpp:291) >> V [jvm.dll+0xdb91a5] StringTable::do_intern+0xb5 (stringTable.cpp:379) >> V [jvm.dll+0xdba9f2] StringTable::intern+0x1b2 (stringTable.cpp:368) >> V [jvm.dll+0xdbaaa6] StringTable::intern+0x86 (stringTable.cpp:328) >> V [jvm.dll+0x51c8b1] ConstantPool::string_at_impl+0x1d1 (constantPool.cpp:1251) >> V [jvm.dll+0x51b95b] ConstantPool::resolve_string_constants_impl+0xeb (constantPool.cpp:800) >> V [jvm.dll+0x4f2f8d] CompileBroker::compile_method+0x31d (compileBroker.cpp:1395) >> V [jvm.dll+0x4f3474] CompileBroker::compile_method+0xc4 (compileBroker.cpp:1348) >> >> These internal allocations can occur before the allocations of the test and thus use up the pre-allocated OOMEs. As a result, the OOMEs triggered by the stress test may end up throwing the [default, shared OOME instance](https://github.com/openjdk/jdk/blob/3d5eeac3a38ec... > > src/hotspot/share/gc/shared/memAllocator.cpp line 127: > >> 125: const char* message = _overhead_limit_exceeded ? "GC overhead limit exceeded" : "Java heap space"; >> 126: // -XX:+HeapDumpOnOutOfMemoryError and -XX:OnOutOfMemoryError support >> 127: report_java_out_of_memory(message); > > Not obvious we now need this to be unconditional. I think it was a mistake to make it conditional when RetryableAllocationMark was first introduced. 
The purpose of RetryableAllocationMark was only to resolve a correctness issue with respect to JVMTI (it was seeing the "same" exception being reported twice). The -XX actions do not change the semantics of the exception throwing, so they can be done unconditionally. > src/hotspot/share/gc/shared/memAllocator.hpp line 139: > >> 137: _outer = false; >> 138: _thread = nullptr; >> 139: } > > It isn't obvious to me how this part is intended to be used. I see it ties back to the retryable allocation "activate" mode, but I'm unclear what that means as well. By "this part", do you mean the `else` branch? It exists for the `!activate` case of RetryableAllocationMark which is used when the `null_on_fail` parameter of `JVMCIRuntime::new_instance_common` is true. That is, the runtime call is from compiled code that does *not* want to trigger throwing of an OOME. Graal will deopt in such cases and let the interpreter throw the exception. This ensures the OOME is reported exactly once to JVMTI. > src/hotspot/share/oops/klass.cpp line 876: > >> 874: void Klass::check_array_allocation_length(int length, int max_length, TRAPS) { >> 875: if (length > max_length) { >> 876: report_java_out_of_memory("Requested array size exceeds VM limit"); > > Again not obvious this should now be unconditional Same reasoning as for `MemAllocator::Allocation::check_out_of_memory`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18925#discussion_r1584241473 PR Review Comment: https://git.openjdk.org/jdk/pull/18925#discussion_r1584261677 PR Review Comment: https://git.openjdk.org/jdk/pull/18925#discussion_r1584262845 From ayang at openjdk.org Tue Apr 30 07:34:22 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 30 Apr 2024 07:34:22 GMT Subject: RFR: 8331285: Deprecate and obsolete OldSize [v2] In-Reply-To: References: Message-ID: > Simply deprecating a JVM flag. Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase.
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: - merge - review - old-size ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18994/files - new: https://git.openjdk.org/jdk/pull/18994/files/b659f2a5..59741b39 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18994&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18994&range=00-01 Stats: 1278 lines in 42 files changed: 581 ins; 342 del; 355 mod Patch: https://git.openjdk.org/jdk/pull/18994.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18994/head:pull/18994 PR: https://git.openjdk.org/jdk/pull/18994 From rehn at openjdk.org Tue Apr 30 08:21:04 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Tue, 30 Apr 2024 08:21:04 GMT Subject: RFR: 8331360: RISCV: u32 _partial_subtype_ctr loaded/stored as 64 In-Reply-To: References: Message-ID: On Tue, 30 Apr 2024 07:30:21 GMT, Fei Yang wrote: > Good catch. Looks good! Thanks! > src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 3347: > >> 3345: >> 3346: #ifndef PRODUCT >> 3347: incrementw(ExternalAddress((address)&SharedRuntime::_partial_subtype_ctr)); > > I just checked the x86 and aarch64 counterpart. Seems that aarch64 bears the same issue [1] as it uses `ldr` / `str` which load / store 64-bit data items like here. > > [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp#L1565 I'll open a separate PR, and maybe fix it. 
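A minimal sketch of why the access width matters for a 32-bit counter, assuming a hypothetical layout in which another 32-bit value sits directly next to the counter. `Counters`, `increment_32` and `increment_64_at_ctr` are illustrative names only; they are not HotSpot code, and the wide access is emulated with `memcpy` rather than real `ld`/`sd` or `ldr`/`str` instructions.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical layout: a u32 counter followed by some unrelated u32 value.
struct Counters {
  std::uint32_t ctr;
  std::uint32_t neighbor;
};

// 32-bit read-modify-write: only `ctr` is touched, and it wraps on overflow.
inline void increment_32(Counters& c) {
  c.ctr += 1;
}

// Emulates a 64-bit load/add/store issued at the address of `ctr`.
// memcpy is used so that the emulation itself has no aliasing problems.
inline void increment_64_at_ctr(Counters& c) {
  static_assert(sizeof(Counters) == 8, "expects two packed 32-bit fields");
  std::uint64_t wide;
  std::memcpy(&wide, &c, sizeof(wide));
  wide += 1;  // the carry or the increment itself lands in `neighbor`
  std::memcpy(&c, &wide, sizeof(wide));
}
```

With `ctr` at its maximum value, the 32-bit version wraps `ctr` to zero and leaves `neighbor` alone, while the emulated 64-bit version changes `neighbor` on both little-endian and big-endian machines. Real code additionally risks a misaligned or out-of-bounds access, which this sketch does not model.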
------------- PR Comment: https://git.openjdk.org/jdk/pull/19010#issuecomment-2084681630 PR Review Comment: https://git.openjdk.org/jdk/pull/19010#discussion_r1584334544 From rehn at openjdk.org Tue Apr 30 08:54:05 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Tue, 30 Apr 2024 08:54:05 GMT Subject: RFR: 8331360: RISCV: u32 _partial_subtype_ctr loaded/stored as 64 In-Reply-To: References: Message-ID: On Tue, 30 Apr 2024 08:18:50 GMT, Robbin Ehn wrote: >> src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 3347: >> >>> 3345: >>> 3346: #ifndef PRODUCT >>> 3347: incrementw(ExternalAddress((address)&SharedRuntime::_partial_subtype_ctr)); >> >> I just checked the x86 and aarch64 counterpart. Seems that aarch64 bears the same issue [1] as it uses `ldr` / `str` which load / store 64-bit data items like here. >> >> [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp#L1565 > > I'll open a separate PR, and maybe fix it. FYI: https://github.com/openjdk/jdk/pull/19011 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/19010#discussion_r1584386296 From rehn at openjdk.org Tue Apr 30 08:55:44 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Tue, 30 Apr 2024 08:55:44 GMT Subject: RFR: 8331393: AArch64: u32 _partial_subtype_ctr loaded/stored as 64 Message-ID: <3xNeycdTwfJhuy6uEm2uCcXl5NN9Nc3RElC0gVfPYQQ=.a5a2bb51-dad5-41c7-aa7d-ace5628832dd@github.com> Hi, please consider. Let's use incw for these. 
Untested, hoping GHA checks this :) Thanks, Robbin ------------- Commit messages: - ldrw/strw Changes: https://git.openjdk.org/jdk/pull/19011/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=19011&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8331393 Stats: 9 lines in 2 files changed: 0 ins; 7 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/19011.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/19011/head:pull/19011 PR: https://git.openjdk.org/jdk/pull/19011 From azafari at openjdk.org Tue Apr 30 08:58:18 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Tue, 30 Apr 2024 08:58:18 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v7] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> <7TW9a7Vmnz0nIKq83rYx_VN13PXM9_9nD5iSMzGDfNw=.127fd0ff-ee60-40cf-9994-9a1e81bb5b27@github.com> Message-ID: <6wS8eeO2KoYhRkkDxB4YhWStEfLrU2FRtT8CMwYkI74=.bf05a80b-f10b-417e-ba2d-76a86f0a3122@github.com> On Tue, 30 Apr 2024 05:31:05 GMT, Thomas Stuefe wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> alignment in coding style changed. > > src/hotspot/share/memory/virtualspace.cpp line 57: > >> 55: } >> 56: >> 57: ReservedSpace::ReservedSpace(size_t size, size_t preferred_page_size, MEMFLAGS flag) : _fd_for_heap(-1), _nmt_flag(flag) { > > Small nit: Mixture of styles. As much as I dislike it, current style is to initialize things via dedicated initialize methods. I'd rather stay consistent. > > That said, I would be more than happy for someone to give these classes a once-over and convert them to the more usual style - using initializer lists. Then, we also can make members const that should be const, e.g. _nmt_flags. Not in this PR though. Do you mean to move the initializations on line 57 (and others in the files) to the `initialize` method? 
> src/hotspot/share/memory/virtualspace.hpp line 46: > >> 44: int _fd_for_heap; >> 45: bool _executable; >> 46: MEMFLAGS _nmt_flag; > > See my remark below. This member, and probably others (e.g. page size and size) could and should probably be const. Food for follow up PRs. The getter method made as `const`. > src/hotspot/share/memory/virtualspace.hpp line 71: > >> 69: public: >> 70: >> 71: MEMFLAGS nmt_flag() { return _nmt_flag; } > > const method Done. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1584392835 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1584393529 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1584389450 From azafari at openjdk.org Tue Apr 30 08:58:20 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Tue, 30 Apr 2024 08:58:20 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v13] In-Reply-To: <1BNwYTHgU-eHN44HHfYcnfw3XY_BS43XDnqcgDfNPQo=.afd63b8b-4927-4f89-85b4-35e9794acedd@github.com> References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> <1BNwYTHgU-eHN44HHfYcnfw3XY_BS43XDnqcgDfNPQo=.afd63b8b-4927-4f89-85b4-35e9794acedd@github.com> Message-ID: On Tue, 30 Apr 2024 05:37:12 GMT, Thomas Stuefe wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> removed extra blank line. > > src/hotspot/share/memory/virtualspace.cpp line 613: > >> 611: } >> 612: } >> 613: } > > stray There should be no stray now. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1584394133 From jsjolen at openjdk.org Tue Apr 30 09:03:19 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Tue, 30 Apr 2024 09:03:19 GMT Subject: RFR: 8331193: Return references when possible in GrowableArray [v4] In-Reply-To: References: Message-ID: <9SKtdGSfTCKlM5wkHHt6A3MB61y2wNEe_urH7-oKTcI=.11d631a1-d645-4c10-ba51-5c68b3e227b4@github.com> > Hi, > > This PR introduces the possibility of using references more often when using GrowableArray, where as previously this was only possible when using the `at()` method. This lets us avoid copying and redundant method calls and makes the API more streamlined. After the patch, we can use `at_grow` just like `at` works. The same goes for `top`, `first`, and `last`. > > > Some example code: > ```c++ > // Before this patch this worked: > GrowableArray arr(8,8,-1); // Pre-fill with 8 -1s > int& x = arr.at(7); > if (x == -1) { > x = 2; > } > assert(arr.at(7) == 2, "this holds"); > // but this was forbidden > int& x = arr.at_grow(9, -1); // Compilation error! at_grow returns E, not E& > // so we had to do > int x = arr.at_grow(9, -1); > if (x == -1) { > arr.at_put(9, 2); > } > > > Thanks. 
Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: Equivalent const variant for adr_at ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18975/files - new: https://git.openjdk.org/jdk/pull/18975/files/b9431198..8d9607ad Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18975&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18975&range=02-03 Stats: 6 lines in 1 file changed: 5 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18975.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18975/head:pull/18975 PR: https://git.openjdk.org/jdk/pull/18975 From jsjolen at openjdk.org Tue Apr 30 09:03:20 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Tue, 30 Apr 2024 09:03:20 GMT Subject: RFR: 8331193: Return references when possible in GrowableArray [v3] In-Reply-To: References: Message-ID: On Mon, 29 Apr 2024 22:58:02 GMT, Kim Barrett wrote: >> Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: >> >> Small mistakes - FIXED! > > src/hotspot/share/utilities/growableArray.hpp line 153: > >> 151: E* adr_at(int i) const { >> 152: assert(0 <= i && i < _len, "illegal index %d for length %d", i, _len); >> 153: return &_data[i]; > > (GitHub won't let me put comment on the `adr_at` signature.) > > I think there should similarly be const and non-const adr_at, returning pointer to const and non-const respectively. Done!
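The two ideas discussed in this thread (an `at_grow` that returns a reference, and paired const and non-const `adr_at` overloads) can be sketched as follows. `TinyArray` is a hypothetical stand-in built on `std::vector`; it is not HotSpot's `GrowableArray` and only mirrors the method names used in the review.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a growable array; not HotSpot's GrowableArray.
template <typename E>
class TinyArray {
  std::vector<E> _data;

  bool in_range(int i) const {
    return 0 <= i && static_cast<std::size_t>(i) < _data.size();
  }

public:
  int length() const { return static_cast<int>(_data.size()); }

  // Grows the array with `fill` if needed and returns a reference to slot i,
  // so callers can read and update the element without a second lookup.
  E& at_grow(int i, const E& fill = E()) {
    assert(i >= 0);
    if (!in_range(i)) {
      _data.resize(static_cast<std::size_t>(i) + 1, fill);
    }
    return _data[static_cast<std::size_t>(i)];
  }

  // Const and non-const accessors come in pairs, so a const array never
  // hands out a pointer or reference that allows mutation.
  E& at(int i)             { assert(in_range(i)); return _data[static_cast<std::size_t>(i)]; }
  const E& at(int i) const { assert(in_range(i)); return _data[static_cast<std::size_t>(i)]; }

  E* adr_at(int i)             { assert(in_range(i)); return &_data[static_cast<std::size_t>(i)]; }
  const E* adr_at(int i) const { assert(in_range(i)); return &_data[static_cast<std::size_t>(i)]; }
};
```

One caveat that applies to any reallocating container: a reference or pointer obtained from `at_grow` or `adr_at` should not be kept across a later call that may grow the array.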
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18975#discussion_r1584401093 From azafari at openjdk.org Tue Apr 30 09:04:17 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Tue, 30 Apr 2024 09:04:17 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v13] In-Reply-To: <3_SnilNMBOpzwdkyaOW4w4QyMfqIjAlR99N0dTBsksc=.d2c0f5dd-4e16-4fd6-9c04-eb8e6ae395ba@github.com> References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> <3_SnilNMBOpzwdkyaOW4w4QyMfqIjAlR99N0dTBsksc=.d2c0f5dd-4e16-4fd6-9c04-eb8e6ae395ba@github.com> Message-ID: On Tue, 30 Apr 2024 05:40:15 GMT, Thomas Stuefe wrote: >> Done. > > Where? I still see mtMetaspace. It should be seen now. >> Done. > > I still see mtMetaspace. It should be seen now. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1584403814 PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1584402987 From azafari at openjdk.org Tue Apr 30 09:04:19 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Tue, 30 Apr 2024 09:04:19 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v7] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> <7TW9a7Vmnz0nIKq83rYx_VN13PXM9_9nD5iSMzGDfNw=.127fd0ff-ee60-40cf-9994-9a1e81bb5b27@github.com> Message-ID: On Tue, 30 Apr 2024 05:38:53 GMT, Thomas Stuefe wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> alignment in coding style changed. > > src/hotspot/share/memory/virtualspace.cpp line 623: > >> 621: } >> 622: // _nmt_flag is used internally by initialize_compressed_heap >> 623: _nmt_flag = mtJavaHeap; > > Nit, we use a mixture of directly accessing _nmt_flag and accessing it via getter. 
Hotspot seems to prefer getters/setters. Can we use setters here? The flag is not set/changed in other classes, so there is no need to have a `public set_nmt_flag()` member for it. All the changes to the flag can be done internally using the member directly. P.S.: There was already a setter but removed after a review comment. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1584399496 From shade at openjdk.org Tue Apr 30 09:16:19 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 30 Apr 2024 09:16:19 GMT Subject: RFR: 8328934: Assert that ABS input and output are legal [v6] In-Reply-To: <9r6p7oHNH_9hg_jzOxFh2lKDDS8a1hwTRIIDjfCWOeU=.a30a43cb-426b-4d66-93d6-bb55ab2a8445@github.com> References: <9r6p7oHNH_9hg_jzOxFh2lKDDS8a1hwTRIIDjfCWOeU=.a30a43cb-426b-4d66-93d6-bb55ab2a8445@github.com> Message-ID: On Mon, 29 Apr 2024 08:18:51 GMT, Aleksey Shipilev wrote: >> This should protect us from future accidents around `abs` misuse. We have fixed a few separately. I plan to use this as the litmus test in update releases to detect missing backports for actual fixes. I am running more tests to see if we have any other sightings in current codebase, but this can be reviewed for sanity meanwhile. >> >> Additional testing: >> - [x] MacOS AArch64 server fastdebug build passes >> - [x] Linux x86_64 server fastdebug, `all` >> - [x] Linux x86_64 server fastdebug, 100K Fuzzer tests >> - [x] Linux x86_64 server fastdebug, Maven CTW > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision: > > - Merge branch 'master' into JDK-8328934-abs-legal > - Also tests > - Drop the other check; dodge UB > - More straightforward > - Richer error reporting > - Only assert integral type arguments > - Need explicit include as well > - Fix Thanks, here goes. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18751#issuecomment-2084788504 From stuefe at openjdk.org Tue Apr 30 09:16:18 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 30 Apr 2024 09:16:18 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v54] In-Reply-To: References: Message-ID: On Mon, 29 Apr 2024 12:32:39 GMT, Johan Sj?len wrote: >> Hi, >> >> This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix. >> >> ## `MemoryFileTracker` >> >> The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: >> >> ```c++ >> static MemoryFile* make_device(const char* descriptive_name); >> static void free_device(MemoryFile* device); >> >> static void allocate_memory(MemoryFile* device, size_t offset, size_t size, >> MEMFLAGS flag, const NativeCallStack& stack); >> static void free_memory(MemoryFile* device, size_t offset, size_t size); >> >> >> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: >> >> ```c++ >> void ZNMT::reserve(zaddress_unsafe start, size_t size) { >> MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); >> } >> void ZNMT::commit(zoffset offset, size_t size) { >> MemTracker::allocate_memory_in(ZNMT::_device, static_cast(offset), size, mtJavaHeap, CALLER_PC); >> } >> void ZNMT::uncommit(zoffset offset, size_t size) { >> MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); >> } >> >> void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { >> // NMT doesn't track mappings at the moment. 
>> } >> void ZNMT::unmap(zaddress_unsafe addr, size_t size) { >> // NMT doesn't track mappings at the moment. >> } >> >> >> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. >> >> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: >> >> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. Our Treap-based approach in this patch gives a performance bo... > > Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: > > assert device != nullptr in MemoryFileTracker::instance Started looking at Treap. More tomorrow. About the Treap. Looking at this closer, I don't understand why you don't follow the more canonical way of implementing a treap: - Insert: find lexical position for node x, then do tree rotations until the node priority is in line - Delete: find node, and as long as its not a leaf, rotate until it is. Then cut node out. Would that not be easier to understand, and faster too? No need for recursion either. We would not need merge and split at all. All we need is insert and delete, after all, we don't need any bulk operations on nodes. Then, the Treap definitly needs an ASSERT-only verify to check consistency, and that needs to be called periodically. TreapNode: Aesthetical nits: Its a mix right now. You have getters, but then the owning Treap has private access and uses that. I see you expose the TreapNode to outside access. 
In that case, I would prefer if you would use getters and setters consistently, and remove the friend declaration. Style-wise, the code could be more condensed. File naming: I see we have inconsistencies. We name some files "nmtXXX", some not. All code lives inside "nmt", so the prefix could be superfluous. Just a small nit. BTW, I won't insist on storing callstacks in a linear array. That is just a performance and memory optimization. We can do this later in a follow-up RFE. ------------- PR Review: https://git.openjdk.org/jdk/pull/18289#pullrequestreview-2030296761 PR Comment: https://git.openjdk.org/jdk/pull/18289#issuecomment-2084792287 From shade at openjdk.org Tue Apr 30 09:16:21 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 30 Apr 2024 09:16:21 GMT Subject: Integrated: 8328934: Assert that ABS input and output are legal In-Reply-To: References: Message-ID: On Fri, 12 Apr 2024 08:31:48 GMT, Aleksey Shipilev wrote: > This should protect us from future accidents around `abs` misuse. We have fixed a few separately. I plan to use this as the litmus test in update releases to detect missing backports for actual fixes. I am running more tests to see if we have any other sightings in the current codebase, but this can be reviewed for sanity meanwhile. > > Additional testing: > - [x] MacOS AArch64 server fastdebug build passes > - [x] Linux x86_64 server fastdebug, `all` > - [x] Linux x86_64 server fastdebug, 100K Fuzzer tests > - [x] Linux x86_64 server fastdebug, Maven CTW This pull request has now been integrated.
Changeset: cff841f1 Author: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/cff841f1de41c911ec1b642b998c074e13e75554 Stats: 128 lines in 2 files changed: 127 ins; 0 del; 1 mod 8328934: Assert that ABS input and output are legal Reviewed-by: aph, dlong ------------- PR: https://git.openjdk.org/jdk/pull/18751 From stuefe at openjdk.org Tue Apr 30 09:16:20 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 30 Apr 2024 09:16:20 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v5] In-Reply-To: <7u3imUh6-qb_wLdyZ4mn5SfnEOkxyFEQ20O0fb6WJj0=.3179edcb-0340-4d50-a674-c18128cc2e2f@github.com> References: <-XAziSwGMo20pUAnbdRW1JUk_0ZB-80RVfAHr0iuewE=.bff8f2f7-01e2-46eb-bd4b-1b16fccc6aa1@github.com> <3al4DjsRcIX_qJZNbTGqBDIAOj4bU5l8xpYPHQE8cNM=.7cc0bdfe-c9c8-46ce-ad42-397c61b5a603@github.com> <7u3imUh6-qb_wLdyZ4mn5SfnEOkxyFEQ20O0fb6WJj0=.3179edcb-0340-4d50-a674-c18128cc2e2f@github.com> Message-ID: <3z6o8urlRN3qEViyH6CMdXYByP0LR8mMKBYVe9_xKGI=.db9bb8d2-d4c1-4b23-9667-c0a9b7d7b94f@github.com> On Fri, 22 Mar 2024 16:37:43 GMT, Johan Sj?len wrote: >> src/hotspot/share/nmt/nmtNativeCallStackStorage.hpp line 30: >> >>> 28: #include "utilities/growableArray.hpp" >>> 29: #include "utilities/nativeCallStack.hpp" >>> 30: >> >> Here, I would love it if we had smaller than pointer-sized callstack IDs. >> >> The simplest way would be to copy callstacks into a growable area, and use their numerical indices as ID. We can get by with 16 bits, or 32 bits if we think 64K callstacks are not enough (they should be enough). >> >> Those can then be combined very efficiently with MEMFLAG and VMAState in the VMATree nodes. >> >> The hashmap for reverse id lookup could be a standard hashmap of key=callstack, value=ID. >> >> As a future possible improvement, that could then replace the MallocSiteTable which does almost the same thing. (only have to avoid using malloc then, because of recursivities). 
> > I still don't get how a growable array is supposed to work for open addressing while having the indices staying immutable :-). > > How does this work: > > ```c++ > HTable ht; // Open-addressed GrowableArray with linear probing > Index oldidx = ht.put(4); > // A bunch of puts, leading to a resize of the array > Index newidx = ht.put(4); > assert(oldidx == newidx, "how is this ensured?"); > > > How does this work? Huh? Should the indexes not be stable across resize? Unless you shrink, which we would not need to do? The base address of the array may change, but the relative order of the items in it hopefully not. >> src/hotspot/share/nmt/nmtNativeCallStackStorage.hpp line 84: >> >>> 82: return *_stack; >>> 83: } >>> 84: }; >> >> What is the point of the StackIndex class? > > The `StackIndex` used to be a 4 byte encoding which required the `NativeCallStackStorage` to be able to be dereferenced. This method should at the very least be deleted and replaced with `NativeCallStackStorage::get`. Not sure I follow you. But if you follow my proposal above, StackIndex could be just a 4byte index: On store, place callstack into array. Possibly resize array, if full. Return index. index is now uniquely identifying the stack. If you want to keep the linked list - after all, this is just a performance- and memory-optimization - why not just return a const NativeCallStack* instead of an index? >> src/hotspot/share/nmt/vmatree.hpp line 46: >> >>> 44: >>> 45: // Each node has some stack and a flag associated with it. >>> 46: struct Metadata { >> >> all members const? > > Can't be const unless we want merge to be a function. Okay! 
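A minimal sketch of the index-based call stack storage proposed above, assuming an append-only array for index-to-stack lookup plus a separate hash map for the reverse lookup. `Stack`, `StackHash` and `StackStore` are hypothetical names for illustration; this is not the PR's `NativeCallStackStorage`, and it uses standard containers instead of HotSpot's. Because the array only ever appends, an index handed out once keeps identifying the same stack even after the backing storage is reallocated.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

using Stack = std::vector<std::uintptr_t>;  // program counters of one call stack

// Simple FNV-1a style mix over the frames; any reasonable hash would do.
struct StackHash {
  std::size_t operator()(const Stack& s) const {
    std::uint64_t h = 1469598103934665603ull;
    for (std::uintptr_t pc : s) {
      h ^= static_cast<std::uint64_t>(pc);
      h *= 1099511628211ull;
    }
    return static_cast<std::size_t>(h);
  }
};

class StackStore {
  std::vector<Stack> _stacks;                                 // index -> stack, append-only
  std::unordered_map<Stack, std::uint32_t, StackHash> _ids;   // stack -> index

public:
  // Returns the existing index for a known stack, or appends and returns a new one.
  std::uint32_t put(const Stack& s) {
    auto it = _ids.find(s);
    if (it != _ids.end()) {
      return it->second;
    }
    std::uint32_t id = static_cast<std::uint32_t>(_stacks.size());
    _stacks.push_back(s);
    _ids.emplace(s, id);
    return id;
  }

  const Stack& get(std::uint32_t id) const { return _stacks.at(id); }
  std::size_t size() const { return _stacks.size(); }
};
```

The sketch keeps each stack twice (once in the array, once as a map key) for simplicity; a real implementation would likely avoid that duplication and the use of malloc-backed containers.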
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1584309014 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1584312388 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1584190386 From stuefe at openjdk.org Tue Apr 30 09:16:21 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 30 Apr 2024 09:16:21 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v34] In-Reply-To: References: <4bHEEa5QHK6rN2pH6ftWYE---OlmvQNQ_FjJsfweYjI=.ad832040-2d3c-4847-ba1f-33b9d8cf8c9f@github.com> Message-ID: On Wed, 17 Apr 2024 08:22:31 GMT, Johan Sjölen wrote: > Sure, the allocator isn't responsible for seeding the nodes. It was simply convenient that the class which does allocation also happens to hold the seed, so I could have them do both together. > > I came to the opposite conclusion that you did: I didn't want to have a global shared state seed as that makes me have to think about the behavior of the RNG in the presence of multiple threads and instances of the treap. > Yeah sure, but now you get collisions if you merge two trees together that both have had nodes added before the merge. Because both Treaps will have followed the same RNG sequence - after all, AFAICS your seed is always the same. Either use a global seed, or initialize each Treap seed randomly. The former needs CAS, the latter needs an os::random call on each constructor. Pick your poison :) I also think you do not need the seed constructor argument if you do that.
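The seeding concern can be illustrated with a toy generator. This is only a sketch under stated assumptions: `PriorityGen` is a hypothetical linear congruential generator, not the PR's RNG, and `fresh_seed` uses an atomic `fetch_add` on a process-wide counter as one possible form of the "global seed" option (the message above mentions CAS; `fetch_add` is a different atomic read-modify-write chosen here for brevity).

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Toy 64-bit LCG that returns its full state as the node priority.
// The multiplier is odd, so each step is a bijection on 64-bit values:
// two generators with different seeds never agree at the same step.
class PriorityGen {
  std::uint64_t _state;

public:
  explicit PriorityGen(std::uint64_t seed) : _state(seed) {}

  std::uint64_t next() {
    _state = _state * 6364136223846793005ull + 1442695040888963407ull;
    return _state;
  }
};

// One way to give every tree instance its own seed: a shared atomic counter.
inline std::uint64_t fresh_seed() {
  static std::atomic<std::uint64_t> counter{1};
  return counter.fetch_add(1, std::memory_order_relaxed);
}
```

Two generators built from the same fixed seed hand out identical priority sequences, which is the collision scenario described for merged trees; generators seeded through `fresh_seed()` do not. A production RNG would also want better statistical quality than this toy LCG provides.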
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1584336323 From stuefe at openjdk.org Tue Apr 30 09:16:23 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 30 Apr 2024 09:16:23 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v54] In-Reply-To: References: Message-ID: On Mon, 29 Apr 2024 19:34:41 GMT, Gerard Ziemski wrote: >> Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: >> >> assert device != nullptr in MemoryFileTracker::instance > > src/hotspot/share/nmt/vmatree.cpp line 29: > >> 27: #include "utilities/growableArray.hpp" >> 28: >> 29: VMATree::SummaryDiff VMATree::register_mapping(size_t A, size_t B, StateType state, > > Does `VMATree` stand for "Virtual Memory Allocation Tree"? > > We have some long names already in nmt, e.g. `NativeCallStackStorage`, `MemoryFileTracker`, `MallocSiteHashtableEntry`, `MemSummaryDiffReporter`. Can we then name it: `VirtualMemAllocationTree` or `VirtualMemAllocTree` or `VirtMemAllocTree`? @gerard-ziemski I originally chose "VMA" because it's a clearly defined acronym in Linux and has close counterparts in other OSes as well. I like its brevity. If we really want to rename it, I would name it something like `IntervalTree`, since that's what it really is: managing sets of numeric intervals with attributes. But IMHO VMATree is fine.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1584195115 From jwaters at openjdk.org Tue Apr 30 09:17:05 2024 From: jwaters at openjdk.org (Julian Waters) Date: Tue, 30 Apr 2024 09:17:05 GMT Subject: RFR: 8331352: error: template-id not allowed for constructor/destructor in C++20 In-Reply-To: References: Message-ID: <1ZltzGMdx6rjCX1VNnYGFbYCi6YfskRwK8p_Rn0Hnek=.97e6c9b7-fbdf-4008-b624-cc34cd1e4a4d@github.com> On Tue, 30 Apr 2024 02:01:01 GMT, Jan Kratochvil wrote: > When compiling trunk (819f3d6fc70ff6fe54ac5f9033c17c3dd4326aa5 2024-04-29) by gcc-14.0.1-0.15.fc40.x86_64 there are many errors: > > In file included from src/hotspot/share/memory/allocation.hpp:30, > from src/hotspot/share/ci/ciBaseObject.hpp:29, > from src/hotspot/share/ci/ciMetadata.hpp:28, > from src/hotspot/share/ci/ciType.hpp:28, > from src/hotspot/share/ci/ciKlass.hpp:28, > from src/hotspot/share/ci/ciArrayKlass.hpp:28, > from src/hotspot/share/ci/ciArray.hpp:28, > from src/hotspot/share/ci/compilerInterface.hpp:28, > from src/hotspot/share/compiler/abstractCompiler.hpp:28, > from src/hotspot/share/compiler/abstractCompiler.cpp:25: > src/hotspot/share/utilities/linkedlist.hpp:85:15: error: template-id not allowed for constructor in C++20 [-Werror=template-id-cdtor] > 85 | NONCOPYABLE(LinkedList<E>); > | ^~~~~~~~~~~~~ > src/hotspot/share/utilities/globalDefinitions.hpp:87:26: note: in definition of macro 'NONCOPYABLE' > 87 | #define NONCOPYABLE(C) C(C const&) = delete; C& operator=(C const&) = delete /* next token must be ; */ > | ^ > src/hotspot/share/utilities/linkedlist.hpp:85:15: note: remove the '< >' > 85 | NONCOPYABLE(LinkedList<E>); > | ^~~~~~~~~~~~~ > src/hotspot/share/utilities/globalDefinitions.hpp:87:26: note: in definition of macro 'NONCOPYABLE'
> 87 | #define NONCOPYABLE(C) C(C const&) = delete; C& operator=(C const&) = delete /* next token must be ; */ > | ^ > > In file included from src/hotspot/share/gc/z/zGranuleMap.inline.hpp:30, > from src/hotspot/share/gc/z/zForwardingTable.inline.hpp:32, > from src/hotspot/share/gc/z/zHeap.inline.hpp:30, > from src/hotspot/share/gc/z/zGeneration.inline.hpp:30, > from src/hotspot/share/gc/z/zBarrier.inline.hpp:30, > from src/hotspot/share/gc/z/zBarrierSet.inline.hpp:31, > from src/hotspot/share/gc/shared/barrierSetConfig.inline.hpp:44, > from src/hotspot/share/oops/access.inline.hpp:31, > from src/hotspot/share/memory/iterator.inline.hpp:32, > from src/hotspot/share/oops/oop.inline.hpp:31, > from src/hotspot/share/compiler/abstractDisassembler.cpp:32: > src/hotspot/share/gc/z/zArray.inline.hpp:99:21: error: template-id not allowed f... Seems weird that we're facing C++20 issues when HotSpot is only on C++14. This seems like it should be in the disabled warnings list of HotSpot for erroneous warnings that gcc is giving us, just my 2 cents ------------- PR Comment: https://git.openjdk.org/jdk/pull/19009#issuecomment-2084792626 From azafari at openjdk.org Tue Apr 30 09:18:13 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Tue, 30 Apr 2024 09:18:13 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v7] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> <7TW9a7Vmnz0nIKq83rYx_VN13PXM9_9nD5iSMzGDfNw=.127fd0ff-ee60-40cf-9994-9a1e81bb5b27@github.com> Message-ID: On Tue, 30 Apr 2024 05:46:45 GMT, Thomas Stuefe wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> alignment in coding style changed. 
> > src/hotspot/share/cds/metaspaceShared.cpp line 1322: > >> 1320: os::vm_page_size(), mtClassShared, (char*)base_address); >> 1321: class_space_rs = ReservedSpace(class_space_size, class_space_alignment, >> 1322: os::vm_page_size(), mtClass, (char*)ccs_base); > > Note that here, we place two spaces atop of a region that has been previously mapped with mtClass (see e.g. src/hotspot/cpu/aarch64/compressedKlass_aarch64.cpp). I assume this is not a problem? It should not be a problem. This PR does not change any functionality; it only passes the MEMFLAGS down to the `MemTracker` API. If the above code worked before this PR, it should still work now. From NMT's point of view, `mtClassShared` and `mtJavaHeap` regions are allowed to overlap previously reserved regions. And if a whole region is re-reserved with a different flag, that is also acceptable: the accounting simply moves from the former flag to the new one. I trust that the NMT tests, which all passed, cover these cases. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1584424486 From azafari at openjdk.org Tue Apr 30 09:30:31 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Tue, 30 Apr 2024 09:30:31 GMT Subject: RFR: 8330076: [NMT] add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v14] In-Reply-To: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: > `MEMFLAGS flag` is used to hold/show the type of the memory regions in NMT. Each call to the NMT API requires a search through the list of memory regions. > The Hotspot code reserves/commits/uncommits memory regions and later explicitly calls the NMT API with a specific memory type (e.g., `mtGC`, `mtJavaHeap`) for that region.
Therefore, there are two search in the list of regions per reserve/commit/uncommit operations, one for the operation and another for setting the type of the region. > When the memory type is passed in during reserve/commit/uncommit operations, NMT can use it and avoid the extra search for setting the memory type. > > Tests: tiers1-5 passed on linux-x64, macosx-aarch64 and windows-x64 for debug and non-debug builds. Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: comments applied. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18745/files - new: https://git.openjdk.org/jdk/pull/18745/files/fa350261..72467f68 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18745&range=13 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18745&range=12-13 Stats: 36 lines in 8 files changed: 0 ins; 3 del; 33 mod Patch: https://git.openjdk.org/jdk/pull/18745.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18745/head:pull/18745 PR: https://git.openjdk.org/jdk/pull/18745 From rehn at openjdk.org Tue Apr 30 09:32:14 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Tue, 30 Apr 2024 09:32:14 GMT Subject: RFR: 8331399: RISC-V: Don't us mv instead of la Message-ID: Hi please consider, It makes no sense to use mv instead of la. It doesn't follow the standard mnemonics and it confusing when people use mv when they really mean la. la will do the reloc with movptr in this case, so the code is the same. Testing t1. 
Thanks, Robbin ------------- Commit messages: - Use la() Changes: https://git.openjdk.org/jdk/pull/19014/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=19014&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8331399 Stats: 9 lines in 2 files changed: 0 ins; 7 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/19014.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/19014/head:pull/19014 PR: https://git.openjdk.org/jdk/pull/19014 From aph at openjdk.org Tue Apr 30 09:34:04 2024 From: aph at openjdk.org (Andrew Haley) Date: Tue, 30 Apr 2024 09:34:04 GMT Subject: RFR: 8331393: AArch64: u32 _partial_subtype_ctr loaded/stored as 64 In-Reply-To: <3xNeycdTwfJhuy6uEm2uCcXl5NN9Nc3RElC0gVfPYQQ=.a5a2bb51-dad5-41c7-aa7d-ace5628832dd@github.com> References: <3xNeycdTwfJhuy6uEm2uCcXl5NN9Nc3RElC0gVfPYQQ=.a5a2bb51-dad5-41c7-aa7d-ace5628832dd@github.com> Message-ID: On Tue, 30 Apr 2024 08:51:03 GMT, Robbin Ehn wrote: > Hi, please consider. > > Let's use incw for these. > > Untested, hoping GHA checks this :) > > Thanks, Robbin Looks right. ------------- Marked as reviewed by aph (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/19011#pullrequestreview-2030774461 From jsjolen at openjdk.org Tue Apr 30 09:35:16 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Tue, 30 Apr 2024 09:35:16 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v54] In-Reply-To: References: Message-ID: <8tX-E6rhvM3r0MhHkAmoCaxzyUrQ6ohmV8UDYMdokms=.77f5daee-f371-4ab8-ad98-337cb4fb4111@github.com> On Tue, 30 Apr 2024 09:11:30 GMT, Thomas Stuefe wrote: > Started looking at Treap. More tomorrow. > > About the Treap. Looking at this closer, I don't understand why you don't follow the more canonical way of implementing a treap: > > * Insert: find lexical position for node x, then do tree rotations until the node priority is in line > > * Delete: find node, and as long as its not a leaf, rotate until it is. Then cut node out. 
> > > Would that not be easier to understand, and faster too? No need for recursion either. We would not need merge and split at all. All we need is insert and delete, after all, we don't need any bulk operations on nodes. There seems to be two common ways: - The merge/split thing that I did https://cp-algorithms.com/data_structures/treap.html https://www.geeksforgeeks.org/implementation-of-search-insert-and-delete-in-treap/ - The rotation implementation: https://yourbasic.org/algorithms/treap/ https://www.drdobbs.com/windows/treaps-in-java/184410231 https://opendatastructures.org/ods-java/7_2_Treap_Randomized_Binary.html Both of these are typically implemented in a recursive manner, though the last link is iterative. I found the merge/split to be easier to understand and since the recursion is bounded on the order of `log(n)` not doing it iteratively is fine by me. If it's a deal breaker, I'll rewrite it iteratively using rotations. > > Then, the Treap definitly needs an ASSERT-only verify to check consistency, and that needs to be called periodically. Okay, so this is irrelevant of how the treap is implemented? I guess it'd check for: - Non-degeneration of the depth of the tree (approximately log n) - Uniqueness of keys - Anything else?? > > TreapNode: > > Aesthetical nits: Its a mix right now. You have getters, but then the owning Treap has private access and uses that. I see you expose the TreapNode to outside access. In that case, I would prefer if you would use getters and setters consistently, and remove the friend declaration. Style-wise, the code could be more condensed. > I'll look into that. > File naming: I see we have inconsistencies. We name some files "nmtXXX", some not. All code lives inside "nmt", so the prefix could be superfluous. Just a small nit. Same here. I think that file names have to be unique across the whole of Hotspot (adlc and libadt being exceptions IIRC), so prepending "nmt" to datastructures seemed like a good way to ensure that. 
Really, the memory file tracker should still not have the `nmt` prepended regardless. Addressing the rest of your comments separately. Thanks for having another look. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18289#issuecomment-2084828969 From jkern at openjdk.org Tue Apr 30 09:39:09 2024 From: jkern at openjdk.org (Joachim Kern) Date: Tue, 30 Apr 2024 09:39:09 GMT Subject: RFR: 8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: References: <-XeYeJ0OEmauTYsEoSXxzRmQXSKMOLw87GSpqDnEmug=.5cb7e71f-fea6-4a84-8260-5f515d3d3810@github.com> <18WjPZeDIWkxGIB0BJgyDg5VipCtY4EOlWmIGPWZGCw=.b50cf4a9-61a4-421e-97eb-3dbac94c14f9@github.com> <_xcaF7UUDHA11loD89Dz871vAQgRqMzCdPkahFDfKv8=.a2c6dcbe-5942-4fb7-9d8b-4239ea048e56@github.com> <76P7uKTuqo7IKYr5yBP4Vx1SS0AcEXC_6vDAU6LfIzo=.d939556f-6fab-4009-820b-821376bfdb7c@github.com> <6aR5nvKhz28A1CkxtaAD9CwTjILBjwZrrRwP3988oEc=.72203104-2ae5-40ff-bd87-168b684446e6@ github.com> Message-ID: On Mon, 29 Apr 2024 16:17:13 GMT, Joachim Kern wrote: >> For the impatient, I suggest adopting mechanism 2, i.e. unconditionally >> include in globalDefinitions_gcc.hpp. >> >> We can't include in shared code, and there is a use in shared code >> (in the relatively recently added JavaThread::pretouch_stack). >> >> When I questioned whether we needed to include at all, I referred >> to a Linux man page I'd found on the internet (the same page mdoerr linked >> to), which says (in part) >> >> "By default, modern compilers automatically translate all uses of alloca() >> into the built-in ..." >> >> Apparently I should have kept digging, because it seems that page is >> old/incorrect. A seemingly more recent Linux man page describes a different >> way of handling it that is closer to what we're seeing, but still not quite >> correct. >> >> glibc's includes if __USE_MISC is defined. 
>> One of the ways __USE_MISC can become defined is if _GNU_SOURCE is defined, >> and we define that for both gcc and clang toolchains. >> >> We include in globalDefinitions_gcc.hpp. So when building with gcc, >> globalDefinitions.hpp implicitly includes . >> >> The glibc definition of alloca is >> >> #ifdef __GNUC__ >> # define alloca(size) __builtin_alloca (size) >> #endif /* GCC. */ >> >> So that explains why we don't need any explicit include of when >> building with gcc. I expect there's something similar going on with Visual >> Studio and Xcode/clang. But apparently not with Open XLC clang. > > On AIX `stdlib.h` also would define `alloca`, if `__STRICT_ANSI__` wouldn't be set. > > > 780 #if !defined(__xlC__) || defined(__ibmxl__) || defined(__cplusplus) > 781 #if defined(__IBMCPP__) && !defined(__ibmxl__) > 782 extern "builtin" char *__alloca (size_t); > 783 # define alloca __alloca > 784 #elif defined(__GNUC__) && !defined(__STRICT_ANSI__) > 785 #undef alloca > 786 #define alloca(size) __builtin_alloca (size) > 787 #endif > > > A small plain Testprogramm not using all of the flags we used in jdk build, does not set `__STRICT_ANSI__` and then `alloca` is defined correct. The compiler flag introducing __STRICT_ANSI__ is -std=c++14. If I omit this explicit compiler flag the default is used, which is also c++14. But the default does not set __STRICT_ANSI__ but 2 other defines. I will try a build without -std=c++14 and if this works, we have a solution. Nevertheless i will interrogate IBM what the hell this behavior should be. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1584461665 From rehn at openjdk.org Tue Apr 30 09:44:04 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Tue, 30 Apr 2024 09:44:04 GMT Subject: RFR: 8331393: AArch64: u32 _partial_subtype_ctr loaded/stored as 64 In-Reply-To: References: <3xNeycdTwfJhuy6uEm2uCcXl5NN9Nc3RElC0gVfPYQQ=.a5a2bb51-dad5-41c7-aa7d-ace5628832dd@github.com> Message-ID: <_dJdxCMxMWUzoQ-ZKe-4ppA6c_BXKwGOk3rr915Ja0o=.363a8d4a-3b72-4718-906a-1e08314e72c7@github.com> On Tue, 30 Apr 2024 09:31:27 GMT, Andrew Haley wrote: > Looks right. Thank you. ------------- PR Comment: https://git.openjdk.org/jdk/pull/19011#issuecomment-2084844771 From fyang at openjdk.org Tue Apr 30 09:50:07 2024 From: fyang at openjdk.org (Fei Yang) Date: Tue, 30 Apr 2024 09:50:07 GMT Subject: RFR: 8331393: AArch64: u32 _partial_subtype_ctr loaded/stored as 64 In-Reply-To: <3xNeycdTwfJhuy6uEm2uCcXl5NN9Nc3RElC0gVfPYQQ=.a5a2bb51-dad5-41c7-aa7d-ace5628832dd@github.com> References: <3xNeycdTwfJhuy6uEm2uCcXl5NN9Nc3RElC0gVfPYQQ=.a5a2bb51-dad5-41c7-aa7d-ace5628832dd@github.com> Message-ID: On Tue, 30 Apr 2024 08:51:03 GMT, Robbin Ehn wrote: > Hi, please consider. > > Let's use incw for these. > > Untested, hoping GHA checks this :) > > Thanks, Robbin Marked as reviewed by fyang (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/19011#pullrequestreview-2030807221 From azafari at openjdk.org Tue Apr 30 09:50:18 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Tue, 30 Apr 2024 09:50:18 GMT Subject: RFR: 8330076: NMT: add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v5] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: On Sat, 13 Apr 2024 05:38:11 GMT, Thomas Stuefe wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> mtCode and mtMetaspace were missed from System Dump map > > Just a thought: one (manual) test I would do would be that several JVMs run with the same conditions (I would do at least one with -Xmx=Xms and AlwaysPreTouch) accumulate the same NMT numbers, current, and peak. Just to make sure we use the same flags before and after. Thank you @tstuefe for your review. Some changes were missed and/or not pushed. So, you should be b able to see them now. The comment on removing MEMFLAGS from params of the `uncommit_memory` family of API is not applied since existence of the flag makes the VMATree operations more efficient. I gather the following future PRs that you mentioned, to have individual threads of comments: - unify style of initialization in virtualspace.?pp code. 
- check if any member can be `const`
- re-designing ReservedSpace

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18745#issuecomment-2084855905

From jsjolen at openjdk.org Tue Apr 30 10:02:30 2024
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Tue, 30 Apr 2024 10:02:30 GMT
Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v55]
In-Reply-To:
References:
Message-ID: <0ERWKkGihBXyJ9MJv6QI2AeP12Bg0xCG1WHhHxE4H8M=.2281e940-45ab-4f33-b6c8-f23106f7feb3@github.com>

> Hi,
>
> This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix.
>
> ## `MemoryFileTracker`
>
> The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is:
>
> ```c++
> static MemoryFile* make_device(const char* descriptive_name);
> static void free_device(MemoryFile* device);
>
> static void allocate_memory(MemoryFile* device, size_t offset, size_t size,
>                             MEMFLAGS flag, const NativeCallStack& stack);
> static void free_memory(MemoryFile* device, size_t offset, size_t size);
> ```
>
> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does:
>
> ```c++
> void ZNMT::reserve(zaddress_unsafe start, size_t size) {
>   MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap);
> }
> void ZNMT::commit(zoffset offset, size_t size) {
>   MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC);
> }
> void ZNMT::uncommit(zoffset offset, size_t size) {
>   MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size);
> }
>
> void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) {
>   // NMT doesn't track mappings at the moment.
> }
> void ZNMT::unmap(zaddress_unsafe addr, size_t size) {
>   // NMT doesn't track mappings at the moment.
> }
> ```
>
> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory.
>
> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are:
>
> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this...
Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision:

- Constify Metadata
- Another check

-------------

Changes:
- all: https://git.openjdk.org/jdk/pull/18289/files
- new: https://git.openjdk.org/jdk/pull/18289/files/ed01d703..dc987443

Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=54
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=53-54

Stats: 3 lines in 2 files changed: 1 ins; 0 del; 2 mod
Patch: https://git.openjdk.org/jdk/pull/18289.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289

PR: https://git.openjdk.org/jdk/pull/18289

From jsjolen at openjdk.org Tue Apr 30 10:06:10 2024
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Tue, 30 Apr 2024 10:06:10 GMT
Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v5]
In-Reply-To: <3z6o8urlRN3qEViyH6CMdXYByP0LR8mMKBYVe9_xKGI=.db9bb8d2-d4c1-4b23-9667-c0a9b7d7b94f@github.com>
References: <-XAziSwGMo20pUAnbdRW1JUk_0ZB-80RVfAHr0iuewE=.bff8f2f7-01e2-46eb-bd4b-1b16fccc6aa1@github.com> <3al4DjsRcIX_qJZNbTGqBDIAOj4bU5l8xpYPHQE8cNM=.7cc0bdfe-c9c8-46ce-ad42-397c61b5a603@github.com> <7u3imUh6-qb_wLdyZ4mn5SfnEOkxyFEQ20O0fb6WJj0=.3179edcb-0340-4d50-a674-c18128cc2e2f@github.com> <3z6o8urlRN3qEViyH6CMdXYByP0LR8mMKBYVe9_xKGI=.db9bb8d2-d4c1-4b23-9667-c0a9b7d7b94f@github.com>
Message-ID:

On Tue, 30 Apr 2024 07:59:09 GMT, Thomas Stuefe wrote:

> Should the indexes not be stable across resize?

**No.** The hash is determined as: `int place_to_put_element = hash_of(the_thing) % size_of_array;`

The `size_of_array` will change, so when probing for/inserting the same NCS after a resize a new index may be used. Meaning, we will have duplicate entries. If we're OK with this, then that's fine. It means that equality checking will require dereferencing the index and doing the full NCS comparison.
```c++
GA ht(2); // Size 2
int oldidx = hash(4) % ht.size(); // oldidx == 0
ht.put(oldidx, 4);
// Out of room, resize
ht.grow(4);
// Now imagine you insert oldidx into some treap node's metadata
// Now we're adding the same int, 4, again but get a different index
int newidx = hash(4) % ht.size(); // newidx == 2
// Now what?
```

>> Can't be const unless we want merge to be a function.
>
> Okay!

Merge is gone, and they are const now :-).

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1584467403
PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1584494946

From jsjolen at openjdk.org Tue Apr 30 10:06:11 2024
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Tue, 30 Apr 2024 10:06:11 GMT
Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v54]
In-Reply-To:
References:
Message-ID: <0z8n2nEWkKvUSaSN_UwDFykvB5xENEVGDfr0p4_SKw8=.5731dc50-c0cf-48e3-9f74-28db684a4ebf@github.com>

On Tue, 30 Apr 2024 06:26:25 GMT, Thomas Stuefe wrote:

> If we really want to rename it, I would name it something like IntervalTree. Since that's what it really is - managing sets of numeric intervals with attributes. But IMHO VMATree is fine.

I prefer `VMATree` because a traditional interval tree as per Wikipedia doesn't perform any sort of merging, it's "just" a set of intervals. If we do expand VMA it should naively be `VirtualMemoryAreaTree`.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1584494402 From jsjolen at openjdk.org Tue Apr 30 10:06:10 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Tue, 30 Apr 2024 10:06:10 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v34] In-Reply-To: References: <4bHEEa5QHK6rN2pH6ftWYE---OlmvQNQ_FjJsfweYjI=.ad832040-2d3c-4847-ba1f-33b9d8cf8c9f@github.com> Message-ID: On Tue, 30 Apr 2024 08:19:47 GMT, Thomas Stuefe wrote: > Yeah sure, but now you get collisions if you merge two trees together that both have had nodes added before the merge. Because both Treaps will have followed the same rng sequence - after all, AFAICS your seed is always the same. Alright, yeah, that's a problem if we make that a supported API. `os::random` for each `Treap` constructor and then saving that as the initial seed seems reasonable to me and like a clear improvement. >I also think you do not need the seed constructor argument if you do that. That's true, it might be nice to have a seed constructor argument if you want reproducible tests for example. 
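
The seeding question above can be illustrated with a small, self-contained sketch. This is not the Treap from the PR: `PrioritySource` and `same_sequence` are hypothetical names, and the generator is an ordinary 64-bit linear congruential step chosen only for illustration. It shows why two trees built from the same fixed seed walk through identical priorities, and why an explicit seed argument is still convenient for reproducible tests.

```cpp
#include <cstdint>

// Hypothetical stand-in for a per-treap priority source (illustration only,
// not the HotSpot Treap). Each instance owns its own generator state.
class PrioritySource {
  uint64_t _state;
public:
  // An explicit seed keeps tests reproducible. Production code could pass a
  // value drawn from a global RNG instead (os::random in the discussion above).
  explicit PrioritySource(uint64_t seed) : _state(seed) {}

  // One step of a 64-bit linear congruential generator.
  uint64_t next() {
    _state = _state * 6364136223846793005ULL + 1442695040888963407ULL;
    return _state;
  }
};

// Returns true if two sources started from the given seeds produce the same
// first n priorities.
inline bool same_sequence(uint64_t seed_a, uint64_t seed_b, int n) {
  PrioritySource a(seed_a);
  PrioritySource b(seed_b);
  for (int i = 0; i < n; i++) {
    if (a.next() != b.next()) {
      return false;
    }
  }
  return true;
}
```

With identical seeds the two sequences match step for step, which is the collision scenario raised for merged trees; with different seeds they diverge from the first draw.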
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1584488347 From ihse at openjdk.org Tue Apr 30 10:22:12 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Tue, 30 Apr 2024 10:22:12 GMT Subject: RFR: 8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: References: <-XeYeJ0OEmauTYsEoSXxzRmQXSKMOLw87GSpqDnEmug=.5cb7e71f-fea6-4a84-8260-5f515d3d3810@github.com> <18WjPZeDIWkxGIB0BJgyDg5VipCtY4EOlWmIGPWZGCw=.b50cf4a9-61a4-421e-97eb-3dbac94c14f9@github.com> <_xcaF7UUDHA11loD89Dz871vAQgRqMzCdPkahFDfKv8=.a2c6dcbe-5942-4fb7-9d8b-4239ea048e56@github.com> <76P7uKTuqo7IKYr5yBP4Vx1SS0AcEXC_6vDAU6LfIzo=.d939556f-6fab-4009-820b-821376bfdb7c@github.com> <6aR5nvKhz28A1CkxtaAD9CwTjILBjwZrrRwP3988oEc=.72203104-2ae5-40ff-bd87-168b684446e6@ github.com> Message-ID: <9YS0_M4NEVmNW42dPYHTfuYarIWGCxKdXQqFeWTt4hI=.a427f219-a7db-472f-8e20-722d3bd516d6@github.com> On Tue, 30 Apr 2024 09:36:52 GMT, Joachim Kern wrote: >> On AIX `stdlib.h` also would define `alloca`, if `__STRICT_ANSI__` wouldn't be set. >> >> >> 780 #if !defined(__xlC__) || defined(__ibmxl__) || defined(__cplusplus) >> 781 #if defined(__IBMCPP__) && !defined(__ibmxl__) >> 782 extern "builtin" char *__alloca (size_t); >> 783 # define alloca __alloca >> 784 #elif defined(__GNUC__) && !defined(__STRICT_ANSI__) >> 785 #undef alloca >> 786 #define alloca(size) __builtin_alloca (size) >> 787 #endif >> >> >> A small plain Testprogramm not using all of the flags we used in jdk build, does not set `__STRICT_ANSI__` and then `alloca` is defined correct. > > The compiler flag introducing __STRICT_ANSI__ is -std=c++14. If I omit this explicit compiler flag the default is used, which is also c++14. But the default does not set __STRICT_ANSI__ but 2 other defines. I will try a build without -std=c++14 and if this works, we have a solution. Nevertheless i will interrogate IBM what the hell this behavior should be. 
I don't think leaving out `-std=c++14` for AIX is a good solution.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1584529538

From jsjolen at openjdk.org Tue Apr 30 10:27:14 2024
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Tue, 30 Apr 2024 10:27:14 GMT
Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v54]
In-Reply-To:
References:
Message-ID: <8LHMi4zLTcPIXlU1x3vPg0tPypmBm54eaoElo-kc1vw=.d896ba7c-8a0f-4695-abd3-1e52dbfa5d5f@github.com>

On Mon, 29 Apr 2024 20:34:24 GMT, Gerard Ziemski wrote:

> I'm probably missing something here

Yes :P. Consider this pseudo-code:

    Tree := MakeTree([0 Reserved mtTest 1024), [1024 Committed mtNMT 2048))
    Diff := Tree.Reserve(0, 2048, mtCompiler);

Tree is now [0 Reserved mtCompiler 2048)

What is the Diff? Answer:

    Diff is {mtCompiler : {Reserved: +2048, Committed: 0},
             mtNMT      : {Reserved: -1024, Committed: -1024},
             mtTest     : {Reserved: -1024, Committed: 0 } }

So, the diff indicates every flag that has changed and by how much. Now, in this case we know we only need to care about the reserved diff (we only `reserve` when allocating memory in `MemoryFileTracker`).

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1584536356

From rehn at openjdk.org Tue Apr 30 10:36:05 2024
From: rehn at openjdk.org (Robbin Ehn)
Date: Tue, 30 Apr 2024 10:36:05 GMT
Subject: RFR: 8326306: RISC-V: Re-structure MASM calls and jumps [v2]
In-Reply-To:
References: <1UZeWIQJIEYbPetxWPlhQffyAy4gWXvNiV79i4_3pMQ=.86fb3068-940b-49ea-a2ea-b84a865d4cca@github.com> <0gMQgeYKyAzms64-hBIrltqUSfetu3Kczwr7IwLmF18=.8f583ac0-afff-4f1b-985f-a688cd898ae3@github.com>
Message-ID:

On Mon, 29 Apr 2024 06:30:18 GMT, Robbin Ehn wrote:

>> Hi, Let me try to understand what you mean. Are we going to remove the `relocate` for non-code-cache call at [1] and further improve the `movptr` at [2] making use of `la`? So no need for `call` then as they could be replaced with `rt_call`?
This sounds interesting to me :- ) >> >> [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp#L5031 >> [2] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp#L5033 > > I have not looked at it in detail. > > As mnemonic for **call** is _auipc_ + _jalr_, in hotspot `la()` + `jalr()`. > So sites using call() for non-code-cache was changed to rt_call(), which gets us the same result as the old call(). > > Hence this patch 'tries' to keep the generate assembly the same. Make sense? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18942#discussion_r1584548549 From stuefe at openjdk.org Tue Apr 30 11:32:13 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 30 Apr 2024 11:32:13 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v5] In-Reply-To: References: <-XAziSwGMo20pUAnbdRW1JUk_0ZB-80RVfAHr0iuewE=.bff8f2f7-01e2-46eb-bd4b-1b16fccc6aa1@github.com> <3al4DjsRcIX_qJZNbTGqBDIAOj4bU5l8xpYPHQE8cNM=.7cc0bdfe-c9c8-46ce-ad42-397c61b5a603@github.com> <7u3imUh6-qb_wLdyZ4mn5SfnEOkxyFEQ20O0fb6WJj0=.3179edcb-0340-4d50-a674-c18128cc2e2f@github.com> <3z6o8urlRN3qEViyH6CMdXYByP0LR8mMKBYVe9_xKGI=.db9bb8d2-d4c1-4b23-9667-c0a9b7d7b94f@github.com> Message-ID: On Tue, 30 Apr 2024 09:40:40 GMT, Johan Sj?len wrote: >> Huh? Should the indexes not be stable across resize? Unless you shrink, which we would not need to do? >> >> The base address of the array may change, but the relative order of the items in it hopefully not. > >> Should the indexes not be stable across resize? > > **No.** The hash is determined as: `int place_to_put_element = hash_of(the_thing) % size_of_array;` > > The `size_of_array` will change, so when probing for/inserting the same NCS after a resize a new index may be used. Meaning, we will have duplicate entries. If we're OK with this, then that's fine. 
It means that equality checking will require dereferencing the index and doing the full NCS comparison. > > ```c++ > GA ht(2); // Size 2 > int oldidx = hash(4) % ht.size(); // oldidx == 0 > ht.put(oldidx, 4); > // Out of room, resize > ht.grow(4); > // Now imagine you insert oldidx into some treap node's metadata > // Now we're adding the same int, 4, again but get a different index > int newidx = hash(4) % ht.size(); // newidx == 2 > // Now what? Ah, I get the confusion. This is not what I meant. What I mean was: At the moment you malloc space for NativeCallStack, then keep NativeCallStack* in the hash map. NativeCallStack* now uniquely identifies your stack. What I meant is to place NativeCallStack in a growable array. Now, you have a 32-bit or even a 16-bit index into that array. That index uniquely identifies the stack. You keep that index the hashmap. The hashmap does not change. Hashmap storage has nothing to do with that array. This is not the bucket array. Basically, you replace the malloc for the NativeCallStack with a placement-new in a new growable array. The rest stays the same. But now, you have a 32-bit or even 16-bit index, and that is smaller than a native pointer, which makes it possible to encode the stack information in a tree node much more succinctively. This makes it possible to encode the whole tree node metainfo very comfortably in a single 64-bit value. 
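
As a rough sketch of the idea described above (call stacks stored by value in a growable array, with tree nodes holding only a small index), the following may help. Everything here is an assumption made for illustration: `CallStack` stands in for `NativeCallStack`, `std::vector` stands in for HotSpot's growable array, the 8/8/16-bit field widths are one possible choice, and de-duplication through the hash map is left out.

```cpp
#include <cstdint>
#include <vector>

// Illustrative stand-in for NativeCallStack; the real class holds frame PCs.
struct CallStack {
  uint64_t frames[4];
};

// Stores call stacks by value and hands out small indices instead of pointers.
// Sketch only: no de-duplication, and no handling of more than 65536 stacks.
class CallStackStorage {
  std::vector<CallStack> _stacks;  // stand-in for a growable array
public:
  // Appends a stack and returns its index. The index stays valid when the
  // vector reallocates, unlike a pointer into the vector.
  uint16_t push(const CallStack& s) {
    _stacks.push_back(s);
    return static_cast<uint16_t>(_stacks.size() - 1);
  }

  const CallStack& get(uint16_t idx) const { return _stacks[idx]; }
};

// Packs a memory flag (8 bits), a state (8 bits) and a stack index (16 bits)
// into one 32-bit value; two such values fit into a single 64-bit word.
inline uint32_t pack(uint8_t flag, uint8_t state, uint16_t stack_idx) {
  return uint32_t(flag) | (uint32_t(state) << 8) | (uint32_t(stack_idx) << 16);
}

inline uint8_t  flag_of(uint32_t v)  { return uint8_t(v & 0xFF); }
inline uint8_t  state_of(uint32_t v) { return uint8_t((v >> 8) & 0xFF); }
inline uint16_t stack_of(uint32_t v) { return uint16_t(v >> 16); }
```

The point of the indirection is that a 16-bit index leaves room for the other per-node attributes in the same word, which a full native pointer would not.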
You can even get both in- and out-state of the VMATree into a single 64-bit value like this:

bits 0-7    MEMFLAGS in
bits 8-15   State in
bits 16-31  callstack index in
bits 32-39  MEMFLAGS out
bits 40-47  State out
bits 48-63  callstack index out

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1584629296

From fyang at openjdk.org Tue Apr 30 11:41:06 2024
From: fyang at openjdk.org (Fei Yang)
Date: Tue, 30 Apr 2024 11:41:06 GMT
Subject: RFR: 8331399: RISC-V: Don't us mv instead of la
In-Reply-To:
References:
Message-ID:

On Tue, 30 Apr 2024 09:27:09 GMT, Robbin Ehn wrote:

> Hi please consider,
>
> It makes no sense to use mv instead of la.
> It doesn't follow the standard mnemonics and it confusing when people use mv when they really mean la.
>
> la will do the reloc with movptr in this case, so the code is the same.
>
> Testing t1.
>
> Thanks, Robbin

Yes, that looks more reasonable. Thanks for the cleanup.

-------------

Marked as reviewed by fyang (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/19014#pullrequestreview-2031043738

From stuefe at openjdk.org Tue Apr 30 12:06:15 2024
From: stuefe at openjdk.org (Thomas Stuefe)
Date: Tue, 30 Apr 2024 12:06:15 GMT
Subject: RFR: 8330076: NMT: add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v7]
In-Reply-To: <6wS8eeO2KoYhRkkDxB4YhWStEfLrU2FRtT8CMwYkI74=.bf05a80b-f10b-417e-ba2d-76a86f0a3122@github.com>
References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> <7TW9a7Vmnz0nIKq83rYx_VN13PXM9_9nD5iSMzGDfNw=.127fd0ff-ee60-40cf-9994-9a1e81bb5b27@github.com> <6wS8eeO2KoYhRkkDxB4YhWStEfLrU2FRtT8CMwYkI74=.bf05a80b-f10b-417e-ba2d-76a86f0a3122@github.com>
Message-ID:

On Tue, 30 Apr 2024 08:55:31 GMT, Afshin Zafari wrote:

>> src/hotspot/share/memory/virtualspace.cpp line 57:
>>
>>> 55: }
>>> 56:
>>> 57: ReservedSpace::ReservedSpace(size_t size, size_t preferred_page_size, MEMFLAGS flag) : _fd_for_heap(-1),
_nmt_flag(flag) { >> >> Small nit: Mixture of styles. As much as I dislike it, current style is to initialize things via dedicated initialize methods. I'd rather stay consistent. >> >> That said, I would be more than happy for someone to give these classes a once-over and convert them to the more usual style - using initializer lists. Then, we also can make members const that should be const, e.g. _nmt_flags. Not in this PR though. > > Do you mean to move the initializations on line 57 (and others in the files) to the `initialize` method? Or, just remove it. The initialise function already initialises the member, right? But it's really a small nit. I like your variant more if it were uniquely applied. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1584684694 From rehn at openjdk.org Tue Apr 30 12:27:05 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Tue, 30 Apr 2024 12:27:05 GMT Subject: RFR: 8331399: RISC-V: Don't us mv instead of la In-Reply-To: References: Message-ID: On Tue, 30 Apr 2024 11:38:37 GMT, Fei Yang wrote: > Yes. that looks more reasonable. Thanks for the cleanup. Thanks! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/19014#issuecomment-2085193190 From jkern at openjdk.org Tue Apr 30 12:36:16 2024 From: jkern at openjdk.org (Joachim Kern) Date: Tue, 30 Apr 2024 12:36:16 GMT Subject: RFR: 8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: <9YS0_M4NEVmNW42dPYHTfuYarIWGCxKdXQqFeWTt4hI=.a427f219-a7db-472f-8e20-722d3bd516d6@github.com> References: <-XeYeJ0OEmauTYsEoSXxzRmQXSKMOLw87GSpqDnEmug=.5cb7e71f-fea6-4a84-8260-5f515d3d3810@github.com> <18WjPZeDIWkxGIB0BJgyDg5VipCtY4EOlWmIGPWZGCw=.b50cf4a9-61a4-421e-97eb-3dbac94c14f9@github.com> <_xcaF7UUDHA11loD89Dz871vAQgRqMzCdPkahFDfKv8=.a2c6dcbe-5942-4fb7-9d8b-4239ea048e56@github.com> <76P7uKTuqo7IKYr5yBP4Vx1SS0AcEXC_6vDAU6LfIzo=.d939556f-6fab-4009-820b-821376bfdb7c@github.com> <6aR5nvKhz28A1CkxtaAD9CwTjILBjwZrrRwP3988oEc=.72203104-2ae5-40ff-bd87-168b684446e6@ github.com> <9YS0_M4NEVmNW42dPYHTfuYarIWGCxKdXQqFeWTt4hI=.a427f219-a7db-472f-8e20-722d3bd516d6@github.com> Message-ID: On Tue, 30 Apr 2024 10:19:30 GMT, Magnus Ihse Bursie wrote: >> The compiler flag introducing __STRICT_ANSI__ is -std=c++14. If I omit this explicit compiler flag the default is used, which is also c++14. But the default does not set __STRICT_ANSI__ but 2 other defines. I will try a build without -std=c++14 and if this works, we have a solution. Nevertheless i will interrogate IBM what the hell this behavior should be. > > I don't think leaving out `-std=c++14` for AIX is a good solution. I got it. And what about simply disabling the `__STRICT_ANSI__` with `CFLAGS_OS_DEF_JVM="-DAIX -D_LARGE_FILES -U__STRICT_ANSI__"` in flags-cflags.m4 for AIX. This worked too. The build is fine. 
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1584725053

From ihse at openjdk.org Tue Apr 30 12:49:11 2024
From: ihse at openjdk.org (Magnus Ihse Bursie)
Date: Tue, 30 Apr 2024 12:49:11 GMT
Subject: RFR: 8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3]
In-Reply-To:
References: <-XeYeJ0OEmauTYsEoSXxzRmQXSKMOLw87GSpqDnEmug=.5cb7e71f-fea6-4a84-8260-5f515d3d3810@github.com> <18WjPZeDIWkxGIB0BJgyDg5VipCtY4EOlWmIGPWZGCw=.b50cf4a9-61a4-421e-97eb-3dbac94c14f9@github.com> <_xcaF7UUDHA11loD89Dz871vAQgRqMzCdPkahFDfKv8=.a2c6dcbe-5942-4fb7-9d8b-4239ea048e56@github.com> <76P7uKTuqo7IKYr5yBP4Vx1SS0AcEXC_6vDAU6LfIzo=.d939556f-6fab-4009-820b-821376bfdb7c@github.com> <6aR5nvKhz28A1CkxtaAD9CwTjILBjwZrrRwP3988oEc=.72203104-2ae5-40ff-bd87-168b684446e6@github.com> <9YS0_M4NEVmNW42dPYHTfuYarIWGCxKdXQqFeWTt4hI=.a427f219-a7db-472f-8e20-722d3bd516d6@github.com>
Message-ID: <6HAzyjAGGGHoyHiS_34GvALrggfbGsbAF6IRJlx5WTI=.8f36bfdd-3708-4101-8539-ccfe53cfb6e9@github.com>

On Tue, 30 Apr 2024 12:33:19 GMT, Joachim Kern wrote:

>> I don't think leaving out `-std=c++14` for AIX is a good solution.
>
> I got it. And what about simply disabling the `__STRICT_ANSI__` with
> `CFLAGS_OS_DEF_JVM="-DAIX -D_LARGE_FILES -U__STRICT_ANSI__"` in flags-cflags.m4 for AIX. This worked too. The build is fine.

So what you are saying is basically replacing

```
CFLAGS_OS_DEF_JVM="-DAIX -Dalloca'(size)'=__builtin_alloca'(size)' -D_LARGE_FILES"
```

with

```
CFLAGS_OS_DEF_JVM="-DAIX -D_LARGE_FILES -U__STRICT_ANSI__"
```

?

Yeah, that'll work, I guess. "strict ansi" sounds like a problematic thing to have enabled, and that it is added by `-std=c++14` sounds close to a bug to my ears. So a "workaround" where this is disabled seems reasonable.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1584754261 From pchilanomate at openjdk.org Tue Apr 30 13:11:09 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 30 Apr 2024 13:11:09 GMT Subject: RFR: 8330817: jdk/internal/vm/Continuation/OSRTest.java times out on libgraal In-Reply-To: References: Message-ID: On Mon, 22 Apr 2024 21:55:38 GMT, Patricio Chilano Mateo wrote: > Small test fix to prevent inlining of foo/fooBigFrame. Tested with Graal repo and verified timeout doesn't happen anymore. > > Thanks, > Patricio Thanks for the reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/18905#issuecomment-2085291936 From pchilanomate at openjdk.org Tue Apr 30 13:11:09 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 30 Apr 2024 13:11:09 GMT Subject: Integrated: 8330817: jdk/internal/vm/Continuation/OSRTest.java times out on libgraal In-Reply-To: References: Message-ID: <19kdIY0uou_-ckHXIub1gM7wJAttvzbiwI5wGUelMNU=.5d65ae86-5473-4450-8b88-64b136958652@github.com> On Mon, 22 Apr 2024 21:55:38 GMT, Patricio Chilano Mateo wrote: > Small test fix to prevent inlining of foo/fooBigFrame. Tested with Graal repo and verified timeout doesn't happen anymore. > > Thanks, > Patricio This pull request has now been integrated. 
Changeset: 22a1c617 Author: Patricio Chilano Mateo URL: https://git.openjdk.org/jdk/commit/22a1c617dbe771d8f5cea52af0e2a630af34b35b Stats: 7 lines in 1 file changed: 0 ins; 0 del; 7 mod 8330817: jdk/internal/vm/Continuation/OSRTest.java times out on libgraal Reviewed-by: dnsimon, dlong ------------- PR: https://git.openjdk.org/jdk/pull/18905 From stefank at openjdk.org Tue Apr 30 13:35:16 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 30 Apr 2024 13:35:16 GMT Subject: RFR: 8330076: NMT: add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v7] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> <7TW9a7Vmnz0nIKq83rYx_VN13PXM9_9nD5iSMzGDfNw=.127fd0ff-ee60-40cf-9994-9a1e81bb5b27@github.com> <6wS8eeO2KoYhRkkDxB4YhWStEfLrU2FRtT8CMwYkI74=.bf05a80b-f10b-417e-ba2d-76a86f0a3122@github.com> Message-ID: On Tue, 30 Apr 2024 12:03:47 GMT, Thomas Stuefe wrote: >> Do you mean to move the initializations on line 57 (and others in the files) to the `initialize` method? > > Or, just remove it. The initialise function already initialises the member, right? > > But it's really a small nit. I like your variant more if it were uniquely applied. FWIW, in an earlier comment I also mentioned that we really should take a pass over this class and clean the code in various ways. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1584830726 From stefank at openjdk.org Tue Apr 30 13:35:17 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 30 Apr 2024 13:35:17 GMT Subject: RFR: 8330076: NMT: add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v7] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> <7TW9a7Vmnz0nIKq83rYx_VN13PXM9_9nD5iSMzGDfNw=.127fd0ff-ee60-40cf-9994-9a1e81bb5b27@github.com> Message-ID: On Tue, 30 Apr 2024 08:59:21 GMT, Afshin Zafari wrote: >> src/hotspot/share/memory/virtualspace.cpp line 623: >> >>> 621: } >>> 622: // _nmt_flag is used internally by initialize_compressed_heap >>> 623: _nmt_flag = mtJavaHeap; >> >> Nit, we use a mixture of directly accessing _nmt_flag and accessing it via getter. Hotspot seems to prefer getters/setters. Can we use setters here? > > The flag is not set/changed in other classes, so there is no need to have a `public set_nmt_flag()` member for it. > All the changes to the flag can be done internally using the member directly. > P.S.: There was already a setter but removed after a review comment. > Hotspot seems to prefer getters/setters. I don't think this is true. Maybe in some places, but I don't think we prefer to use setters/getters from inside classes. Maybe if we want to add some verification code, but otherwise I tend to prefer using the members directly. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18745#discussion_r1584827626 From jkern at openjdk.org Tue Apr 30 13:45:13 2024 From: jkern at openjdk.org (Joachim Kern) Date: Tue, 30 Apr 2024 13:45:13 GMT Subject: RFR: 8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: <6HAzyjAGGGHoyHiS_34GvALrggfbGsbAF6IRJlx5WTI=.8f36bfdd-3708-4101-8539-ccfe53cfb6e9@github.com> References: <-XeYeJ0OEmauTYsEoSXxzRmQXSKMOLw87GSpqDnEmug=.5cb7e71f-fea6-4a84-8260-5f515d3d3810@github.com> <18WjPZeDIWkxGIB0BJgyDg5VipCtY4EOlWmIGPWZGCw=.b50cf4a9-61a4-421e-97eb-3dbac94c14f9@github.com> <_xcaF7UUDHA11loD89Dz871vAQgRqMzCdPkahFDfKv8=.a2c6dcbe-5942-4fb7-9d8b-4239ea048e56@github.com> <76P7uKTuqo7IKYr5yBP4Vx1SS0AcEXC_6vDAU6LfIzo=.d939556f-6fab-4009-820b-821376bfdb7c@github.com> <6aR5nvKhz28A1CkxtaAD9CwTjILBjwZrrRwP3988oEc=.72203104-2ae5-40ff-bd87-168b684446e6@ github.com> <9YS0_M4NEVmNW42dPYHTfuYarIWGCxKdXQqFeWTt4hI=.a427f219-a7db-472f-8e20-722d3bd516d6@github.com> <6HAzyjAGGGHoyHiS_34GvALrggfbGsbAF6IRJlx5WTI=.8f36bfdd-3708-4101-8539-ccfe53cfb6e9@github.com> Message-ID: On Tue, 30 Apr 2024 12:46:31 GMT, Magnus Ihse Bursie wrote: >> I got it. And what about simply disabling the `__STRICT_ANSI__` with >> `CFLAGS_OS_DEF_JVM="-DAIX -D_LARGE_FILES -U__STRICT_ANSI__"` in flags-cflags.m4 for AIX. This worked too. The build is fine. > > So what you are saing is basically replacing > > CFLAGS_OS_DEF_JVM="-DAIX -Dalloca'(size)'=__builtin_alloca'(size)' -D_LARGE_FILES" > ``` > with > > CFLAGS_OS_DEF_JVM="-DAIX -D_LARGE_FILES -U__STRICT_ANSI__" > ``` > ? > > Yeah, that'll work, I guess. "strict ansi" sounds like a problematic thing to have enabled, and that it is added by `-std=c++14` sounds close to a bug in my ears. So a "workaround" where this is disabled seem reasonable. Yes this would be the replacement. This is our 4th way to fix the issue. Anyone else who would prefer this too? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1584847372 From fyang at openjdk.org Tue Apr 30 13:49:05 2024 From: fyang at openjdk.org (Fei Yang) Date: Tue, 30 Apr 2024 13:49:05 GMT Subject: RFR: 8326306: RISC-V: Re-structure MASM calls and jumps [v2] In-Reply-To: References: <1UZeWIQJIEYbPetxWPlhQffyAy4gWXvNiV79i4_3pMQ=.86fb3068-940b-49ea-a2ea-b84a865d4cca@github.com> <0gMQgeYKyAzms64-hBIrltqUSfetu3Kczwr7IwLmF18=.8f583ac0-afff-4f1b-985f-a688cd898ae3@github.com> Message-ID: <4iLVM5rBRUo43EgY72DPBxJJ3qaHC4Nx_aWBUW9pIM8=.1f7cdee2-15d8-4b0f-b4ac-082f23198d8e@github.com> On Tue, 30 Apr 2024 10:33:45 GMT, Robbin Ehn wrote: >> I have not looked at it in detail. >> >> As mnemonic for **call** is _auipc_ + _jalr_, in hotspot `la()` + `jalr()`. >> So sites using call() for non-code-cache was changed to rt_call(), which gets us the same result as the old call(). >> >> Hence this patch 'tries' to keep the generate assembly the same. > > Make sense? I am still think about the possibility of unifying `call` and `rt_call`. Having both of them could be confusing to me (and new comers I guess). What I am talking about in my previous comment was something like this add-on change: [addon.diff.txt](https://github.com/openjdk/jdk/files/15164874/addon.diff.txt) What do you think? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18942#discussion_r1584856424 From mdoerr at openjdk.org Tue Apr 30 14:03:13 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 30 Apr 2024 14:03:13 GMT Subject: RFR: 8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: References: <-XeYeJ0OEmauTYsEoSXxzRmQXSKMOLw87GSpqDnEmug=.5cb7e71f-fea6-4a84-8260-5f515d3d3810@github.com> <18WjPZeDIWkxGIB0BJgyDg5VipCtY4EOlWmIGPWZGCw=.b50cf4a9-61a4-421e-97eb-3dbac94c14f9@github.com> <_xcaF7UUDHA11loD89Dz871vAQgRqMzCdPkahFDfKv8=.a2c6dcbe-5942-4fb7-9d8b-4239ea048e56@github.com> <76P7uKTuqo7IKYr5yBP4Vx1SS0AcEXC_6vDAU6LfIzo=.d939556f-6fab-4009-820b-821376bfdb7c@github.com> <6aR5nvKhz28A1CkxtaAD9CwTjILBjwZrrRwP3988oEc=.72203104-2ae5-40ff-bd87-168b684446e6@github.com> Message-ID: On Thu, 18 Apr 2024 04:26:21 GMT, Kim Barrett wrote: >> I opened https://bugs.openjdk.org/browse/JDK-8330539 so we don't lose track of this, but we can keep the discussion/voting here. > > For the impatient, I suggest adopting mechanism 2, i.e. unconditionally > include <alloca.h> in globalDefinitions_gcc.hpp. > > We can't include <alloca.h> in shared code, and there is a use in shared code > (in the relatively recently added JavaThread::pretouch_stack). > > When I questioned whether we needed to include <alloca.h> at all, I referred > to a Linux man page I'd found on the internet (the same page mdoerr linked > to), which says (in part) > > "By default, modern compilers automatically translate all uses of alloca() > into the built-in ..." > > Apparently I should have kept digging, because it seems that page is > old/incorrect. A seemingly more recent Linux man page describes a different > way of handling it that is closer to what we're seeing, but still not quite > correct. > > glibc's <stdlib.h> includes <alloca.h> if __USE_MISC is defined. > One of the ways __USE_MISC can become defined is if _GNU_SOURCE is defined, > and we define that for both gcc and clang toolchains. > > We include <stdlib.h> in globalDefinitions_gcc.hpp.
So when building with gcc, > globalDefinitions.hpp implicitly includes <alloca.h>. > > The glibc definition of alloca is > > #ifdef __GNUC__ > # define alloca(size) __builtin_alloca (size) > #endif /* GCC. */ > > So that explains why we don't need any explicit include of <alloca.h> when > building with gcc. I expect there's something similar going on with Visual > Studio and Xcode/clang. But apparently not with Open XLC clang. Ok for me. Let's hear what @kimbarrett thinks. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1584884372 From stuefe at openjdk.org Tue Apr 30 14:26:15 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 30 Apr 2024 14:26:15 GMT Subject: RFR: 8330076: NMT: add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v14] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: On Tue, 30 Apr 2024 09:30:31 GMT, Afshin Zafari wrote: >> `MEMFLAGS flag` is used to hold/show the type of the memory regions in NMT. Each call of the NMT API requires a search through the list of memory regions. >> The Hotspot code reserves/commits/uncommits memory regions and later explicitly calls the NMT API with a specific memory type (e.g., `mtGC`, `mtJavaHeap`) for that region. Therefore, there are two searches in the list of regions per reserve/commit/uncommit operation, one for the operation and another for setting the type of the region. >> When the memory type is passed in during reserve/commit/uncommit operations, NMT can use it and avoid the extra search for setting the memory type. >> >> Tests: tiers1-5 passed on linux-x64, macosx-aarch64 and windows-x64 for debug and non-debug builds. > > Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: > > comments applied. Very good, let's ship this. Any remaining concerns, if there are any, we can address in subsequent RFEs.
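The "two searches versus one" argument in the quoted PR description can be sketched with a toy tracker. This is an illustration under stated assumptions, not the NMT implementation: `MemTag`, `RegionTracker`, `set_tag` and the lookup counter are hypothetical stand-ins, and a `std::map` replaces the real region list.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Toy stand-in for a memory type; NOT the HotSpot MEMFLAGS enum.
enum class MemTag { None, GC, JavaHeap };

class RegionTracker {
  struct Region { std::size_t size; MemTag tag; };
  std::map<std::size_t, Region> _regions;  // keyed by base address
  int _lookups = 0;                        // searches performed so far

 public:
  // Old pattern, step 1: record the region without a type (one search).
  void reserve(std::size_t base, std::size_t size) {
    _lookups++;
    _regions[base] = Region{size, MemTag::None};
  }

  // Old pattern, step 2: find the same region again just to set its type.
  void set_tag(std::size_t base, MemTag tag) {
    _lookups++;
    auto it = _regions.find(base);
    if (it != _regions.end()) {
      it->second.tag = tag;
    }
  }

  // New pattern: the type arrives with the operation (one search in total).
  void reserve(std::size_t base, std::size_t size, MemTag tag) {
    _lookups++;
    _regions[base] = Region{size, tag};
  }

  MemTag tag_of(std::size_t base) const {
    auto it = _regions.find(base);
    return it == _regions.end() ? MemTag::None : it->second.tag;
  }

  int lookups() const { return _lookups; }
};
```

In this model `reserve(base, size)` followed by `set_tag(base, tag)` costs two searches, while `reserve(base, size, tag)` reaches the same state with one.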
------------- Marked as reviewed by stuefe (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18745#pullrequestreview-2031528992 From ihse at openjdk.org Tue Apr 30 14:42:13 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Tue, 30 Apr 2024 14:42:13 GMT Subject: RFR: 8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: References: <-XeYeJ0OEmauTYsEoSXxzRmQXSKMOLw87GSpqDnEmug=.5cb7e71f-fea6-4a84-8260-5f515d3d3810@github.com> <18WjPZeDIWkxGIB0BJgyDg5VipCtY4EOlWmIGPWZGCw=.b50cf4a9-61a4-421e-97eb-3dbac94c14f9@github.com> <_xcaF7UUDHA11loD89Dz871vAQgRqMzCdPkahFDfKv8=.a2c6dcbe-5942-4fb7-9d8b-4239ea048e56@github.com> <76P7uKTuqo7IKYr5yBP4Vx1SS0AcEXC_6vDAU6LfIzo=.d939556f-6fab-4009-820b-821376bfdb7c@github.com> <6aR5nvKhz28A1CkxtaAD9CwTjILBjwZrrRwP3988oEc=.72203104-2ae5-40ff-bd87-168b684446e6@github.com> Message-ID: On Tue, 30 Apr 2024 14:00:25 GMT, Martin Doerr wrote: >> For the impatient, I suggest adopting mechanism 2, i.e. unconditionally >> include <alloca.h> in globalDefinitions_gcc.hpp. >> >> We can't include <alloca.h> in shared code, and there is a use in shared code >> (in the relatively recently added JavaThread::pretouch_stack). >> >> When I questioned whether we needed to include <alloca.h> at all, I referred >> to a Linux man page I'd found on the internet (the same page mdoerr linked >> to), which says (in part) >> >> "By default, modern compilers automatically translate all uses of alloca() >> into the built-in ..." >> >> Apparently I should have kept digging, because it seems that page is >> old/incorrect. A seemingly more recent Linux man page describes a different >> way of handling it that is closer to what we're seeing, but still not quite >> correct. >> >> glibc's <stdlib.h> includes <alloca.h> if __USE_MISC is defined. >> One of the ways __USE_MISC can become defined is if _GNU_SOURCE is defined, >> and we define that for both gcc and clang toolchains. >> >> We include <stdlib.h> in globalDefinitions_gcc.hpp.
So when building with gcc, >> globalDefinitions.hpp implicitly includes <alloca.h>. >> >> The glibc definition of alloca is >> >> #ifdef __GNUC__ >> # define alloca(size) __builtin_alloca (size) >> #endif /* GCC. */ >> >> So that explains why we don't need any explicit include of <alloca.h> when >> building with gcc. I expect there's something similar going on with Visual >> Studio and Xcode/clang. But apparently not with Open XLC clang. > > Ok for me. Let's hear what @kimbarrett thinks. It might be easier to get input if you create a new PR with the change. This discussion is hidden deep down in a closed PR. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1584947979 From gziemski at openjdk.org Tue Apr 30 15:17:10 2024 From: gziemski at openjdk.org (Gerard Ziemski) Date: Tue, 30 Apr 2024 15:17:10 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v54] In-Reply-To: <0z8n2nEWkKvUSaSN_UwDFykvB5xENEVGDfr0p4_SKw8=.5731dc50-c0cf-48e3-9f74-28db684a4ebf@github.com> References: <0z8n2nEWkKvUSaSN_UwDFykvB5xENEVGDfr0p4_SKw8=.5731dc50-c0cf-48e3-9f74-28db684a4ebf@github.com> Message-ID: On Tue, 30 Apr 2024 09:58:32 GMT, Johan Sjölen wrote: >> @gerard-ziemski I originally chose "VMA" because it's a clearly defined acronym in Linux and has close counterparts in other OSes as well. I like its brevity. >> >> If we really want to rename it, I would name it something like `IntervalTree`. Since that's what it really is - managing sets of numeric intervals with attributes. But IMHO VMATree is fine. > >>If we really want to rename it, I would name it something like IntervalTree. Since that's what it really is - managing sets of numeric intervals with attributes. But IMHO VMATree is fine. > > I prefer `VMATree` because a traditional interval tree as per Wikipedia doesn't perform any sort of merging, it's "just" a set of intervals. > > If we do expand VMA it should naively be `VirtualMemoryAreaTree`. A google search for `VMATree` is not very revealing.
Anyone new to this area will be left wondering what this data structure is (I was). I vote for a more fleshed-out name that points the reader in the right direction (a google search on `VirtualMemoryAreaTree` looks more promising). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1585009962 From azafari at openjdk.org Tue Apr 30 15:22:21 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Tue, 30 Apr 2024 15:22:21 GMT Subject: RFR: 8330076: NMT: add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v13] In-Reply-To: References: <5GDKVVPITIzIcyfm-0tKOFzFIEPBgzOe-or1eX_POns=.a5205641-139b-4749-afcc-57ddbc85e6be@github.com> Message-ID: <2qU9ixg7RaT5D5_Ct2ecqd4t5TS9ybjVBus8yEHeubo=.10078568-91ca-42e3-8b73-fede80eab78b@github.com> On Tue, 23 Apr 2024 06:31:30 GMT, David Holmes wrote: >> Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: >> >> removed extra blank line. > > This is a big change, but the pattern of the changes is quite easy to follow. > > I do have a couple of queries below. > > Thanks @dholmes-ora, I am not sure whether all your comments have been addressed. Would you please have a look here? Thanks.
------------- PR Comment: https://git.openjdk.org/jdk/pull/18745#issuecomment-2085638740 From jkern at openjdk.org Tue Apr 30 15:22:19 2024 From: jkern at openjdk.org (Joachim Kern) Date: Tue, 30 Apr 2024 15:22:19 GMT Subject: RFR: 8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: References: <-XeYeJ0OEmauTYsEoSXxzRmQXSKMOLw87GSpqDnEmug=.5cb7e71f-fea6-4a84-8260-5f515d3d3810@github.com> <18WjPZeDIWkxGIB0BJgyDg5VipCtY4EOlWmIGPWZGCw=.b50cf4a9-61a4-421e-97eb-3dbac94c14f9@github.com> <_xcaF7UUDHA11loD89Dz871vAQgRqMzCdPkahFDfKv8=.a2c6dcbe-5942-4fb7-9d8b-4239ea048e56@github.com> <76P7uKTuqo7IKYr5yBP4Vx1SS0AcEXC_6vDAU6LfIzo=.d939556f-6fab-4009-820b-821376bfdb7c@github.com> <6aR5nvKhz28A1CkxtaAD9CwTjILBjwZrrRwP3988oEc=.72203104-2ae5-40ff-bd87-168b684446e6@ github.com> Message-ID: On Tue, 30 Apr 2024 14:39:29 GMT, Magnus Ihse Bursie wrote: >> Ok for me. Let's hear what @kimbarrett thinks. > > It might be easier to get input if you create a new PR with the change. This discussion is hidden deep down in a closed PR. I will do after labor day and create a PR with this suggested solution in your JDK-8330539. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1585020690 From stuefe at openjdk.org Tue Apr 30 16:15:13 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 30 Apr 2024 16:15:13 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v54] In-Reply-To: References: <0z8n2nEWkKvUSaSN_UwDFykvB5xENEVGDfr0p4_SKw8=.5731dc50-c0cf-48e3-9f74-28db684a4ebf@github.com> Message-ID: On Tue, 30 Apr 2024 15:14:31 GMT, Gerard Ziemski wrote: >>>If we really want to rename it, I would name it something like IntervalTree. Since that's what it really is - managing sets of numeric intervals with attributes. But IMHO VMATree is fine. >> >> I prefer `VMATree` because a traditional interval tree as per Wikipedia doesn't perform any sort of merging, it's "just" a set of intervals. 
>> >> If we do expand VMA it should naively be `VirtualMemoryAreaTree`. > > A google search for `VMATree` is not very revealing. Anyone new to this area, will be left wondering what this data structure is (I was). > > I vote for more flushed out name that gets the reader pointed in the right direction (google search on `VirtualMemoryAreaTree` looks more promising). Neither VMATree nor VirtualMemoryAreaTree are correct. VMATree has the advantage of brevity and the fact that its already ingrained in our brains, so we had conversations about it and at least one [concept document](https://gist.github.com/tstuefe/d9682b7f11b3375da27faa100f45e621) exists. VirtualMemoryAreaTree OTOH is incorrect and unclear on several levels. The tree is not a "MemoryTree" in its current form, since it tracks offsets into a file. And it's not "Virtual" by any stretch of the word. It would join the zoo of confusingly named classes like VirtualMemorySpace, which is neither more nor less Virtual than its parent class ReservedSpace. If we rename it at all, I vote for IntervalTree. Because that's precisely what it is. It tracks intervals with attributes. And if templatized, we can reuse it for any kind of numerical region tracking and any kind of index types that we want. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1585138767 From eastigeevich at openjdk.org Tue Apr 30 16:25:04 2024 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Tue, 30 Apr 2024 16:25:04 GMT Subject: RFR: 8326085: Remove unnecessary UpcallContext constructor In-Reply-To: References: Message-ID: On Fri, 26 Apr 2024 17:42:48 GMT, Sonia Zaldana Calles wrote: > Hi all, > > This PR removes the explicit constructor to UpcallContext (hotspot/share/prims/upcallLinker.cpp) that was added as workaround for [8286891](https://bugs.openjdk.org/browse/JDK-8286891). 
> > The minimum required version of XLC has since been bumped in [8325880](https://bugs.openjdk.org/browse/JDK-8325880), so we can remove this. > > Thanks, > Sonia @SoniaZaldana, Have you tested your change does not trigger JDK-8286891 or any other errors? ------------- PR Comment: https://git.openjdk.org/jdk/pull/18982#issuecomment-2085858264 From kbarrett at openjdk.org Tue Apr 30 16:39:13 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 30 Apr 2024 16:39:13 GMT Subject: RFR: 8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc [v3] In-Reply-To: References: <-XeYeJ0OEmauTYsEoSXxzRmQXSKMOLw87GSpqDnEmug=.5cb7e71f-fea6-4a84-8260-5f515d3d3810@github.com> <18WjPZeDIWkxGIB0BJgyDg5VipCtY4EOlWmIGPWZGCw=.b50cf4a9-61a4-421e-97eb-3dbac94c14f9@github.com> <_xcaF7UUDHA11loD89Dz871vAQgRqMzCdPkahFDfKv8=.a2c6dcbe-5942-4fb7-9d8b-4239ea048e56@github.com> <76P7uKTuqo7IKYr5yBP4Vx1SS0AcEXC_6vDAU6LfIzo=.d939556f-6fab-4009-820b-821376bfdb7c@github.com> <6aR5nvKhz28A1CkxtaAD9CwTjILBjwZrrRwP3988oEc=.72203104-2ae5-40ff-bd87-168b684446e6@ github.com> Message-ID: <1EgO9Z2UdqtUYN2oNClYl_evpBDw1asCxQRWPk0w_6E=.db209d00-845d-44bc-9ca1-e5c533087638@github.com> On Tue, 30 Apr 2024 15:19:47 GMT, Joachim Kern wrote: >> It might be easier to get input if you create a new PR with the change. This discussion is hidden deep down in a closed PR. > > I will do after labor day and create a PR with this suggested solution in your JDK-8330539. I think I still prefer just unconditionally including in globalDefinitions_gcc.hpp. For gcc/clang we are using `-std=c++14` + `-D_GNU_SOURCE` instead of `-std=gnu++14`. I forget exactly why. I don't really want to be messing with `__STRICT_ANSI__`. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1585181094 From gziemski at openjdk.org Tue Apr 30 16:57:11 2024 From: gziemski at openjdk.org (Gerard Ziemski) Date: Tue, 30 Apr 2024 16:57:11 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v54] In-Reply-To: References: <0z8n2nEWkKvUSaSN_UwDFykvB5xENEVGDfr0p4_SKw8=.5731dc50-c0cf-48e3-9f74-28db684a4ebf@github.com> Message-ID: <6zJC0o26l14DRheFDsjmnUnuytxe-aEEz8mOFrCTk1o=.3ab3f0ef-631d-4fc6-9eee-d7d210f11ea7@github.com> On Tue, 30 Apr 2024 16:12:31 GMT, Thomas Stuefe wrote: >> A google search for `VMATree` is not very revealing. Anyone new to this area, will be left wondering what this data structure is (I was). >> >> I vote for more flushed out name that gets the reader pointed in the right direction (google search on `VirtualMemoryAreaTree` looks more promising). > > Neither VMATree nor VirtualMemoryAreaTree are correct. VMATree has the advantage of brevity and the fact that its already ingrained in our brains, so we had conversations about it and at least one [concept document](https://gist.github.com/tstuefe/d9682b7f11b3375da27faa100f45e621) exists. > > VirtualMemoryAreaTree OTOH is incorrect and unclear on several levels. > > The tree is not a "MemoryTree" in its current form, since it tracks offsets into a file. And it's not "Virtual" by any stretch of the word. It would join the zoo of confusingly named classes like VirtualMemorySpace, which is neither more nor less Virtual than its parent class ReservedSpace. > > If we rename it at all, I vote for IntervalTree. Because that's precisely what it is. It tracks intervals with attributes. And if templatized, we can reuse it for any kind of numerical region tracking and any kind of index types that we want. Googling for `IntervalTree` returns useful links. I like it. 
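To make "it tracks intervals with attributes" and the merging behaviour mentioned earlier in the thread concrete, here is a small sketch. It is not the VMATree from the PR: it assumes a `std::map` of boundary points instead of a treap, and `Tag`, `IntervalMap`, `set_range`, `tag_at` and `boundary_count` are hypothetical names. Each map entry says "from this position onwards the tag is X", so adjacent ranges with equal tags collapse into one.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>

// Hypothetical attribute; Free means "not tracked".
enum class Tag { Free, Heap, Code };

class IntervalMap {
  // Key: position where the tag changes. Value: tag in effect from that
  // position up to (but excluding) the next key.
  std::map<std::size_t, Tag> _points;

  // Tag in effect immediately before the boundary that 'it' refers to.
  Tag tag_before(std::map<std::size_t, Tag>::const_iterator it) const {
    return it == _points.begin() ? Tag::Free : std::prev(it)->second;
  }

 public:
  Tag tag_at(std::size_t pos) const {
    return tag_before(_points.upper_bound(pos));
  }

  // Assigns 'tag' to the half-open range [from, to).
  void set_range(std::size_t from, std::size_t to, Tag tag) {
    assert(from < to);
    const Tag after = tag_at(to);                              // tag that resumes at 'to'
    const Tag before = tag_before(_points.lower_bound(from));  // tag just left of 'from'
    // Drop every boundary inside [from, to]; the new range overrides them.
    _points.erase(_points.lower_bound(from), _points.upper_bound(to));
    // Only keep boundaries where the tag really changes: this is the merge.
    if (tag != before) _points[from] = tag;
    if (after != tag) _points[to] = after;
  }

  std::size_t boundary_count() const { return _points.size(); }
};
```

For example, marking `[0, 100)` and then `[100, 200)` with the same tag leaves two boundaries (at 0 and 200) rather than four, which is the merging that a textbook interval tree does not do.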
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1585203185 From gziemski at openjdk.org Tue Apr 30 17:14:10 2024 From: gziemski at openjdk.org (Gerard Ziemski) Date: Tue, 30 Apr 2024 17:14:10 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v54] In-Reply-To: <6zJC0o26l14DRheFDsjmnUnuytxe-aEEz8mOFrCTk1o=.3ab3f0ef-631d-4fc6-9eee-d7d210f11ea7@github.com> References: <0z8n2nEWkKvUSaSN_UwDFykvB5xENEVGDfr0p4_SKw8=.5731dc50-c0cf-48e3-9f74-28db684a4ebf@github.com> <6zJC0o26l14DRheFDsjmnUnuytxe-aEEz8mOFrCTk1o=.3ab3f0ef-631d-4fc6-9eee-d7d210f11ea7@github.com> Message-ID: On Tue, 30 Apr 2024 16:54:30 GMT, Gerard Ziemski wrote: >> Neither VMATree nor VirtualMemoryAreaTree are correct. VMATree has the advantage of brevity and the fact that its already ingrained in our brains, so we had conversations about it and at least one [concept document](https://gist.github.com/tstuefe/d9682b7f11b3375da27faa100f45e621) exists. >> >> VirtualMemoryAreaTree OTOH is incorrect and unclear on several levels. >> >> The tree is not a "MemoryTree" in its current form, since it tracks offsets into a file. And it's not "Virtual" by any stretch of the word. It would join the zoo of confusingly named classes like VirtualMemorySpace, which is neither more nor less Virtual than its parent class ReservedSpace. >> >> If we rename it at all, I vote for IntervalTree. Because that's precisely what it is. It tracks intervals with attributes. And if templatized, we can reuse it for any kind of numerical region tracking and any kind of index types that we want. > > Googling for `IntervalTree` returns useful links. I like it. If the class is general enough, then perhaps it should be moved into `shared/utilities` so others can use it as well? A candidate for a follow up later? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1585221888 From stuefe at openjdk.org Tue Apr 30 17:37:16 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 30 Apr 2024 17:37:16 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v34] In-Reply-To: References: <4bHEEa5QHK6rN2pH6ftWYE---OlmvQNQ_FjJsfweYjI=.ad832040-2d3c-4847-ba1f-33b9d8cf8c9f@github.com> Message-ID: On Tue, 30 Apr 2024 09:54:31 GMT, Johan Sjölen wrote: >>> Sure, the allocator isn't responsible for seeding the nodes. It was simply convenient that the class which does allocation also happens to hold the seed, so I could have them do both together. >>> >>> I came to the opposite conclusion than you did: I didn't want to have a global shared state seed as that makes me have to think about the behavior of the RNG in the presence of multiple threads and instances of the treap. >>> >> >> Yeah sure, but now you get collisions if you merge two trees together that both have had nodes added before the merge. Because both Treaps will have followed the same RNG sequence - after all, AFAICS your seed is always the same. >> >> Either use a global seed, or initialize each Treap seed randomly. The former needs CAS, the latter needs an os::random call on each constructor. Pick your poison :) I also think you do not need the seed constructor argument if you do that. > >> Yeah sure, but now you get collisions if you merge two trees together that both have had nodes added before the merge. Because both Treaps will have followed the same RNG sequence - after all, AFAICS your seed is always the same. > > Alright, yeah, that's a problem if we make that a supported API. `os::random` for each `Treap` constructor and then saving that as the initial seed seems reasonable to me and like a clear improvement. > >>I also think you do not need the seed constructor argument if you do that.
> > That's true, it might be nice to have a seed constructor argument if you want reproducible tests for example. Cool for me. Feel free to close this conversation :) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1585248543 From stuefe at openjdk.org Tue Apr 30 17:37:16 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 30 Apr 2024 17:37:16 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v54] In-Reply-To: References: <0z8n2nEWkKvUSaSN_UwDFykvB5xENEVGDfr0p4_SKw8=.5731dc50-c0cf-48e3-9f74-28db684a4ebf@github.com> <6zJC0o26l14DRheFDsjmnUnuytxe-aEEz8mOFrCTk1o=.3ab3f0ef-631d-4fc6-9eee-d7d210f11ea7@github.com> Message-ID: On Tue, 30 Apr 2024 17:11:35 GMT, Gerard Ziemski wrote: >> Googling for `IntervalTree` returns useful links. I like it. > > If the class is general enough, then perhaps it should be moved into `shared/utilities` so others can use it as well? A candidate for a follow up later? Yea, at least long term. For me its fine in another RFE too, or if we see a second use for this class. (One possibility I mentioned to StefanK recently was using this class to track zgc memory pages, which is currently done with linked lists). Up to you, @jdksjolen ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1585247218 From stuefe at openjdk.org Tue Apr 30 17:48:10 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 30 Apr 2024 17:48:10 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v54] In-Reply-To: <8tX-E6rhvM3r0MhHkAmoCaxzyUrQ6ohmV8UDYMdokms=.77f5daee-f371-4ab8-ad98-337cb4fb4111@github.com> References: <8tX-E6rhvM3r0MhHkAmoCaxzyUrQ6ohmV8UDYMdokms=.77f5daee-f371-4ab8-ad98-337cb4fb4111@github.com> Message-ID: On Tue, 30 Apr 2024 09:32:27 GMT, Johan Sj?len wrote: > > Started looking at Treap. More tomorrow. > > About the Treap. 
Looking at this closer, I don't understand why you don't follow the more canonical way of implementing a treap: > > ``` > > * Insert: find lexical position for node x, then do tree rotations until the node priority is in line > > > > * Delete: find node, and as long as its not a leaf, rotate until it is. Then cut node out. > > ``` > > > > Would that not be easier to understand, and faster too? No need for recursion either. We would not need merge and split at all. All we need is insert and delete, after all, we don't need any bulk operations on nodes. > > There seems to be two common ways: > > * The merge/split thing that I did https://cp-algorithms.com/data_structures/treap.html https://www.geeksforgeeks.org/implementation-of-search-insert-and-delete-in-treap/ > > * The rotation implementation: https://yourbasic.org/algorithms/treap/ https://www.drdobbs.com/windows/treaps-in-java/184410231 https://opendatastructures.org/ods-java/7_2_Treap_Randomized_Binary.html > > > Both of these are typically implemented in a recursive manner, though the last link is iterative. I found the merge/split to be easier to understand and since the recursion is bounded on the order of `log(n)` not doing it iteratively is fine by me. > > If it's a deal breaker, I'll rewrite it iteratively using rotations. Hmm, not a deal breaker, but the recursive merge and splits gives me headaches :=) If I am not mistaken, it also seems more expensive? A remove node needs two splits and a merge, both seem to be dependent on tree depth. Removing the node via find-and-rotate-til-its-a-leaf only needs one tree traversal (first find the node, then rotate down until its a leaf). > > > Then, the Treap definitly needs an ASSERT-only verify to check consistency, and that needs to be called periodically. > > Okay, so this is irrelevant of how the treap is implemented? I guess it'd check for: > > * Non-degeneration of the depth of the tree (approximately log n) > > * Uniqueness of keys > > * Anything else?? 
No, I think that covers it. And then stomp on that thing over and over in gtests, and use this function to validate correctness. > > > > TreapNode: > > Aesthetical nits: Its a mix right now. You have getters, but then the owning Treap has private access and uses that. I see you expose the TreapNode to outside access. In that case, I would prefer if you would use getters and setters consistently, and remove the friend declaration. Style-wise, the code could be more condensed. > > I'll look into that. > > > File naming: I see we have inconsistencies. We name some files "nmtXXX", some not. All code lives inside "nmt", so the prefix could be superfluous. Just a small nit. > > Same here. I think that file names have to be unique across the whole of Hotspot (adlc and libadt being exceptions IIRC), so prepending "nmt" to datastructures seemed like a good way to ensure that. Really, the memory file tracker should still not have the `nmt` prepended regardless. > > Addressing the rest of your comments separately. > > Thanks for having another look. 
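For readers following the rotation-versus-split/merge discussion, here is a sketch of the rotation variant together with a `verify()` of the kind asked for above. It is an illustration only, not the code under review: the class layout, the LCG used for priorities and the explicit seed argument (kept so tests are reproducible) are assumptions, insertion uses recursion for brevity, and removal follows the "find the node, rotate it down until it is a leaf, then cut it out" recipe iteratively. `verify()` checks key ordering (which implies unique keys) and the heap order of priorities, but not tree depth.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>

class Treap {
  struct Node { int key; std::uint32_t prio; Node* left; Node* right; };
  Node* _root = nullptr;
  std::uint64_t _seed;

  // Simple 64-bit LCG; any decent pseudo-random source would do here.
  std::uint32_t next_prio() {
    _seed = _seed * 6364136223846793005ULL + 1442695040888963407ULL;
    return static_cast<std::uint32_t>(_seed >> 32);
  }
  static Node* rotate_right(Node* n) {  // left child becomes the subtree root
    Node* l = n->left; n->left = l->right; l->right = n; return l;
  }
  static Node* rotate_left(Node* n) {   // right child becomes the subtree root
    Node* r = n->right; n->right = r->left; r->left = n; return r;
  }
  Node* insert_at(Node* n, int key) {
    if (n == nullptr) return new Node{key, next_prio(), nullptr, nullptr};
    if (key < n->key) {
      n->left = insert_at(n->left, key);
      if (n->left->prio > n->prio) n = rotate_right(n);   // restore heap order
    } else if (key > n->key) {
      n->right = insert_at(n->right, key);
      if (n->right->prio > n->prio) n = rotate_left(n);
    }
    return n;  // duplicate keys are ignored
  }
  static bool verify_at(const Node* n, long long lo, long long hi) {
    if (n == nullptr) return true;
    if (n->key <= lo || n->key >= hi) return false;               // BST order, unique keys
    if (n->left && n->left->prio > n->prio) return false;         // heap order
    if (n->right && n->right->prio > n->prio) return false;
    return verify_at(n->left, lo, n->key) && verify_at(n->right, n->key, hi);
  }
  static std::size_t count(const Node* n) {
    return n ? 1 + count(n->left) + count(n->right) : 0;
  }
  static void destroy(Node* n) {
    if (n) { destroy(n->left); destroy(n->right); delete n; }
  }

 public:
  explicit Treap(std::uint64_t seed) : _seed(seed) {}
  ~Treap() { destroy(_root); }
  Treap(const Treap&) = delete;
  Treap& operator=(const Treap&) = delete;

  void insert(int key) { _root = insert_at(_root, key); }

  bool remove(int key) {
    Node** link = &_root;  // the pointer that currently refers to the node
    while (*link != nullptr && (*link)->key != key) {
      link = key < (*link)->key ? &(*link)->left : &(*link)->right;
    }
    if (*link == nullptr) return false;
    Node* victim = *link;
    while (victim->left != nullptr || victim->right != nullptr) {
      // Rotate the higher-priority child up so heap order holds above victim.
      if (victim->right == nullptr ||
          (victim->left != nullptr && victim->left->prio > victim->right->prio)) {
        *link = rotate_right(victim);
        link = &(*link)->right;
      } else {
        *link = rotate_left(victim);
        link = &(*link)->left;
      }
    }
    *link = nullptr;  // victim is a leaf now: cut it out
    delete victim;
    return true;
  }

  bool verify() const {
    return verify_at(_root, static_cast<long long>(INT_MIN) - 1,
                     static_cast<long long>(INT_MAX) + 1);
  }
  std::size_t size() const { return count(_root); }
};
```

A gtest-style stress loop would insert and remove keys in bulk and call `verify()` after each batch; removal here needs a single descent from the root, which is the cost argument made above.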
Sure thing ------------- PR Comment: https://git.openjdk.org/jdk/pull/18289#issuecomment-2086221922 From kbarrett at openjdk.org Tue Apr 30 18:15:57 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 30 Apr 2024 18:15:57 GMT Subject: RFR: 8331352: error: template-id not allowed for constructor/destructor in C++20 In-Reply-To: References: Message-ID: <_cyaf8AReHZbLvVCggX4Or58Vy6jzmDA6euy77VTh1s=.dd6859be-efcc-4c21-ac27-47e22ef28658@github.com> On Tue, 30 Apr 2024 02:01:01 GMT, Jan Kratochvil wrote: > When compiling trunk (819f3d6fc70ff6fe54ac5f9033c17c3dd4326aa5 2024-04-29) by gcc-14.0.1-0.15.fc40.x86_64 there are many errors: > > In file included from src/hotspot/share/memory/allocation.hpp:30, > from src/hotspot/share/ci/ciBaseObject.hpp:29, > from src/hotspot/share/ci/ciMetadata.hpp:28, > from src/hotspot/share/ci/ciType.hpp:28, > from src/hotspot/share/ci/ciKlass.hpp:28, > from src/hotspot/share/ci/ciArrayKlass.hpp:28, > from src/hotspot/share/ci/ciArray.hpp:28, > from src/hotspot/share/ci/compilerInterface.hpp:28, > from src/hotspot/share/compiler/abstractCompiler.hpp:28, > from src/hotspot/share/compiler/abstractCompiler.cpp:25: > src/hotspot/share/utilities/linkedlist.hpp:85:15: error: template-id not allowed for constructor in C++20 [-Werror=template-id-cdtor] > 85 | NONCOPYABLE(LinkedList); > | ^~~~~~~~~~~~~ > src/hotspot/share/utilities/globalDefinitions.hpp:87:26: note: in definition of macro ‘NONCOPYABLE’ > 87 | #define NONCOPYABLE(C) C(C const&) = delete; C& operator=(C const&) = delete /* next token must be ; */ > | ^ > src/hotspot/share/utilities/linkedlist.hpp:85:15: note: remove the ‘< >’ > 85 | NONCOPYABLE(LinkedList); > | ^~~~~~~~~~~~~ > src/hotspot/share/utilities/globalDefinitions.hpp:87:26: note: in definition of macro ‘NONCOPYABLE’ 
> 87 | #define NONCOPYABLE(C) C(C const&) = delete; C& operator=(C const&) = delete /* next token must be ; */ > | ^ > > In file included from src/hotspot/share/gc/z/zGranuleMap.inline.hpp:30, > from src/hotspot/share/gc/z/zForwardingTable.inline.hpp:32, > from src/hotspot/share/gc/z/zHeap.inline.hpp:30, > from src/hotspot/share/gc/z/zGeneration.inline.hpp:30, > from src/hotspot/share/gc/z/zBarrier.inline.hpp:30, > from src/hotspot/share/gc/z/zBarrierSet.inline.hpp:31, > from src/hotspot/share/gc/shared/barrierSetConfig.inline.hpp:44, > from src/hotspot/share/oops/access.inline.hpp:31, > from src/hotspot/share/memory/iterator.inline.hpp:32, > from src/hotspot/share/oops/oop.inline.hpp:31, > from src/hotspot/share/compiler/abstractDisassembler.cpp:32: > src/hotspot/share/gc/z/zArray.inline.hpp:99:21: error: template-id not allowed f... > /integrate HotSpot changes generally require two reviewers rather than the default one reviewer requirement. https://openjdk.org/guide/#hotspot-development Skara doesn't know that, so will prematurely mark a HotSpot change as ready. ------------- PR Comment: https://git.openjdk.org/jdk/pull/19009#issuecomment-2086373829 From amenkov at openjdk.org Tue Apr 30 19:03:53 2024 From: amenkov at openjdk.org (Alex Menkov) Date: Tue, 30 Apr 2024 19:03:53 GMT Subject: RFR: 8330969: scalability issue with loaded JVMTI agent [v3] In-Reply-To: References: Message-ID: On Tue, 30 Apr 2024 01:56:13 GMT, Serguei Spitsyn wrote: >> This is a fix of the following JVMTI scalability issue. A closed benchmark with millions of virtual threads shows 3X-4X overhead when a JVMTI agent has been loaded. For instance, this is observable when an app is executed under control of the Oracle Studio `collect` utility. >> For performance analysis, experiments and numbers, please, see the comment below this description. >> >> The fix is to replace the global counter `_VTMS_transition_count` with the mark bit `_VTMS_transition_mark` in each `JavaThread`. 
>> >> Testing: >> - Tested with mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: correct comments related to VTMS transition counters Marked as reviewed by amenkov (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18937#pullrequestreview-2032255537 From cjplummer at openjdk.org Tue Apr 30 19:34:53 2024 From: cjplummer at openjdk.org (Chris Plummer) Date: Tue, 30 Apr 2024 19:34:53 GMT Subject: RFR: 8330969: scalability issue with loaded JVMTI agent [v3] In-Reply-To: References: Message-ID: On Tue, 30 Apr 2024 01:56:13 GMT, Serguei Spitsyn wrote: >> This is a fix of the following JVMTI scalability issue. A closed benchmark with millions of virtual threads shows 3X-4X overhead when a JVMTI agent has been loaded. For instance, this is observable when an app is executed under control of the Oracle Studio `collect` utility. >> For performance analysis, experiments and numbers, please, see the comment below this description. >> >> The fix is to replace the global counter `_VTMS_transition_count` with the mark bit `_VTMS_transition_mark` in each `JavaThread`. >> >> Testing: >> - Tested with mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: correct comments related to VTMS transition counters Marked as reviewed by cjplummer (Reviewer). 
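As a rough illustration of the idea in the PR description (one shared counter replaced by a mark in each thread), the sketch below contrasts the two shapes. It is not HotSpot code: `GlobalCounter`, `PerThreadMarks`, `ThreadState` and all member names are hypothetical, threads are identified by a plain index, and how the real `_VTMS_transition_mark` is read and synchronized is not shown in this thread and is not modeled here.

```cpp
#include <atomic>
#include <cstddef>
#include <vector>

// Shape 1 (hypothetical): every transition updates one shared atomic,
// so all threads contend on the same memory location.
struct GlobalCounter {
  std::atomic<int> count{0};
  void start_transition() { count.fetch_add(1); }
  void finish_transition() { count.fetch_sub(1); }
  bool any_in_transition() const { return count.load() != 0; }
};

// Shape 2 (hypothetical): each thread owns its flag. The frequent
// path writes only thread-local state; the rare observer scans.
struct ThreadState {
  std::atomic<bool> in_transition{false};
};

struct PerThreadMarks {
  std::vector<ThreadState> threads;
  explicit PerThreadMarks(std::size_t n) : threads(n) {}
  void start_transition(std::size_t tid) { threads[tid].in_transition.store(true); }
  void finish_transition(std::size_t tid) { threads[tid].in_transition.store(false); }
  bool any_in_transition() const {
    for (const ThreadState& t : threads) {
      if (t.in_transition.load()) return true;
    }
    return false;
  }
};
```

The trade-off this is meant to show: the per-thread variant makes the question "is anyone in a transition?" more expensive to answer, in exchange for removing a write that every thread would otherwise perform on shared state.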
------------- PR Review: https://git.openjdk.org/jdk/pull/18937#pullrequestreview-2032369700 From amenkov at openjdk.org Tue Apr 30 23:25:04 2024 From: amenkov at openjdk.org (Alex Menkov) Date: Tue, 30 Apr 2024 23:25:04 GMT Subject: RFR: 8330852: All callers of JvmtiEnvBase::get_threadOop_and_JavaThread should pass current thread explicitly [v2] In-Reply-To: References: Message-ID: > Some cleanup related to JvmtiEnvBase::get_threadOop_and_JavaThread method > > Testing: tier1-6 Alex Menkov has updated the pull request incrementally with one additional commit since the last revision: renamed current_thread to current ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18986/files - new: https://git.openjdk.org/jdk/pull/18986/files/f472f669..d5d614bc Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18986&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18986&range=00-01 Stats: 131 lines in 2 files changed: 0 ins; 1 del; 130 mod Patch: https://git.openjdk.org/jdk/pull/18986.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18986/head:pull/18986 PR: https://git.openjdk.org/jdk/pull/18986 From amenkov at openjdk.org Tue Apr 30 23:48:02 2024 From: amenkov at openjdk.org (Alex Menkov) Date: Tue, 30 Apr 2024 23:48:02 GMT Subject: RFR: 8330852: All callers of JvmtiEnvBase::get_threadOop_and_JavaThread should pass current thread explicitly [v3] In-Reply-To: References: Message-ID: > Some cleanup related to JvmtiEnvBase::get_threadOop_and_JavaThread method > > Testing: tier1-6 Alex Menkov has updated the pull request incrementally with one additional commit since the last revision: renamed current_thread to current ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18986/files - new: https://git.openjdk.org/jdk/pull/18986/files/d5d614bc..46026322 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18986&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18986&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 
del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18986.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18986/head:pull/18986 PR: https://git.openjdk.org/jdk/pull/18986