From kbarrett at openjdk.org Fri Mar 1 00:39:51 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 1 Mar 2024 00:39:51 GMT Subject: RFR: JDK-8326974: ODR violation in macroAssembler_aarch64.cpp In-Reply-To: References: <750f3_7W0J6J2z0XbgDhrGgOxOrUtWkVr4Clqg_L2yM=.a298711c-02cc-4998-8da0-97e012549e5d@github.com> Message-ID: On Thu, 29 Feb 2024 15:41:04 GMT, Andrew Haley wrote: > > > I think the patch suggested in bug was better, and in line with [Hotspot Style Guide](https://github.com/openjdk/jdk/blob/master/doc/hotspot-style.md), which says: "Global names must be unique, to avoid [One Definition Rule](https://en.cppreference.com/w/cpp/language/definition) (ODR) violations. A common prefixing scheme for related global names is often used. (This is instead of using namespaces, which are mostly avoided in HotSpot.)" > > > > > > See also the section "Namespaces", and in particular the specific admonition against using anonymous namespaces. > > Mmm, that's a shame. It seems to me that an anonymous namespace is best here, given that the classes should not be visible outside this translation unit, but implementation bugs in debuggers and linkers are a show stopper. I completely agree :( My experience on a different project was that anonymous namespaces had some significant benefits, including potentially having such classes treated as implicitly final, with the optimizations that can come from that. I haven't looked at how well "modern" debugger versions cope with anonymous namespaces. It seems like problems there should have been fixed by now - anonymous namespaces are hardly new. I was surprised that the usage proposed here ran into linker problems though. It would be nice if someone with good windows access could investigate that. 
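For readers following the thread, here is a minimal sketch of the two options being weighed. It is an illustration only, not code from the patch: `MacroAsmDecoder`, `Decoder`, and both helper functions are hypothetical names. The first class follows the style guide's prefixing scheme and has external linkage, so its name must be unique across the whole program. The second sits in an anonymous namespace, which gives it internal linkage and so rules out an ODR clash with a same-named class in another translation unit.

```cpp
#include <cassert>

// Option A: prefixing scheme. The class has external linkage, so the
// (hypothetical) prefix is what keeps the name unique program-wide.
class MacroAsmDecoder {
 public:
  int low_byte(int insn) const {
    assert(insn >= 0);
    return insn & 0xff;
  }
};

// Option B: anonymous namespace. The class has internal linkage; another
// translation unit may define its own unrelated `Decoder` without conflict.
namespace {
class Decoder {
 public:
  int high_bits(int insn) const {
    assert(insn >= 0);
    return insn >> 8;
  }
};
}  // namespace

// Hypothetical entry points, visible to other translation units.
int decode_low(int insn)  { return MacroAsmDecoder().low_byte(insn); }
int decode_high(int insn) { return Decoder().high_bits(insn); }
```

Both variants behave the same at run time. The difference is only in which names the linker and debugger get to see, which is where the tooling problems mentioned above would show up.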
------------- PR Comment: https://git.openjdk.org/jdk/pull/18056#issuecomment-1972222455 From gli at openjdk.org Fri Mar 1 00:57:53 2024 From: gli at openjdk.org (Guoxiong Li) Date: Fri, 1 Mar 2024 00:57:53 GMT Subject: RFR: JDK-8326974: ODR violation in macroAssembler_aarch64.cpp [v3] In-Reply-To: References: Message-ID: On Thu, 29 Feb 2024 17:16:04 GMT, Andrew Haley wrote: >> This is a slightly different patch from the one suggested by the bug reporter. It doesn't make any sense to export those classes globally, so I've given them internal linkage. > > Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: > > JDK-8326974: ODR violation in macroAssembler_aarch64.cpp LGTM. ------------- Marked as reviewed by gli (Committer). PR Review: https://git.openjdk.org/jdk/pull/18056#pullrequestreview-1910128702 From jwaters at openjdk.org Fri Mar 1 01:34:43 2024 From: jwaters at openjdk.org (Julian Waters) Date: Fri, 1 Mar 2024 01:34:43 GMT Subject: RFR: JDK-8326974: ODR violation in macroAssembler_aarch64.cpp In-Reply-To: References: <750f3_7W0J6J2z0XbgDhrGgOxOrUtWkVr4Clqg_L2yM=.a298711c-02cc-4998-8da0-97e012549e5d@github.com> Message-ID: On Fri, 1 Mar 2024 00:37:33 GMT, Kim Barrett wrote: > > > > I think the patch suggested in bug was better, and in line with [Hotspot Style Guide](https://github.com/openjdk/jdk/blob/master/doc/hotspot-style.md), which says: "Global names must be unique, to avoid [One Definition Rule](https://en.cppreference.com/w/cpp/language/definition) (ODR) violations. A common prefixing scheme for related global names is often used. (This is instead of using namespaces, which are mostly avoided in HotSpot.)" > > > > > > > > > See also the section "Namespaces", and in particular the specific admonition against using anonymous namespaces. > > > > > > Mmm, that's a shame. 
It seems to me that an anonymous namespace is best here, given that the classes should not be visible outside this translation unit, but implementation bugs in debuggers and linkers are a show stopper. > > I completely agree :( My experience on a different project was that anonymous namespaces had some significant benefits, including potentially having such classes treated as implicitly final, with the optimizations that can come from that. > > I haven't looked at how well "modern" debugger versions cope with anonymous namespaces. It seems like problems there should have been fixed by now - anonymous namespaces are hardly new. I was surprised that the usage proposed here ran into linker problems though. It would be nice if someone with good windows access could investigate that. It seemed to me that macOS ARM64 also experienced failures earlier, not just Windows, unless the test failure is unrelated? (I'm not very familiar with the JDK's testing framework) I could help look into it, since I primarily develop on Windows, but I'd probably need time since I'm a little busy at the moment. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18056#issuecomment-1972284505 From dlong at openjdk.org Fri Mar 1 02:38:52 2024 From: dlong at openjdk.org (Dean Long) Date: Fri, 1 Mar 2024 02:38:52 GMT Subject: RFR: 8325095: C2: bailout message broken: ResourceArea allocated string used after free [v5] In-Reply-To: References: <8Bl6JSxb-fd1xAsKOMzBakRFuruoqJykP7JEn57k6qY=.717f89af-2aa7-439b-afd1-768dd39f51da@github.com> Message-ID: On Fri, 23 Feb 2024 15:43:07 GMT, Emanuel Peter wrote: >> **Problem** >> Bailout `failure_reason` strings used to be either `nullptr` or a static string. But with [JDK-8303951](https://bugs.openjdk.org/browse/JDK-8303951) I had introduced dynamic strings (e.g. formatted using `stringStream::as_string()`). These dynamic strings were ResourceArea allocated, and have a very limited life-span, i.e. within the most recent ResourceMark scope. 
The pointer is passed to `Compile::_failure_reason`, which might already leave that ResourceMark scope, and the pointer is invalid. And then the pointer is further passed to `ciEnv::_failure_reason`. And finally, it even gets passed up to `CompileBroker::invoke_compiler_on_method`. When the string is finally printed for the bailout message, we have already left the original `ResourceMark` scope, and also the `Compile` and `ciEnv` scopes. The pointer now points to basically random data. >> >> **Solution** >> Whenever a string is passed to an outer scope, I make a CHeap copy, and the outer scope is the owner of that copy. >> This way every scope is the owner of its own copy, and can allocate and deallocate according to its own strategy safely. >> >> I introduced a utility class `CHeapStringHolder`: >> - `set`: make local copy on CHeap. >> - `clear`: free local copy. We clear before `set`, and in the destructor. Thus, when the holder goes out of scope, the memory is automatically freed. >> >> We have these 4 scopes: >> - `ResourceMark`: It allocates from `ResourceArea`, and deallocation is taken care of at the end of the ResourceMark scope. >> - `Compile`: We turn the `_failure_reason` from a `char*` into a `CHeapStringHolder`. We set `_failure_reason` while the `ResourceMark` is live. >> - `ciEnv`: We turn the `_failure_reason` from a `char*` into a `CHeapStringHolder`. We set `_failure_reason` while `Compile` is live. >> - `CompileTask`: We used to just set `failure_reason`, which assumes that the string is static or elsewhere managed. Now I use the pattern available in other places of `CompileBroker::invoke_compiler_on_method`: duplicate the string with `os::strdup` from some other scope, and set `reason_on_C_heap = true`, which means that we are the owner of that copy, and are responsible for freeing it later. Not sure if that is a nice pattern, but I don't want to refactor that code and it does what I need it to do. 
> > Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: > > include precompiled.hpp, otherwise windows build fails Marked as reviewed by dlong (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/17710#pullrequestreview-1910218084 From fyang at openjdk.org Fri Mar 1 02:45:56 2024 From: fyang at openjdk.org (Fei Yang) Date: Fri, 1 Mar 2024 02:45:56 GMT Subject: RFR: 8327058: RISC-V: make Zcb experimental In-Reply-To: <-vwMZiS-BUC2ZR2u5eMlnC9lkEghpJebOJs8TNxik8Q=.851c72e9-34ac-4bb3-b0bc-c4421da7672e@github.com> References: <-vwMZiS-BUC2ZR2u5eMlnC9lkEghpJebOJs8TNxik8Q=.851c72e9-34ac-4bb3-b0bc-c4421da7672e@github.com> Message-ID: On Thu, 29 Feb 2024 17:31:52 GMT, Hamlin Li wrote: > Hi, > Can you review the patch to add a flag for Zcb extension to enable user to control it? > Thanks. > > As discussed in [1], it's good to give a flag to control Zcb, and currently it's good to make it experimental. > > [1] https://github.com/openjdk/jdk/pull/17698#discussion_r1505364098 Thanks. src/hotspot/cpu/riscv/globals_riscv.hpp line 108: > 106: product(bool, UseZbb, false, "Use Zbb instructions") \ > 107: product(bool, UseZbs, false, "Use Zbs instructions") \ > 108: product(bool, UseZcb, false, EXPERIMENTAL, "Use Zcb instructions") \ Nit: Can you move this after the line for `UseZfh`. I would like to group EXPERIMENTAL options together. ------------- Marked as reviewed by fyang (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18070#pullrequestreview-1910221663 PR Review Comment: https://git.openjdk.org/jdk/pull/18070#discussion_r1508416995 From jbhateja at openjdk.org Fri Mar 1 06:01:45 2024 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Fri, 1 Mar 2024 06:01:45 GMT Subject: RFR: 8325991: Accelerate Poly1305 on x86_64 using AVX2 instructions [v10] In-Reply-To: <9_AS9fKLjyVQzybg1915BX3xVVFwJRKjsqghjZ99hr0=.13a55d69-fc86-4109-a3be-934d1b8634d0@github.com> References: <9_AS9fKLjyVQzybg1915BX3xVVFwJRKjsqghjZ99hr0=.13a55d69-fc86-4109-a3be-934d1b8634d0@github.com> Message-ID: On Tue, 27 Feb 2024 21:13:07 GMT, Srinivas Vamsi Parasa wrote: >> The goal of this PR is to accelerate the Poly1305 algorithm using AVX2 instructions (including IFMA) for x86_64 CPUs. >> >> This implementation is directly based on the AVX2 Poly1305 hash computation as implemented in Intel(R) Multi-Buffer Crypto for IPsec Library (url: https://github.com/intel/intel-ipsec-mb/blob/main/lib/avx2_t3/poly_fma_avx2.asm) >> >> This PR shows up to 19x speedup on buffer sizes of 1MB. > > Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: > > Update description of Poly1305 algo src/hotspot/cpu/x86/assembler_x86.cpp line 5146: > 5144: > 5145: void Assembler::vpmadd52luq(XMMRegister dst, XMMRegister src1, Address src2, int vector_len) { > 5146: assert(vector_len == AVX_512bit ? VM_Version::supports_avx512ifma() : VM_Version::supports_avxifma(), ""); What if the vector length is 128 bits and the target does not support AVX_IFMA? AVX512_IFMA + AVX512_VL should still be sufficient to execute 52-bit MACs. src/hotspot/cpu/x86/assembler_x86.cpp line 5181: > 5179: > 5180: void Assembler::vpmadd52huq(XMMRegister dst, XMMRegister src1, Address src2, int vector_len) { > 5181: assert(vector_len == AVX_512bit ? 
VM_Version::supports_avx512ifma() : VM_Version::supports_avxifma(), ""); What if the vector length is 128 bits and the target does not support AVX_IFMA? AVX512_IFMA + AVX512_VL should still be sufficient to execute 52-bit MACs. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17881#discussion_r1508515255 PR Review Comment: https://git.openjdk.org/jdk/pull/17881#discussion_r1508514777 From jbhateja at openjdk.org Fri Mar 1 06:07:56 2024 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Fri, 1 Mar 2024 06:07:56 GMT Subject: RFR: 8325991: Accelerate Poly1305 on x86_64 using AVX2 instructions [v10] In-Reply-To: <9_AS9fKLjyVQzybg1915BX3xVVFwJRKjsqghjZ99hr0=.13a55d69-fc86-4109-a3be-934d1b8634d0@github.com> References: <9_AS9fKLjyVQzybg1915BX3xVVFwJRKjsqghjZ99hr0=.13a55d69-fc86-4109-a3be-934d1b8634d0@github.com> Message-ID: On Tue, 27 Feb 2024 21:13:07 GMT, Srinivas Vamsi Parasa wrote: >> The goal of this PR is to accelerate the Poly1305 algorithm using AVX2 instructions (including IFMA) for x86_64 CPUs. >> >> This implementation is directly based on the AVX2 Poly1305 hash computation as implemented in Intel(R) Multi-Buffer Crypto for IPsec Library (url: https://github.com/intel/intel-ipsec-mb/blob/main/lib/avx2_t3/poly_fma_avx2.asm) >> >> This PR shows up to 19x speedup on buffer sizes of 1MB. > > Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: > > Update description of Poly1305 algo src/hotspot/cpu/x86/assembler_x86.cpp line 5156: > 5154: > 5155: void Assembler::vpmadd52luq(XMMRegister dst, XMMRegister src1, XMMRegister src2, int vector_len) { > 5156: assert(vector_len == AVX_512bit ? VM_Version::supports_avx512ifma() : VM_Version::supports_avxifma(), ""); What if the vector length is 128 bits and the target does not support AVX_IFMA? AVX512_IFMA + AVX512_VL should still be sufficient to execute 52-bit MACs. Please add appropriate assertions to explicitly check AVX512VL. 
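The condition being asked for can be sketched as a standalone predicate. This is only an illustration of the logic in the comment above, under the assumption that a 128- or 256-bit `vpmadd52` is encodable either with AVX_IFMA alone or with AVX512_IFMA plus AVX512_VL. `VectorLen`, `CpuFeatures`, and `can_emit_vpmadd52` are hypothetical stand-ins, not HotSpot's `VM_Version` API.

```cpp
#include <cassert>

// Hypothetical stand-in for the assembler's vector length constants.
enum VectorLen { AVX_128bit = 0, AVX_256bit = 1, AVX_512bit = 2 };

// Hypothetical stand-in for the CPU feature queries used in the assertions.
struct CpuFeatures {
  bool avx_ifma;     // IFMA without AVX-512 (128/256-bit only)
  bool avx512_ifma;  // IFMA under AVX-512
  bool avx512_vl;    // AVX-512 instructions at 128/256-bit vector lengths
};

// True if a 52-bit multiply-add of the given vector length may be emitted.
bool can_emit_vpmadd52(const CpuFeatures& cpu, VectorLen len) {
  assert(len == AVX_128bit || len == AVX_256bit || len == AVX_512bit);
  if (len == AVX_512bit) {
    return cpu.avx512_ifma;
  }
  // Narrower vectors: either AVX_IFMA, or AVX512_IFMA together with AVX512_VL.
  return cpu.avx_ifma || (cpu.avx512_ifma && cpu.avx512_vl);
}
```

With this shape, a target that has AVX512_IFMA and AVX512_VL but lacks AVX_IFMA passes for 128-bit vectors, which is the case the current assertion would reject.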
src/hotspot/cpu/x86/assembler_x86.cpp line 5191: > 5189: > 5190: void Assembler::vpmadd52huq(XMMRegister dst, XMMRegister src1, XMMRegister src2, int vector_len) { > 5191: assert(vector_len == AVX_512bit ? VM_Version::supports_avx512ifma() : VM_Version::supports_avxifma(), ""); Same as above. src/hotspot/cpu/x86/assembler_x86.cpp line 9101: > 9099: > 9100: void Assembler::vpunpckhqdq(XMMRegister dst, XMMRegister nds, XMMRegister src, int vector_len) { > 9101: assert(UseAVX > 0, "requires some form of AVX"); Add appropriate AVX512VL assertion. src/hotspot/cpu/x86/assembler_x86.cpp line 9115: > 9113: > 9114: void Assembler::vpunpcklqdq(XMMRegister dst, XMMRegister nds, XMMRegister src, int vector_len) { > 9115: assert(UseAVX > 0, "requires some form of AVX"); Add appropriate AVX512VL assertion ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17881#discussion_r1508516820 PR Review Comment: https://git.openjdk.org/jdk/pull/17881#discussion_r1508518721 PR Review Comment: https://git.openjdk.org/jdk/pull/17881#discussion_r1508517680 PR Review Comment: https://git.openjdk.org/jdk/pull/17881#discussion_r1508517933 From jbhateja at openjdk.org Fri Mar 1 06:11:56 2024 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Fri, 1 Mar 2024 06:11:56 GMT Subject: RFR: 8325991: Accelerate Poly1305 on x86_64 using AVX2 instructions [v10] In-Reply-To: <9_AS9fKLjyVQzybg1915BX3xVVFwJRKjsqghjZ99hr0=.13a55d69-fc86-4109-a3be-934d1b8634d0@github.com> References: <9_AS9fKLjyVQzybg1915BX3xVVFwJRKjsqghjZ99hr0=.13a55d69-fc86-4109-a3be-934d1b8634d0@github.com> Message-ID: On Tue, 27 Feb 2024 21:13:07 GMT, Srinivas Vamsi Parasa wrote: >> The goal of this PR is to accelerate the Poly1305 algorithm using AVX2 instructions (including IFMA) for x86_64 CPUs. 
>> >> This implementation is directly based on the AVX2 Poly1305 hash computation as implemented in Intel(R) Multi-Buffer Crypto for IPsec Library (url: https://github.com/intel/intel-ipsec-mb/blob/main/lib/avx2_t3/poly_fma_avx2.asm) >> >> This PR shows up to 19x speedup on buffer sizes of 1MB. > > Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: > > Update description of Poly1305 algo Changes requested by jbhateja (Reviewer). src/hotspot/cpu/x86/assembler_x86.cpp line 9115: > 9113: > 9114: void Assembler::vpunpcklqdq(XMMRegister dst, XMMRegister nds, XMMRegister src, int vector_len) { > 9115: assert(UseAVX > 0, "requires some form of AVX"); Add appropriate AVX512VL assertion ------------- PR Review: https://git.openjdk.org/jdk/pull/17881#pullrequestreview-1910377861 PR Review Comment: https://git.openjdk.org/jdk/pull/17881#discussion_r1508520924 From epeter at openjdk.org Fri Mar 1 06:12:12 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Fri, 1 Mar 2024 06:12:12 GMT Subject: RFR: 8325095: C2: bailout message broken: ResourceArea allocated string used after free [v6] In-Reply-To: <8Bl6JSxb-fd1xAsKOMzBakRFuruoqJykP7JEn57k6qY=.717f89af-2aa7-439b-afd1-768dd39f51da@github.com> References: <8Bl6JSxb-fd1xAsKOMzBakRFuruoqJykP7JEn57k6qY=.717f89af-2aa7-439b-afd1-768dd39f51da@github.com> Message-ID: > **Problem** > Bailout `failure_reason` strings used to be either `nullptr` or a static string. But with [JDK-8303951](https://bugs.openjdk.org/browse/JDK-8303951) I had introduced dynamic strings (e.g. formatted using `stringStream::as_string()`). These dynamic strings were ResourceArea allocated, and have a very limited life-span, i.e. within the most recent ResourceMark scope. The pointer is passed to `Compile::_failure_reason`, which might already leave that ResourceMark scope, and the pointer is invalid. And then the pointer is further passed to `ciEnv::_failure_reason`. 
And finally, it even gets passed up to `CompileBroker::invoke_compiler_on_method`. When the string is finally printed for the bailout message, we have already left the original `ResourceMark` scope, and also the `Compile` and `ciEnv` scopes. The pointer now points to basically random data. > > **Solution** > Whenever a string is passed to an outer scope, I make a CHeap copy, and the outer scope is the owner of that copy. > This way every scope is the owner of its own copy, and can allocate and deallocate according to its own strategy safely. > > I introduced a utility class `CHeapStringHolder`: > - `set`: make local copy on CHeap. > - `clear`: free local copy. We clear before `set`, and in the destructor. Thus, when the holder goes out of scope, the memory is automatically freed. > > We have these 4 scopes: > - `ResourceMark`: It allocates from `ResourceArea`, and deallocation is taken care of at the end of the ResourceMark scope. > - `Compile`: We turn the `_failure_reason` from a `char*` into a `CHeapStringHolder`. We set `_failure_reason` while the `ResourceMark` is live. > - `ciEnv`: We turn the `_failure_reason` from a `char*` into a `CHeapStringHolder`. We set `_failure_reason` while `Compile` is live. > - `CompileTask`: We used to just set `failure_reason`, which assumes that the string is static or elsewhere managed. Now I use the pattern available in other places of `CompileBroker::invoke_compiler_on_method`: duplicate the string with `os::strdup` from some other scope, and set `reason_on_C_heap = true`, which means that we are the owner of that copy, and are responsible for freeing it later. Not sure if that is a nice pattern, but I don't want to refactor that code and it does what I need it to do. 
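The ownership scheme described above can be sketched roughly as follows. This is not the class from the patch: `StringHolderSketch` and `record_failure` are hypothetical names, and plain `malloc`/`free` stand in for HotSpot's C-heap allocation routines. The point is only that each holder owns a private copy, frees it on `clear` and in the destructor, and therefore stays valid after the buffer it was copied from is gone.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for the holder described above.
class StringHolderSketch {
  char* _string = nullptr;

 public:
  StringHolderSketch() = default;
  // Non-copyable: exactly one owner per heap copy.
  StringHolderSketch(const StringHolderSketch&) = delete;
  StringHolderSketch& operator=(const StringHolderSketch&) = delete;
  ~StringHolderSketch() { clear(); }

  // Take a private heap copy of `s`, releasing any previous copy.
  void set(const char* s) {
    char* copy = nullptr;
    if (s != nullptr) {
      std::size_t len = std::strlen(s) + 1;
      copy = static_cast<char*>(std::malloc(len));
      assert(copy != nullptr);
      std::memcpy(copy, s, len);
    }
    clear();  // free the old copy only after the new one exists
    _string = copy;
  }

  // Free the held copy, if any.
  void clear() {
    std::free(_string);
    _string = nullptr;
  }

  const char* get() const { return _string; }
};

// Simulates an inner scope whose buffer is reused once the scope ends.
void record_failure(StringHolderSketch& outer) {
  char scratch[32];
  std::strcpy(scratch, "bailout: too many nodes");
  outer.set(scratch);
  std::memset(scratch, 'X', sizeof(scratch) - 1);  // inner storage is clobbered
}
```

After `record_failure` returns, the holder in the outer scope still reads back the original message, because it never kept a pointer into the inner buffer.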
Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: remove unnecessary include ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17710/files - new: https://git.openjdk.org/jdk/pull/17710/files/894a642c..7cf5c20d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17710&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17710&range=04-05 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/17710.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17710/head:pull/17710 PR: https://git.openjdk.org/jdk/pull/17710 From epeter at openjdk.org Fri Mar 1 06:12:12 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Fri, 1 Mar 2024 06:12:12 GMT Subject: RFR: 8325095: C2: bailout message broken: ResourceArea allocated string used after free [v5] In-Reply-To: References: <8Bl6JSxb-fd1xAsKOMzBakRFuruoqJykP7JEn57k6qY=.717f89af-2aa7-439b-afd1-768dd39f51da@github.com> Message-ID: On Fri, 1 Mar 2024 06:05:38 GMT, David Holmes wrote: >> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: >> >> include precompiled.hpp, otherwise windows build fails > > src/hotspot/share/compiler/cHeapStringHolder.cpp line 26: > >> 24: >> 25: #include "precompiled.hpp" >> 26: #include "runtime/orderAccess.hpp" > > You don't need this include You are right! Removed it. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17710#discussion_r1508520550 From dholmes at openjdk.org Fri Mar 1 06:12:12 2024 From: dholmes at openjdk.org (David Holmes) Date: Fri, 1 Mar 2024 06:12:12 GMT Subject: RFR: 8325095: C2: bailout message broken: ResourceArea allocated string used after free [v5] In-Reply-To: References: <8Bl6JSxb-fd1xAsKOMzBakRFuruoqJykP7JEn57k6qY=.717f89af-2aa7-439b-afd1-768dd39f51da@github.com> Message-ID: On Fri, 23 Feb 2024 15:43:07 GMT, Emanuel Peter wrote: >> **Problem** >> Bailout `failure_reason` strings used to be either `nullptr` or a static string. But with [JDK-8303951](https://bugs.openjdk.org/browse/JDK-8303951) I had introduced dynamic strings (e.g. formatted using `stringStream::as_string()`). These dynamic strings were ResourceArea allocated, and have a very limited life-span, i.e. within the most recent ResourceMark scope. The pointer is passed to `Compile::_failure_reason`, which might already leave that ResourceMark scope, and the pointer is invalid. And then the pointer is further passed to `ciEnv::_failure_reason`. And finally, it even gets passed up to `CompileBroker::invoke_compiler_on_method`. When the string is finally printed for the bailout message, we have already left the original `ResourceMark` scope, and also the `Compile` and `ciEnv` scopes. The pointer now points to basically random data. >> >> **Solution** >> Whenever a string is passed to an outer scope, I make a CHeap copy, and the outer scope is the owner of that copy. >> This way every scope is the owner of its own copy, and can allocate and deallocate according to its own strategy safely. >> >> I introduced a utility class `CHeapStringHolder`: >> - `set`: make local copy on CHeap. >> - `clear`: free local copy. We clear before `set`, and in the destructor. Thus, when the holder goes out of scope, the memory is automatically freed. 
>> >> We have these 4 scopes: >> - `ResourceMark`: It allocates from `ResourceArea`, and deallocation is taken care of at the end of the ResourceMark scope. >> - `Compile`: We turn the `_failure_reason` from a `char*` into a `CHeapStringHolder`. We set `_failure_reason` while the `ResourceMark` is live. >> - `ciEnv`: We turn the `_failure_reason` from a `char*` into a `CHeapStringHolder`. We set `_failure_reason` while `Compile` is live. >> - `CompileTask`: We used to just set `failure_reason`, which assumes that the string is static or elsewhere managed. Now I use the pattern available in other places of `CompileBroker::invoke_compiler_on_method`: duplicate the string with `os::strdup` from some other scope, and set `reason_on_C_heap = true`, which means that we are the owner of that copy, and are responsible for freeing it later. Not sure if that is a nice pattern, but I don't want to refactor that code and it does what I need it to do. > > Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: > > include precompiled.hpp, otherwise windows build fails src/hotspot/share/compiler/cHeapStringHolder.cpp line 26: > 24: > 25: #include "precompiled.hpp" > 26: #include "runtime/orderAccess.hpp" You don't need this include ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17710#discussion_r1508518748 From fyang at openjdk.org Fri Mar 1 07:03:53 2024 From: fyang at openjdk.org (Fei Yang) Date: Fri, 1 Mar 2024 07:03:53 GMT Subject: RFR: 8320646: RISC-V: C2 VectorCastHF2F [v2] In-Reply-To: References: <8OQQo7cqVshZlOmFeWCLp_VrD3rizjsPDU4a1kGxokc=.75d8279c-ae2f-42fc-89e7-ccebb826e142@github.com> Message-ID: On Thu, 29 Feb 2024 15:13:52 GMT, Hamlin Li wrote: >>> That makes sense to me. In the long run, we should depend on hwprobe + vendor specific code to auto-enable or disable certain RV extensions on certain hardware. 
But I agree that there should be some options of DIAGNOSTIC kind to help diagnose issues and find workarounds. I am OK to make this option an EXPERIMENTAL one at the current stage. But we should turn them into the DIAGNOSTIC kind in the future when we have the hardware and hwprobe is capable of satisfying our needs. Mind you, we are lacking one option for https://github.com/openjdk/jdk/pull/17122 in that respect. Let me know if you are going to add one. >> >> Agreed on all of that! The lack of currently and widely available hardware with more of these extensions is what makes me want to keep them EXPERIMENTAL. But once we have validated that these work on commercially available hardware, I would indeed move them out of EXPERIMENTAL, and I agree DIAGNOSTIC makes for a fine change. >> >> Another question will be about backporting the EXPERIMENTAL -> DIAGNOSTIC. We can revisit later but I think we should be fairly proactive about that when the case arises. Do you see any issues that may come from it? Possibly we expect that's a breaking change as users would need to switch from `-XX:+UnlockExperimentalVMOptions` to `-XX:+UnlockDiagnosticVMOptions`. > >> Mind you, we are lacking one option for https://github.com/openjdk/jdk/pull/17122 in that respect. Let me know if you are going to add one. > > @RealFYang I can do that. FYI, tested on lichipi and VF2, it seems both do not support Zcb yet, tracked by https://bugs.openjdk.org/browse/JDK-8327058 > Another question will be about backporting the EXPERIMENTAL -> DIAGNOSTIC. We can revisit later but I think we should be fairly proactive about that when the case arises. Do you see any issues that may come from it? Possibly we expect that's a breaking change as users would need to switch from `-XX:+UnlockExperimentalVMOptions` to `-XX:+UnlockDiagnosticVMOptions`. I don't see a big issue as they are EXPERIMENTAL options for now which are subject to change. Users are not supposed to use them casually as they use normal product VM options. 
I am thinking that the backporting to LTS versions should happen only after they are turned into DIAGNOSTIC in JDK mainline some day. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17698#discussion_r1508555665 From jbhateja at openjdk.org Fri Mar 1 08:23:54 2024 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Fri, 1 Mar 2024 08:23:54 GMT Subject: RFR: 8325991: Accelerate Poly1305 on x86_64 using AVX2 instructions [v10] In-Reply-To: <9_AS9fKLjyVQzybg1915BX3xVVFwJRKjsqghjZ99hr0=.13a55d69-fc86-4109-a3be-934d1b8634d0@github.com> References: <9_AS9fKLjyVQzybg1915BX3xVVFwJRKjsqghjZ99hr0=.13a55d69-fc86-4109-a3be-934d1b8634d0@github.com> Message-ID: On Tue, 27 Feb 2024 21:13:07 GMT, Srinivas Vamsi Parasa wrote: >> The goal of this PR is to accelerate the Poly1305 algorithm using AVX2 instructions (including IFMA) for x86_64 CPUs. >> >> This implementation is directly based on the AVX2 Poly1305 hash computation as implemented in Intel(R) Multi-Buffer Crypto for IPsec Library (url: https://github.com/intel/intel-ipsec-mb/blob/main/lib/avx2_t3/poly_fma_avx2.asm) >> >> This PR shows up to 19x speedup on buffer sizes of 1MB. > > Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: > > Update description of Poly1305 algo Hi @vamsi-parasa, apart from the above assertion check modifications, the patch looks good to me. src/hotspot/cpu/x86/assembler_x86.cpp line 5148: > 5146: assert(vector_len == AVX_512bit ? VM_Version::supports_avx512ifma() : VM_Version::supports_avxifma(), ""); > 5147: InstructionMark im(this); > 5148: InstructionAttr attributes(vector_len, /* rex_w */ true, /* legacy_mode */ false, /* no_mask_reg */ false, /* uses_vl */ true); uses_vl should be false here. src/hotspot/cpu/x86/assembler_x86.cpp line 5157: > 5155: void Assembler::vpmadd52luq(XMMRegister dst, XMMRegister src1, XMMRegister src2, int vector_len) { > 5156: assert(vector_len == AVX_512bit ? 
VM_Version::supports_avx512ifma() : VM_Version::supports_avxifma(), ""); > 5157: InstructionAttr attributes(vector_len, /* rex_w */ true, /* legacy_mode */ false, /* no_mask_reg */ false, /* uses_vl */ true); uses_vl should be false. src/hotspot/cpu/x86/assembler_x86.cpp line 5183: > 5181: assert(vector_len == AVX_512bit ? VM_Version::supports_avx512ifma() : VM_Version::supports_avxifma(), ""); > 5182: InstructionMark im(this); > 5183: InstructionAttr attributes(vector_len, /* rex_w */ true, /* legacy_mode */ false, /* no_mask_reg */ false, /* uses_vl */ true); uses_vl should be false. ------------- Changes requested by jbhateja (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17881#pullrequestreview-1910555763 PR Review Comment: https://git.openjdk.org/jdk/pull/17881#discussion_r1508637115 PR Review Comment: https://git.openjdk.org/jdk/pull/17881#discussion_r1508637945 PR Review Comment: https://git.openjdk.org/jdk/pull/17881#discussion_r1508638146 From thomas.schatzl at oracle.com Fri Mar 1 08:48:04 2024 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 1 Mar 2024 09:48:04 +0100 Subject: Re: CFV: New HotSpot Group Member: Johan Sjölén In-Reply-To: <041D6781-CFE6-41BD-A8F5-7E25C2EFFD52@oracle.com> References: <041D6781-CFE6-41BD-A8F5-7E25C2EFFD52@oracle.com> Message-ID: Vote: yes On 22.02.24 13:14, Jesper Wilhelmsson wrote: > I hereby nominate Johan Sjölén (jsjolen) to Membership in the HotSpot Group. > > Johan is a Reviewer in the JDK project, and a member of the Oracle JVM Runtime team. Johan has done major work in the Unified Logging framework and is today the de facto owner of that area. He has done major cleanups and bug fixes throughout the JVM, and does a lot of work in the Native Memory Tracking area. > > Votes are due by Thursday March 7. > > Only current Members of the HotSpot Group [1] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. 
> > For Lazy Consensus voting instructions, see [2]. > > Thanks, > /Jesper > > [1] https://openjdk.org/census > [2] https://openjdk.org/groups/#member-vote From martin.doerr at sap.com Fri Mar 1 08:58:23 2024 From: martin.doerr at sap.com (Doerr, Martin) Date: Fri, 1 Mar 2024 08:58:23 +0000 Subject: Re: CFV: New HotSpot Group Member: Johan Sjölén In-Reply-To: <041D6781-CFE6-41BD-A8F5-7E25C2EFFD52@oracle.com> References: <041D6781-CFE6-41BD-A8F5-7E25C2EFFD52@oracle.com> Message-ID: Vote: yes Best regards, Martin On 22.02.24, 19:14, "hotspot-dev" wrote: I hereby nominate Johan Sjölén (jsjolen) to Membership in the HotSpot Group. Johan is a Reviewer in the JDK project, and a member of the Oracle JVM Runtime team. Johan has done major work in the Unified Logging framework and is today the de facto owner of that area. He has done major cleanups and bug fixes throughout the JVM, and does a lot of work in the Native Memory Tracking area. Votes are due by Thursday March 7. Only current Members of the HotSpot Group [1] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. For Lazy Consensus voting instructions, see [2]. Thanks, /Jesper [1] https://openjdk.org/census [2] https://openjdk.org/groups/#member-vote From gli at openjdk.org Fri Mar 1 10:40:51 2024 From: gli at openjdk.org (Guoxiong Li) Date: Fri, 1 Mar 2024 10:40:51 GMT Subject: RFR: 8327058: RISC-V: make Zcb experimental In-Reply-To: <-vwMZiS-BUC2ZR2u5eMlnC9lkEghpJebOJs8TNxik8Q=.851c72e9-34ac-4bb3-b0bc-c4421da7672e@github.com> References: <-vwMZiS-BUC2ZR2u5eMlnC9lkEghpJebOJs8TNxik8Q=.851c72e9-34ac-4bb3-b0bc-c4421da7672e@github.com> Message-ID: On Thu, 29 Feb 2024 17:31:52 GMT, Hamlin Li wrote: > Hi, > Can you review the patch to add a flag for Zcb extension to enable user to control it? > Thanks. 
> > As discussed in [1], it's good to give a flag to control Zcb, and currently it's good to make it experimental. > > [1] https://github.com/openjdk/jdk/pull/17698#discussion_r1505364098 Generally, a new option needs a CSR. But I notice that many options added in riscv-port didn't have their CSRs, so I would be OK without CSR. ------------- Marked as reviewed by gli (Committer). PR Review: https://git.openjdk.org/jdk/pull/18070#pullrequestreview-1910835349 From sspitsyn at openjdk.org Fri Mar 1 11:04:58 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 1 Mar 2024 11:04:58 GMT Subject: RFR: 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage [v19] In-Reply-To: References: Message-ID: > The implementation of the JVM TI `GetObjectMonitorUsage` does not match the spec. > The function returns the following structure: > > > typedef struct { > jthread owner; > jint entry_count; > jint waiter_count; > jthread* waiters; > jint notify_waiter_count; > jthread* notify_waiters; > } jvmtiMonitorUsage; > > > The following four fields are defined this way: > > waiter_count [jint] The number of threads waiting to own this monitor > waiters [jthread*] The waiter_count waiting threads > notify_waiter_count [jint] The number of threads waiting to be notified by this monitor > notify_waiters [jthread*] The notify_waiter_count threads waiting to be notified > > The `waiters` has to include all threads waiting to enter the monitor or to re-enter it in `Object.wait()`. > The implementation also includes the threads waiting to be notified in `Object.wait()` which is wrong. > The `notify_waiters` has to include all threads waiting to be notified in `Object.wait()`. > The implementation also includes the threads waiting to re-enter the monitor in `Object.wait()` which is wrong. > This update makes it right. > > The implementation of the JDWP command `ObjectReference.MonitorInfo (5)` is based on the JVM TI `GetObjectMonitorInfo()`. 
This update has a tweak to keep the existing behavior of this command. > > The following JVMTI vmTestbase tests are fixed to adapt to the correct `GetObjectMonitorUsage()` behavior: > > jvmti/GetObjectMonitorUsage/objmonusage001 > jvmti/GetObjectMonitorUsage/objmonusage003 > > > The following JVMTI JCK tests have to be fixed to adapt to the correct behavior: > > vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00101/gomu00101.html > vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00101/gomu00101a.html > vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00102/gomu00102.html > vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00102/gomu00102a.html > > > > A JCK bug will be filed and the tests have to be added into the JCK problem list located in the closed repository. > > Also, please see and review the related CSR: > [8324677](https://bugs.openjdk.org/browse/JDK-8324677): incorrect implementation of JVM TI GetObjectMonitorUsage > > The Release-Note is: > [8325314](https://bugs.openjdk.org/browse/JDK-8325314): Release Note: incorrect implementation of JVM TI GetObjectMonitorUsage > > Testing: > - tested with mach5 tiers 1-6 Serguei Spitsyn has updated the pull request incrementally with two additional commits since the last revision: - review: update comment in threads.hpp - fix deadlock with carrier threads starvation in ObjectMonitorUsage test ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17680/files - new: https://git.openjdk.org/jdk/pull/17680/files/06711644..7244d349 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17680&range=18 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17680&range=17-18 Stats: 4 lines in 2 files changed: 2 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/17680.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17680/head:pull/17680 PR: https://git.openjdk.org/jdk/pull/17680 From sspitsyn at openjdk.org Fri Mar 1 11:10:57 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 1 Mar 2024 11:10:57 GMT Subject: RFR:
8256314: JVM TI GetCurrentContendedMonitor is implemented incorrectly [v3] In-Reply-To: References: <-Yf2Qj_kbyZAA-LDK96IIOcHWMh8__QT1XfnD9jR8kU=.507d5503-1544-44fb-9e81-438ba3458dcd@github.com> Message-ID: On Thu, 29 Feb 2024 04:17:09 GMT, Serguei Spitsyn wrote: >> The implementation of the JVM TI `GetCurrentContendedMonitor()` does not match the spec. It can sometimes return an incorrect information about the contended monitor. Such a behavior does not match the function spec. >> With this update the `GetCurrentContendedMonitor()` is returning the monitor only when the specified thread is waiting to enter or re-enter the monitor, and the monitor is not returned when the specified thread is waiting in the `java.lang.Object.wait` to be notified. >> >> The implementations of the JDWP `ThreadReference.CurrentContendedMonitor` command and JDI `ThreadReference.currentContendedMonitor()` method are based and depend on this JVMTI function. The JDWP command and the JDI method were specified incorrectly and had incorrect behavior. The fix slightly corrects both the JDWP and JDI specs. The JDWP and JDI implementations have been also fixed because they use this JVM TI update. Please, see and review the related CSR and Release-Note. >> >> CSR: [8326024](https://bugs.openjdk.org/browse/JDK-8326024): JVM TI GetCurrentContendedMonitor is implemented incorrectly >> RN: [8326038](https://bugs.openjdk.org/browse/JDK-8326038): Release Note: JVM TI GetCurrentContendedMonitor is implemented incorrectly >> >> Testing: >> - tested with the mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains three additional commits since the last revision: > > - Merge > - fix specs: JDWP ThreadReference.CurrentContendedMonitor command, JDI ThreadReference.currentContendedMonitor method > - 8256314: JVM TI GetCurrentContendedMonitor is implemented incorrectly PING!! Need a couple of reviews, please. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17944#issuecomment-1972989702 From sspitsyn at openjdk.org Fri Mar 1 11:17:21 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 1 Mar 2024 11:17:21 GMT Subject: RFR: 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage [v20] In-Reply-To: References: Message-ID: > The implementation of the JVM TI `GetObjectMonitorUsage` does not match the spec. > The function returns the following structure: > > > typedef struct { > jthread owner; > jint entry_count; > jint waiter_count; > jthread* waiters; > jint notify_waiter_count; > jthread* notify_waiters; > } jvmtiMonitorUsage; > > > The following four fields are defined this way: > > waiter_count [jint] The number of threads waiting to own this monitor > waiters [jthread*] The waiter_count waiting threads > notify_waiter_count [jint] The number of threads waiting to be notified by this monitor > notify_waiters [jthread*] The notify_waiter_count threads waiting to be notified > > The `waiters` has to include all threads waiting to enter the monitor or to re-enter it in `Object.wait()`. > The implementation also includes the threads waiting to be notified in `Object.wait()` which is wrong. > The `notify_waiters` has to include all threads waiting to be notified in `Object.wait()`. > The implementation also includes the threads waiting to re-enter the monitor in `Object.wait()` which is wrong. > This update makes it right. > > The implementation of the JDWP command `ObjectReference.MonitorInfo (5)` is based on the JVM TI `GetObjectMonitorInfo()`. This update has a tweak to keep the existing behavior of this command. 
> > The following JVMTI vmTestbase tests are fixed to adapt to the correct `GetObjectMonitorUsage()` behavior: > > jvmti/GetObjectMonitorUsage/objmonusage001 > jvmti/GetObjectMonitorUsage/objmonusage003 > > > The following JVMTI JCK tests have to be fixed to adapt to the correct behavior: > > vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00101/gomu00101.html > vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00101/gomu00101a.html > vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00102/gomu00102.html > vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00102/gomu00102a.html > > > > A JCK bug will be filed and the tests have to be added into the JCK problem list located in the closed repository. > > Also, please see and review the related CSR: > [8324677](https://bugs.openjdk.org/browse/JDK-8324677): incorrect implementation of JVM TI GetObjectMonitorUsage > > The Release-Note is: > [8325314](https://bugs.openjdk.org/browse/JDK-8325314): Release Note: incorrect implementation of JVM TI GetObjectMonitorUsage > > Testing: > - tested with mach5 tiers 1-6 Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 22 commits: - Merge - review: update comment in threads.hpp - fix deadlock with carrier threads starvation in ObjectMonitorUsage test - resolve merge conflict for deleted file objmonusage003.cpp - fix a typo in libObjectMonitorUsage.cpp - fix potential sync gap in the test ObjectMonitorUsage - improve ObjectMonitorUsage test native agent output - improved the ObjectMonitorUsage test to make it more elegant - review: remove test objmonusage003; improve test ObjectMonitorUsage - review: addressed minor issue with use of []; corrected the test desctiption - ...
and 12 more: https://git.openjdk.org/jdk/compare/e85265ab...f1a97f51 ------------- Changes: https://git.openjdk.org/jdk/pull/17680/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17680&range=19 Stats: 1123 lines in 15 files changed: 686 ins; 389 del; 48 mod Patch: https://git.openjdk.org/jdk/pull/17680.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17680/head:pull/17680 PR: https://git.openjdk.org/jdk/pull/17680 From sspitsyn at openjdk.org Fri Mar 1 11:19:21 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 1 Mar 2024 11:19:21 GMT Subject: RFR: 8256314: JVM TI GetCurrentContendedMonitor is implemented incorrectly [v4] In-Reply-To: <-Yf2Qj_kbyZAA-LDK96IIOcHWMh8__QT1XfnD9jR8kU=.507d5503-1544-44fb-9e81-438ba3458dcd@github.com> References: <-Yf2Qj_kbyZAA-LDK96IIOcHWMh8__QT1XfnD9jR8kU=.507d5503-1544-44fb-9e81-438ba3458dcd@github.com> Message-ID: > The implementation of the JVM TI `GetCurrentContendedMonitor()` does not match the spec. It can sometimes return an incorrect information about the contended monitor. Such a behavior does not match the function spec. > With this update the `GetCurrentContendedMonitor()` is returning the monitor only when the specified thread is waiting to enter or re-enter the monitor, and the monitor is not returned when the specified thread is waiting in the `java.lang.Object.wait` to be notified. > > The implementations of the JDWP `ThreadReference.CurrentContendedMonitor` command and JDI `ThreadReference.currentContendedMonitor()` method are based and depend on this JVMTI function. The JDWP command and the JDI method were specified incorrectly and had incorrect behavior. The fix slightly corrects both the JDWP and JDI specs. The JDWP and JDI implementations have been also fixed because they use this JVM TI update. Please, see and review the related CSR and Release-Note. 
> > CSR: [8326024](https://bugs.openjdk.org/browse/JDK-8326024): JVM TI GetCurrentContendedMonitor is implemented incorrectly > RN: [8326038](https://bugs.openjdk.org/browse/JDK-8326038): Release Note: JVM TI GetCurrentContendedMonitor is implemented incorrectly > > Testing: > - tested with the mach5 tiers 1-6 Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: - Merge - Merge - fix specs: JDWP ThreadReference.CurrentContendedMonitor command, JDI ThreadReference.currentContendedMonitor method - 8256314: JVM TI GetCurrentContendedMonitor is implemented incorrectly ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17944/files - new: https://git.openjdk.org/jdk/pull/17944/files/a4e8c5b5..1a3d1c19 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17944&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17944&range=02-03 Stats: 2399 lines in 780 files changed: 475 ins; 436 del; 1488 mod Patch: https://git.openjdk.org/jdk/pull/17944.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17944/head:pull/17944 PR: https://git.openjdk.org/jdk/pull/17944 From sspitsyn at openjdk.org Fri Mar 1 11:50:05 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 1 Mar 2024 11:50:05 GMT Subject: RFR: 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage [v21] In-Reply-To: References: Message-ID: > The implementation of the JVM TI `GetObjectMonitorUsage` does not match the spec. 
> The function returns the following structure: > > > typedef struct { > jthread owner; > jint entry_count; > jint waiter_count; > jthread* waiters; > jint notify_waiter_count; > jthread* notify_waiters; > } jvmtiMonitorUsage; > > > The following four fields are defined this way: > > waiter_count [jint] The number of threads waiting to own this monitor > waiters [jthread*] The waiter_count waiting threads > notify_waiter_count [jint] The number of threads waiting to be notified by this monitor > notify_waiters [jthread*] The notify_waiter_count threads waiting to be notified > > The `waiters` has to include all threads waiting to enter the monitor or to re-enter it in `Object.wait()`. > The implementation also includes the threads waiting to be notified in `Object.wait()` which is wrong. > The `notify_waiters` has to include all threads waiting to be notified in `Object.wait()`. > The implementation also includes the threads waiting to re-enter the monitor in `Object.wait()` which is wrong. > This update makes it right. > > The implementation of the JDWP command `ObjectReference.MonitorInfo (5)` is based on the JVM TI `GetObjectMonitorInfo()`. This update has a tweak to keep the existing behavior of this command. > > The following JVMTI vmTestbase tests are fixed to adapt to the correct `GetObjectMonitorUsage()` behavior: > > jvmti/GetObjectMonitorUsage/objmonusage001 > jvmti/GetObjectMonitorUsage/objmonusage003 > > > The following JVMTI JCK tests have to be fixed to adapt to the correct behavior: > > vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00101/gomu00101.html > vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00101/gomu00101a.html > vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00102/gomu00102.html > vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00102/gomu00102a.html > > > > A JCK bug will be filed and the tests have to be added into the JCK problem list located in the closed repository.
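The split between `waiters` and `notify_waiters` described in this PR text can be sketched as a small standalone model. This is only an illustration under stated assumptions: it is not HotSpot or JVM TI code, and `BlockedState`, `BlockedThread`, `MonitorUsageModel` and `classify` are hypothetical names made up for the sketch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model (not HotSpot code): the three ways a thread can be
// blocked on a monitor, as distinguished in the PR description.
enum class BlockedState {
  WaitingToEnter,      // contending to enter the monitor
  WaitingToReEnter,    // notified in Object.wait(), now re-acquiring the monitor
  WaitingToBeNotified  // parked in Object.wait(), not yet notified
};

struct BlockedThread {
  std::string name;
  BlockedState state;
};

// Models only the two thread lists discussed above; the counts are the
// sizes of the vectors.
struct MonitorUsageModel {
  std::vector<std::string> waiters;         // enter + re-enter
  std::vector<std::string> notify_waiters;  // waiting to be notified
};

// Puts each blocked thread into exactly one list, following the corrected
// semantics: only threads still waiting for a notification are notify_waiters.
MonitorUsageModel classify(const std::vector<BlockedThread>& threads) {
  MonitorUsageModel usage;
  for (const BlockedThread& t : threads) {
    if (t.state == BlockedState::WaitingToBeNotified) {
      usage.notify_waiters.push_back(t.name);
    } else {
      usage.waiters.push_back(t.name);
    }
  }
  // Every thread lands in one list and one list only.
  assert(usage.waiters.size() + usage.notify_waiters.size() == threads.size());
  return usage;
}
```

In this model a thread that has been notified but has not yet re-acquired the monitor counts as a waiter, which is the case the old implementation reportedly put in both lists.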
> > Also, please see and review the related CSR: > [8324677](https://bugs.openjdk.org/browse/JDK-8324677): incorrect implementation of JVM TI GetObjectMonitorUsage > > The Release-Note is: > [8325314](https://bugs.openjdk.org/browse/JDK-8325314): Release Note: incorrect implementation of JVM TI GetObjectMonitorUsage > > Testing: > - tested with mach5 tiers 1-6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: rename after merge: jvmti_common.h to jvmti_common.hpp ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17680/files - new: https://git.openjdk.org/jdk/pull/17680/files/f1a97f51..b449f04a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17680&range=20 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17680&range=19-20 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17680.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17680/head:pull/17680 PR: https://git.openjdk.org/jdk/pull/17680 From azeller at openjdk.org Fri Mar 1 12:28:12 2024 From: azeller at openjdk.org (Arno Zeller) Date: Fri, 1 Mar 2024 12:28:12 GMT Subject: RFR: JDK-8327071: [Testbug] g-tests for cgroup leave files in /tmp on linux Message-ID: After each run of the g-tests for cgroups on Linux there are three new files left in /tmp. 
cgroups-test-jdk.pid.cgroupTest.SubSystemFileLineContentsSingleLine cgroups-test-jdk.pid.cgroupTest.SubSystemFileLineContentsMultipleLinesSuccessCases cgroups-test-jdk.pid.cgroupTest.SubSystemFileLineContentsMultipleLinesErrorCases ------------- Commit messages: - Fix comment as requested by mbaesken - Merge branch 'openjdk:master' into JDK-8327071 - JDK-8327071 Changes: https://git.openjdk.org/jdk/pull/18077/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18077&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8327071 Stats: 10 lines in 1 file changed: 9 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18077.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18077/head:pull/18077 PR: https://git.openjdk.org/jdk/pull/18077 From mbaesken at openjdk.org Fri Mar 1 12:28:12 2024 From: mbaesken at openjdk.org (Matthias Baesken) Date: Fri, 1 Mar 2024 12:28:12 GMT Subject: RFR: JDK-8327071: [Testbug] g-tests for cgroup leave files in /tmp on linux In-Reply-To: References: Message-ID: On Fri, 1 Mar 2024 11:00:01 GMT, Arno Zeller wrote: > After each run of the g-tests for cgroups on Linux there are three new files left in /tmp. > > cgroups-test-jdk.pid.cgroupTest.SubSystemFileLineContentsSingleLine > cgroups-test-jdk.pid.cgroupTest.SubSystemFileLineContentsMultipleLinesSuccessCases > cgroups-test-jdk.pid.cgroupTest.SubSystemFileLineContentsMultipleLinesErrorCases Looks okay to me ; while at it please fix also the typo in comment line in this file ( 'generaed by' ) . ------------- Marked as reviewed by mbaesken (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18077#pullrequestreview-1910985208 From aph at openjdk.org Fri Mar 1 12:41:55 2024 From: aph at openjdk.org (Andrew Haley) Date: Fri, 1 Mar 2024 12:41:55 GMT Subject: RFR: JDK-8326974: ODR violation in macroAssembler_aarch64.cpp [v3] In-Reply-To: References: Message-ID: On Thu, 29 Feb 2024 23:17:54 GMT, Dean Long wrote: > How about making Decoder into a nested class of MacroAssembler? 
You just need a forward declaration in macroAssembler_aarch64.hpp. That would have been an excellent thing to do, and I didn't think about it, but please let this patch stand. We've spent enough time on it already, I think. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18056#issuecomment-1973116747 From aph at openjdk.org Fri Mar 1 12:41:55 2024 From: aph at openjdk.org (Andrew Haley) Date: Fri, 1 Mar 2024 12:41:55 GMT Subject: Integrated: JDK-8326974: ODR violation in macroAssembler_aarch64.cpp In-Reply-To: References: Message-ID: On Thu, 29 Feb 2024 09:47:11 GMT, Andrew Haley wrote: > This is a slightly different patch from the one suggested by the bug reporter. It doesn't make any sense to export those classes globally, so I've given them internal linkage. This pull request has now been integrated. Changeset: b972997a Author: Andrew Haley URL: https://git.openjdk.org/jdk/commit/b972997af76a506ffd79ee8c6043e7a8db836b33 Stats: 6 lines in 1 file changed: 0 ins; 0 del; 6 mod 8326974: ODR violation in macroAssembler_aarch64.cpp Reviewed-by: adinn, shade, gli ------------- PR: https://git.openjdk.org/jdk/pull/18056 From mli at openjdk.org Fri Mar 1 13:35:59 2024 From: mli at openjdk.org (Hamlin Li) Date: Fri, 1 Mar 2024 13:35:59 GMT Subject: RFR: 8327058: RISC-V: make Zcb experimental [v2] In-Reply-To: <-vwMZiS-BUC2ZR2u5eMlnC9lkEghpJebOJs8TNxik8Q=.851c72e9-34ac-4bb3-b0bc-c4421da7672e@github.com> References: <-vwMZiS-BUC2ZR2u5eMlnC9lkEghpJebOJs8TNxik8Q=.851c72e9-34ac-4bb3-b0bc-c4421da7672e@github.com> Message-ID: <-cnyNr7ZuNE_gQZD5vVX3bm5HoMu-LXjiYwVf0ZNDng=.7f85a1a4-f81e-448a-a237-bd0e4f5c781b@github.com> > Hi, > Can you review the patch to add a flag for Zcb extension to enable user to control it? > Thanks. > > As discussed in [1], it's good to give a flag to control Zcb, and currently it's good to make it experimental. 
> > [1] https://github.com/openjdk/jdk/pull/17698#discussion_r1505364098 Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: group experimental flags ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18070/files - new: https://git.openjdk.org/jdk/pull/18070/files/500a6ea2..81acfdb3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18070&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18070&range=00-01 Stats: 2 lines in 1 file changed: 1 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18070.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18070/head:pull/18070 PR: https://git.openjdk.org/jdk/pull/18070 From mli at openjdk.org Fri Mar 1 13:35:59 2024 From: mli at openjdk.org (Hamlin Li) Date: Fri, 1 Mar 2024 13:35:59 GMT Subject: RFR: 8327058: RISC-V: make Zcb experimental [v2] In-Reply-To: References: <-vwMZiS-BUC2ZR2u5eMlnC9lkEghpJebOJs8TNxik8Q=.851c72e9-34ac-4bb3-b0bc-c4421da7672e@github.com> Message-ID: On Fri, 1 Mar 2024 10:37:45 GMT, Guoxiong Li wrote: >> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: >> >> group experimental flags > > Generally, a new option needs a CSR. But I notice that many options added in riscv-port didn't have their CSRs, so I would be OK without CSR. @lgxbslgx @RealFYang Thanks for your reviewing. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18070#issuecomment-1973211927 From mli at openjdk.org Fri Mar 1 13:35:59 2024 From: mli at openjdk.org (Hamlin Li) Date: Fri, 1 Mar 2024 13:35:59 GMT Subject: Integrated: 8327058: RISC-V: make Zcb experimental In-Reply-To: <-vwMZiS-BUC2ZR2u5eMlnC9lkEghpJebOJs8TNxik8Q=.851c72e9-34ac-4bb3-b0bc-c4421da7672e@github.com> References: <-vwMZiS-BUC2ZR2u5eMlnC9lkEghpJebOJs8TNxik8Q=.851c72e9-34ac-4bb3-b0bc-c4421da7672e@github.com> Message-ID: On Thu, 29 Feb 2024 17:31:52 GMT, Hamlin Li wrote: > Hi, > Can you review the patch to add a flag for Zcb extension to enable user to control it? > Thanks. > > As discussed in [1], it's good to give a flag to control Zcb, and currently it's good to make it experimental. > > [1] https://github.com/openjdk/jdk/pull/17698#discussion_r1505364098 This pull request has now been integrated. Changeset: c02e7f4b Author: Hamlin Li URL: https://git.openjdk.org/jdk/commit/c02e7f4bb5ebd4765f0b58586932d75cba1144b4 Stats: 3 lines in 3 files changed: 1 ins; 0 del; 2 mod 8327058: RISC-V: make Zcb experimental Reviewed-by: fyang, gli ------------- PR: https://git.openjdk.org/jdk/pull/18070 From ihse at openjdk.org Fri Mar 1 13:39:51 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Fri, 1 Mar 2024 13:39:51 GMT Subject: RFR: 8327049: Stop exporting debug.cpp functions on gcc [v2] In-Reply-To: References: Message-ID: On Thu, 29 Feb 2024 13:55:56 GMT, Magnus Ihse Bursie wrote: >> This is a partial backout of [JDK-8326509](https://bugs.openjdk.org/browse/JDK-8326509). It turned out that starting to export the functions from debug.cpp could have unintended consequences in certain circumstances, so revert back to the original situation where these functions were only exported on Windows. 
> > Magnus Ihse Bursie has updated the pull request incrementally with one additional commit since the last revision: > > Use WINDOWS_ONLY instead My takeaway from the problem we encountered was that contrary to the assumption in [JDK-8326509](https://bugs.openjdk.org/browse/JDK-8326509), it is better *not* to export these functions on all platforms, but only do so where it is required by the debugger, i.e. on Windows. In fact, since we are still exporting them on Windows, we can still run into the same kind of conflict -- we have just limited the exposure surface to Windows-only. But there seems to be no way around this due to limitations in the Visual Studio debugger. :-( So this will restore the behavior that has been in place for a long time -- the debug functions are only exported on Windows. (Previously, this was achieved by marking the functions JNIEXPORT, but leaving them out of the symbol files, which made them non-exported on all compilers except Visual Studio). ------------- PR Comment: https://git.openjdk.org/jdk/pull/18062#issuecomment-1973220653 From ihse at openjdk.org Fri Mar 1 13:42:51 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Fri, 1 Mar 2024 13:42:51 GMT Subject: RFR: 8327049: Stop exporting debug.cpp functions on gcc [v2] In-Reply-To: References: Message-ID: On Thu, 29 Feb 2024 21:18:54 GMT, David Holmes wrote: > I had expected to see an actual reversion to the undef of JNIEXPORT but this will do albeit more extensive a change. To me, `#undef` is really the worst kind of hack that should be used only when you do not have full control over the source code where it is defined. Arbitrary removal of macro definitions makes it much harder to reason about the source code. It was put in place just to get ahead of the mapfile removal with minimal changes, with the intention of removing it as soon as possible.
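A minimal sketch of the Windows-only export idea discussed in this thread, assuming nothing about the actual patch beyond what the messages say. `SKETCH_WINDOWS_ONLY`, `SKETCH_EXPORT` and `sketch_debug_helper` are hypothetical stand-ins invented here, not the HotSpot macros or functions. Whether the symbol ends up hidden on non-Windows builds also depends on build flags such as the default symbol visibility, which this sketch does not control.

```cpp
#include <cassert>

// Hypothetical stand-ins: a "keep this token only on Windows" macro and
// an export annotation.
#if defined(_WIN32)
#define SKETCH_WINDOWS_ONLY(code) code
#define SKETCH_EXPORT __declspec(dllexport)
#else
#define SKETCH_WINDOWS_ONLY(code)
#define SKETCH_EXPORT __attribute__((visibility("default")))
#endif

// On Windows the export annotation is kept; on every other platform the
// macro expands to nothing, so no explicit export request is made here.
extern "C" SKETCH_WINDOWS_ONLY(SKETCH_EXPORT) int sketch_debug_helper(int x) {
  return x + 1;
}

// Small self-check: the function is callable the same way on all platforms,
// only its export status differs.
inline void sketch_self_check() {
  assert(sketch_debug_helper(41) == 42);
}
```

Compared with an `#undef` of the export macro, this keeps the decision visible at each declaration instead of changing the meaning of a macro for the rest of the file.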
------------- PR Comment: https://git.openjdk.org/jdk/pull/18062#issuecomment-1973224766 From fyang at openjdk.org Fri Mar 1 13:49:55 2024 From: fyang at openjdk.org (Fei Yang) Date: Fri, 1 Mar 2024 13:49:55 GMT Subject: RFR: 8320646: RISC-V: C2 VectorCastHF2F [v3] In-Reply-To: References: Message-ID: On Wed, 28 Feb 2024 14:19:06 GMT, Hamlin Li wrote: >> Hi, >> Can you review this patch to add intrinsics for VectorCastHF2F/VectorCastF2HF? >> Thanks! >> >> ## Test >> >> test/jdk/java/lang/Float/Binary16ConversionNaN.java >> test/jdk/java/lang/Float/Binary16Conversion.java >> >> hotspot/jtreg/compiler/intrinsics/float16 >> hotspot/jtreg/compiler/vectorization/TestFloatConversionsVector.java >> hotspot/jtreg/compiler/vectorization/TestFloatConversionsVectorNaN.java > > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > make UseZvfh experimatal; clean code src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 2003: > 2001: // i.e. non-NaN and non-Inf cases. > 2002: > 2003: vsetvli_helper(BasicType::T_SHORT, length, Assembler::mf2); Will this `vsetvli` setting work when we check for NaNs in L2005-L2010? The `Assembler::mf2` param doesn't seem correct to me. Should we use `Assembler::m1` here instead? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17698#discussion_r1509035057 From epeter at openjdk.org Fri Mar 1 14:19:55 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Fri, 1 Mar 2024 14:19:55 GMT Subject: RFR: 8327049: Only export debug.cpp functions on Windows [v2] In-Reply-To: References: Message-ID: On Thu, 29 Feb 2024 13:55:56 GMT, Magnus Ihse Bursie wrote: >> This is a partial backout of [JDK-8326509](https://bugs.openjdk.org/browse/JDK-8326509). It turned out that starting to export the functions from debug.cpp could have unintended consequences in certain circumstances, so revert back to the original situation where these functions were only exported on Windows.
> > Magnus Ihse Bursie has updated the pull request incrementally with one additional commit since the last revision: > > Use WINDOWS_ONLY instead > unintended consequences in certain circumstances Can you say what these are? Could not find any info about that in any related JBS or GitHub places. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18062#issuecomment-1973284651 From stuefe at openjdk.org Fri Mar 1 14:58:57 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 1 Mar 2024 14:58:57 GMT Subject: RFR: 8324829: Uniform use of synchronizations in NMT In-Reply-To: References: Message-ID: <4Gk2RIvpPrEvCJjp9iv4REP2kKFka8Tkcz4irVpoK2Y=.b2c9a1e2-ae2f-40e6-823b-f7d616c42931@github.com> On Mon, 26 Feb 2024 07:41:45 GMT, Afshin Zafari wrote: > In NMT, both `ThreadCritical` and `Tracker` objects are used to do synchronizations. `Tracker` class uses `ThreadCritical` internally and used only for uncommit and release memory operations. In this change, `Tracker` class is replaced with explicit use of `ThreadCritical` to be the same as the other instances of sync'ing NMT operations. > > tiers1-5 tests passed. Sure. Ship it. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18000#issuecomment-1973343520 From mli at openjdk.org Fri Mar 1 15:01:00 2024 From: mli at openjdk.org (Hamlin Li) Date: Fri, 1 Mar 2024 15:01:00 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v9] In-Reply-To: References: Message-ID: On Thu, 7 Dec 2023 09:30:01 GMT, Xiaohong Gong wrote: >> Currently the vector floating-point math APIs like `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, which causes large performance gap on AArch64. Note that those APIs are optimized by C2 compiler on X86 platforms by calling Intel's SVML code [1]. 
To close the gap, we would like to optimize these APIs for AArch64 by calling a third-party vector library called libsleef [2], which are available in mainstream Linux distros (e.g. [3] [4]). >> >> SLEEF supports multiple accuracies. To match Vector API's requirement and implement the math ops on AArch64, we 1) call 1.0 ULP accuracy with FMA instructions used stubs in libsleef for most of the operations by default, and 2) add the vector calling convention to apply with the runtime calls to stub code in libsleef. Note that for those APIs that libsleef does not support 1.0 ULP, we choose 0.5 ULP instead. >> >> To help loading the expected libsleef library, this patch also adds an experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. People can use it to denote the libsleef path/name explicitly. By default, it points to the system installed library. If the library does not exist or the dynamic loading of it in runtime fails, the math vector ops will fall-back to use the default scalar version without error. But a warning is printed out if people specifies a nonexistent library explicitly. >> >> Note that this is a part of the original proposed patch in panama-dev [5], just with some initial review comments addressed. And now we'd like to get some wider feedbacks from more hotspot experts. >> >> [1] https://github.com/openjdk/jdk/pull/3638 >> [2] https://sleef.org/ >> [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/ >> [4] https://packages.debian.org/bookworm/libsleef3 >> [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html > > Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: > > Fix potential attribute issue Just a quick update, I've rebased the code, and will continue the work soon. 
I've looked through the previous discussion, seems there is no blocking issues, if I misunderstood or miss some information, please feel free to let me know, also if you have any further comment. Thanks! ------------- PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1973345855 From epeter at openjdk.org Fri Mar 1 15:08:43 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Fri, 1 Mar 2024 15:08:43 GMT Subject: RFR: 8327049: Only export debug.cpp functions on Windows [v2] In-Reply-To: References: Message-ID: On Thu, 29 Feb 2024 13:55:56 GMT, Magnus Ihse Bursie wrote: >> This is a partial backout of [JDK-8326509](https://bugs.openjdk.org/browse/JDK-8326509). It turned out that starting to export the functions from debug.cpp could have unintended consequences in certain circumstances, so revert back to the original situation where these functions were only exported on Windows. > > Magnus Ihse Bursie has updated the pull request incrementally with one additional commit since the last revision: > > Use WINDOWS_ONLY instead Ah ok. Sounds reasonable, thanks for the explanation! ------------- Marked as reviewed by epeter (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18062#pullrequestreview-1911376258 From ihse at openjdk.org Fri Mar 1 15:08:43 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Fri, 1 Mar 2024 15:08:43 GMT Subject: RFR: 8327049: Only export debug.cpp functions on Windows [v2] In-Reply-To: References: Message-ID: On Fri, 1 Mar 2024 14:17:08 GMT, Emanuel Peter wrote: > Can you say what these are? Could not find any info about that in any related JBS or GitHub places. We spotted a problem with an Oracle-internal testing tool that exported a function named `debug`, which ended up calling the exported `debug` function from Hotspot instead. But the problem could theoretically apply to any code out there that links with libjvm.so and has conflicting symbols defined. 
Normally, the symbols exported from Hotspot are named with a `JVM_` prefix to minimize the risk of this theoretical conflict becoming real, but the functions in debug.cpp have much shorter and more common names (for simple usage in the debugger), so the risk of actual conflicts is much higher. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18062#issuecomment-1973351820 From ihse at openjdk.org Fri Mar 1 15:12:59 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Fri, 1 Mar 2024 15:12:59 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v9] In-Reply-To: References: Message-ID: On Thu, 7 Dec 2023 09:30:01 GMT, Xiaohong Gong wrote: >> Currently the vector floating-point math APIs like `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, which causes large performance gap on AArch64. Note that those APIs are optimized by C2 compiler on X86 platforms by calling Intel's SVML code [1]. To close the gap, we would like to optimize these APIs for AArch64 by calling a third-party vector library called libsleef [2], which are available in mainstream Linux distros (e.g. [3] [4]). >> >> SLEEF supports multiple accuracies. To match Vector API's requirement and implement the math ops on AArch64, we 1) call 1.0 ULP accuracy with FMA instructions used stubs in libsleef for most of the operations by default, and 2) add the vector calling convention to apply with the runtime calls to stub code in libsleef. Note that for those APIs that libsleef does not support 1.0 ULP, we choose 0.5 ULP instead. >> >> To help loading the expected libsleef library, this patch also adds an experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. People can use it to denote the libsleef path/name explicitly. By default, it points to the system installed library. If the library does not exist or the dynamic loading of it in runtime fails, the math vector ops will fall-back to use the default scalar version without error.
But a warning is printed out if people specify a nonexistent library explicitly. >> >> Note that this is a part of the original proposed patch in panama-dev [5], just with some initial review comments addressed. And now we'd like to get some wider feedback from more hotspot experts. >> >> [1] https://github.com/openjdk/jdk/pull/3638 >> [2] https://sleef.org/ >> [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/ >> [4] https://packages.debian.org/bookworm/libsleef3 >> [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html > > Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: > > Fix potential attribute issue Iirc, your assessment is right; the code should be ready for integration; I just don't know about the state of testing. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1973364027 From ihse at openjdk.org Fri Mar 1 15:13:54 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Fri, 1 Mar 2024 15:13:54 GMT Subject: Integrated: 8327049: Only export debug.cpp functions on Windows In-Reply-To: References: Message-ID: On Thu, 29 Feb 2024 12:50:33 GMT, Magnus Ihse Bursie wrote: > This is a partial backout of [JDK-8326509](https://bugs.openjdk.org/browse/JDK-8326509). It turned out that starting to export the functions from debug.cpp could have unintended consequences in certain circumstances, so revert back to the original situation where these functions were only exported on Windows. This pull request has now been integrated.
Changeset: b38a6c57 Author: Magnus Ihse Bursie URL: https://git.openjdk.org/jdk/commit/b38a6c5780611f02d02215c65340b725e4c18929 Stats: 33 lines in 1 file changed: 3 ins; 0 del; 30 mod 8327049: Only export debug.cpp functions on Windows Reviewed-by: jwaters, dholmes, epeter ------------- PR: https://git.openjdk.org/jdk/pull/18062 From gcao at openjdk.org Fri Mar 1 15:37:12 2024 From: gcao at openjdk.org (Gui Cao) Date: Fri, 1 Mar 2024 15:37:12 GMT Subject: RFR: 8319900: Recursive lightweight locking: riscv64 implementation [v4] In-Reply-To: References: Message-ID: > Implementation of recursive lightweight JDK-8319796 for linux-riscv64. > > ### Correctness testing: > > - [x] Run tier1-3, hotspot:tier4 tests with -XX:LockingMode=1 & -XX:LockingMode=2 on LicheePI 4A (release) > - [x] Run tier1-3 tests with -XX:LockingMode=1 & -XX:LockingMode=2 on qemu 8.1.0 (fastdebug) > > ### JMH tested on LicheePI 4A: > > > Before (LockingMode = 2): > > Benchmark (innerCount) Mode Cnt Score Error Units > LockUnlock.testContendedLock 100 avgt 12 281.603 ± 2.626 ns/op > LockUnlock.testRecursiveLockUnlock 100 avgt 12 196373.917 ± 4669.241 ns/op > LockUnlock.testRecursiveSynchronization 100 avgt 12 274.492 ± 6.049 ns/op > LockUnlock.testSerialLockUnlock 100 avgt 12 11516.092 ± 69.964 ns/op > LockUnlock.testSimpleLockUnlock 100 avgt 12 11490.789 ± 58.833 ns/op > > After (LockingMode = 2): > > Benchmark (innerCount) Mode Cnt Score Error Units > LockUnlock.testContendedLock 100 avgt 12 307.013 ± 3.259 ns/op > LockUnlock.testRecursiveLockUnlock 100 avgt 12 59047.222 ± 94.748 ns/op > LockUnlock.testRecursiveSynchronization 100 avgt 12 259.789 ± 1.698 ns/op > LockUnlock.testSerialLockUnlock 100 avgt 12 6055.775 ± 13.518 ns/op > LockUnlock.testSimpleLockUnlock 100 avgt 12 6090.986 ± 67.179 ns/op > > > > Before (LockingMode = 1): > > Benchmark (innerCount) Mode Cnt Score Error Units > LockUnlock.testContendedLock 100 avgt 12 286.144 ±
2.490 ns/op > LockUnlock.testRecursiveLockUnlock 100 avgt 12 68058.259 ± 286.262 ns/op > LockUnlock.testRecursiveSynchronization 100 avgt 12 263.132 ± 4.075 ns/op > LockUnlock.testSerialLockUnlock 100 avgt 12 7521.777 ± 35.193 ns/op > LockUnlock.testSimpleLockUnlock 100 avgt 12 7522.480 ± 26.310 ns/op > > After (LockingMode = 1): > > Benchmark (innerCount) Mode Cnt Score Error Units > LockUnlock.testContendedLock 100 avgt 12 289.034 ± 5.821 ns/op > LockUnlock.testRecursiveLockUnlock 100 avgt 12 68474.370 ± 495.135 ns/op > LockUnlock.testRecursiveSynchronization 100 avgt 12 261.052 ± 4.560 ns/op > Loc... Gui Cao has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains seven additional commits since the last revision: - Merge remote-tracking branch 'upstream/master' into JDK-8319900 - Merge remote-tracking branch 'upstream/master' into JDK-8319900 - fix comment typo - Merge branch 'master' into JDK-8319900 - improve code - Polish code comment - 8319900: Recursive lightweight locking: riscv64 implementation ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17554/files - new: https://git.openjdk.org/jdk/pull/17554/files/34acbb27..3f506562 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17554&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17554&range=02-03 Stats: 2965 lines in 822 files changed: 774 ins; 489 del; 1702 mod Patch: https://git.openjdk.org/jdk/pull/17554.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17554/head:pull/17554 PR: https://git.openjdk.org/jdk/pull/17554 From sviswanathan at openjdk.org Fri Mar 1 17:04:55 2024 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Fri, 1 Mar 2024 17:04:55 GMT Subject: RFR: 8325991: Accelerate Poly1305 on x86_64 using AVX2 instructions [v10] In-Reply-To: References:
<9_AS9fKLjyVQzybg1915BX3xVVFwJRKjsqghjZ99hr0=.13a55d69-fc86-4109-a3be-934d1b8634d0@github.com> Message-ID: On Fri, 1 Mar 2024 08:15:50 GMT, Jatin Bhateja wrote: >> Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: >> >> Update description of Poly1305 algo > > src/hotspot/cpu/x86/assembler_x86.cpp line 5148: > >> 5146: assert(vector_len == AVX_512bit ? VM_Version::supports_avx512ifma() : VM_Version::supports_avxifma(), ""); >> 5147: InstructionMark im(this); >> 5148: InstructionAttr attributes(vector_len, /* rex_w */ true, /* legacy_mode */ false, /* no_mask_reg */ false, /* uses_vl */ true); > > uses_vl should be false here. > > BTW, this assertion looks very fuzzy, you are checking for two target features in one instruction, apparently, instruction is meant to use AVX512_IFMA only for 512 bit vector length, and for narrower vectors its needs AVX_IFMA. > > Lets either keep this strictly for AVX_IFMA for AVX512_IFMA we already have evpmadd52[l/h]uq, if you truly want to make this generic one then split the assertion > > `assert ( (avx_ifma && vector_len <= 256) || (avx512_ifma && (vector_len == 512 || VM_Version::support_vl())); > ` > > And then you may pass uses_vl at true. It would be good to make this instruction generic. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17881#discussion_r1509271081 From rrich at openjdk.org Fri Mar 1 17:35:55 2024 From: rrich at openjdk.org (Richard Reingruber) Date: Fri, 1 Mar 2024 17:35:55 GMT Subject: RFR: 8319901: Recursive lightweight locking: ppc64le implementation In-Reply-To: <3LduAyBU7SzwJawH73jO5cAjVr5I0HJzs8y_jO9r3N4=.42388751-25b0-43fe-97c3-c4f95334bcfc@github.com> References: <3LduAyBU7SzwJawH73jO5cAjVr5I0HJzs8y_jO9r3N4=.42388751-25b0-43fe-97c3-c4f95334bcfc@github.com> Message-ID: On Fri, 10 Nov 2023 13:21:58 GMT, Axel Boldt-Christmas wrote: > Draft implementation of recursive lightweight JDK-8319796 for ppc64le. 
> > Porters feel free to discard this and implement something from scratch, take this initial draft and run with it, or send patches to me if you want me to drive the PR. > > Some notes: > ppc64le deviates from the other ports: it shares its C2 locking code with the native call wrapper. This draft unifies the behavior across the ports by using the C1/interpreter locking for native call wrapper. If it is important to have a fast path for the inflated monitor case for native call wrapper, it could be done without having it share its implementation with C2. > It would also be possible to change this behavior on all ports (share the C2 code), as I believe the C2 assumptions always hold for the native call wrapper, the monitor exit and monitor enter will be balanced and structured. This looks good to me. (just found a bunch of typos you might want to fix) Thanks, Richard. src/hotspot/cpu/ppc/macroAssembler_ppc.cpp line 2474: > 2472: > 2473: #ifdef ASSERT > 2474: // Check that locked label is reached with flags == EQ. Suggestion: // Check that locked label is reached with flag == EQ. src/hotspot/cpu/ppc/macroAssembler_ppc.cpp line 2481: > 2479: bind(slow_path); > 2480: #ifdef ASSERT > 2481: // Check that slow_path label is reached with flags == NE. Suggestion: // Check that slow_path label is reached with flag == NE. src/hotspot/cpu/ppc/macroAssembler_ppc.cpp line 2630: > 2628: > 2629: #ifdef ASSERT > 2630: // Check that unlocked label is reached with flags == EQ. Suggestion: // Check that unlocked label is reached with flag == EQ. src/hotspot/cpu/ppc/macroAssembler_ppc.cpp line 2637: > 2635: bind(slow_path); > 2636: #ifdef ASSERT > 2637: // Check that slow_path label is reached with flags == NE. Suggestion: // Check that slow_path label is reached with flag == NE. ------------- Marked as reviewed by rrich (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/16611#pullrequestreview-1911152553 PR Review Comment: https://git.openjdk.org/jdk/pull/16611#discussion_r1508996979 PR Review Comment: https://git.openjdk.org/jdk/pull/16611#discussion_r1508999295 PR Review Comment: https://git.openjdk.org/jdk/pull/16611#discussion_r1509261672 PR Review Comment: https://git.openjdk.org/jdk/pull/16611#discussion_r1509261987 From sspitsyn at openjdk.org Fri Mar 1 18:13:55 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 1 Mar 2024 18:13:55 GMT Subject: RFR: 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage [v8] In-Reply-To: <225C3AcEAOR7Zd--eHOBiqFHVv6h3Vpqet9TvXM1t08=.a086aa88-bb89-4d2d-9186-9a22eab00db4@github.com> References: <225C3AcEAOR7Zd--eHOBiqFHVv6h3Vpqet9TvXM1t08=.a086aa88-bb89-4d2d-9186-9a22eab00db4@github.com> Message-ID: On Wed, 14 Feb 2024 21:13:56 GMT, Daniel D. Daugherty wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> review: added assert to get_pending_threads; added suggested coverage to test objmonusage003 > > CHECK_FOR_BAD_RESULTS and ADD_DELAYS_FOR_RACES flags and > their associated code paths are for temporary debugging only. @dcubed-ojdk @dholmes-ora Dan and David, I think, I've addressed all your comments. Do you have any other concerns? Can you approve it now? Does the RN look okay? ------------- PR Comment: https://git.openjdk.org/jdk/pull/17680#issuecomment-1973668452 From iklam at openjdk.org Sat Mar 2 01:22:16 2024 From: iklam at openjdk.org (Ioi Lam) Date: Sat, 2 Mar 2024 01:22:16 GMT Subject: RFR: 8327138: Clean up status management in cdsConfig.hpp and CDS.java Message-ID: A few clean ups: 1. Rename functions like `is_loading_full_module_graph()` to `is_using_full_module_graph()`.
The meaning of "loading" is not clear: it might be interpreted as to cover only the period where the artifact is being loaded, but not the period after the artifact is completely loaded. However, the function is meant to cover both periods, so "using" is a more precise term. 2. The cumbersome sounding `disable_loading_full_module_graph()` is changed to `stop_using_full_module_graph()`, etc. 3. The status of `is_using_optimized_module_handling()` is moved from metaspaceShared.hpp to cdsConfig.hpp, to be consolidated with other types of CDS status. 4. The status of CDS was communicated to the Java class `jdk.internal.misc.CDS` by ad-hoc native methods. This is now changed to a single method, `CDS.getCDSConfigStatus()` that returns a bit field. That way we don't need to add a new native method for each type of status. 5. `CDS.isDumpingClassList()` was a misnomer. It's changed to `CDS.isLoggingLambdaFormInvokers()`. ------------- Commit messages: - 8327138: Clean up status management in cdsConfig.hpp and CDS.java Changes: https://git.openjdk.org/jdk/pull/18095/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18095&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8327138 Stats: 191 lines in 19 files changed: 43 ins; 53 del; 95 mod Patch: https://git.openjdk.org/jdk/pull/18095.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18095/head:pull/18095 PR: https://git.openjdk.org/jdk/pull/18095 From gli at openjdk.org Sat Mar 2 05:46:51 2024 From: gli at openjdk.org (Guoxiong Li) Date: Sat, 2 Mar 2024 05:46:51 GMT Subject: RFR: JDK-8327071: [Testbug] g-tests for cgroup leave files in /tmp on linux In-Reply-To: References: Message-ID: On Fri, 1 Mar 2024 11:00:01 GMT, Arno Zeller wrote: > After each run of the g-tests for cgroups on Linux there are three new files left in /tmp. 
> > cgroups-test-jdk.pid.cgroupTest.SubSystemFileLineContentsSingleLine > cgroups-test-jdk.pid.cgroupTest.SubSystemFileLineContentsMultipleLinesSuccessCases > cgroups-test-jdk.pid.cgroupTest.SubSystemFileLineContentsMultipleLinesErrorCases Looks good. ------------- Marked as reviewed by gli (Committer). PR Review: https://git.openjdk.org/jdk/pull/18077#pullrequestreview-1912583549 From stuefe at openjdk.org Sat Mar 2 08:30:52 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Sat, 2 Mar 2024 08:30:52 GMT Subject: RFR: JDK-8327071: [Testbug] g-tests for cgroup leave files in /tmp on linux In-Reply-To: References: Message-ID: On Fri, 1 Mar 2024 11:00:01 GMT, Arno Zeller wrote: > After each run of the g-tests for cgroups on Linux there are three new files left in /tmp. > > cgroups-test-jdk.pid.cgroupTest.SubSystemFileLineContentsSingleLine > cgroups-test-jdk.pid.cgroupTest.SubSystemFileLineContentsMultipleLinesSuccessCases > cgroups-test-jdk.pid.cgroupTest.SubSystemFileLineContentsMultipleLinesErrorCases Hi @ArnoZeller, I wonder why these files have to live in tmp. Why can't they live in the current directory? That way, if we run gtest via jtreg tests, they would be cleaned up automatically and be subject to `-retain` if needed. @jdksjolen do you know why? Other than that, if they have to live in temp, the patch is of course fine. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18077#issuecomment-1974705490 From jsjolen at openjdk.org Sat Mar 2 19:19:51 2024 From: jsjolen at openjdk.org (Johan Sjölen) Date: Sat, 2 Mar 2024 19:19:51 GMT Subject: RFR: JDK-8327071: [Testbug] g-tests for cgroup leave files in /tmp on linux In-Reply-To: References: Message-ID: <0xKcfdhsQMYXk_cKsTWNwarQkjBRwgv6xVaj827yRLg=.c8262d9b-f1ac-4ab4-836b-174ad0a512e5@github.com> On Fri, 1 Mar 2024 11:00:01 GMT, Arno Zeller wrote: > After each run of the g-tests for cgroups on Linux there are three new files left in /tmp.
> > cgroups-test-jdk.pid.cgroupTest.SubSystemFileLineContentsSingleLine > cgroups-test-jdk.pid.cgroupTest.SubSystemFileLineContentsMultipleLinesSuccessCases > cgroups-test-jdk.pid.cgroupTest.SubSystemFileLineContentsMultipleLinesErrorCases Hi, The reasoning behind that is simple: I was not aware that JTReg deletes files created in the current directory. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18077#issuecomment-1974883885 From duke at openjdk.org Sun Mar 3 20:33:00 2024 From: duke at openjdk.org (Elif Aslan) Date: Sun, 3 Mar 2024 20:33:00 GMT Subject: RFR: 8327125: SpinYield.report should report microseconds Message-ID: <_Knh7nkXBGOfvvN9d7Mdgp3Q6zZE-ph2QBLdH5ayqU8=.97374b43-605e-4fdb-8726-8b55810e0646@github.com> Changes to report microseconds instead of milliseconds. GHA tested and also ran tier 2 tests. ------------- Commit messages: - 8327125: SpinYield.report should report microseconds Changes: https://git.openjdk.org/jdk/pull/18099/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18099&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8327125 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18099.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18099/head:pull/18099 PR: https://git.openjdk.org/jdk/pull/18099 From dholmes at openjdk.org Mon Mar 4 02:53:45 2024 From: dholmes at openjdk.org (David Holmes) Date: Mon, 4 Mar 2024 02:53:45 GMT Subject: RFR: 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage [v21] In-Reply-To: References: Message-ID: On Fri, 1 Mar 2024 11:50:05 GMT, Serguei Spitsyn wrote: >> The implementation of the JVM TI `GetObjectMonitorUsage` does not match the spec. 
>> The function returns the following structure: >> >> >> typedef struct { >> jthread owner; >> jint entry_count; >> jint waiter_count; >> jthread* waiters; >> jint notify_waiter_count; >> jthread* notify_waiters; >> } jvmtiMonitorUsage; >> >> >> The following four fields are defined this way: >> >> waiter_count [jint] The number of threads waiting to own this monitor >> waiters [jthread*] The waiter_count waiting threads >> notify_waiter_count [jint] The number of threads waiting to be notified by this monitor >> notify_waiters [jthread*] The notify_waiter_count threads waiting to be notified >> >> The `waiters` has to include all threads waiting to enter the monitor or to re-enter it in `Object.wait()`. >> The implementation also includes the threads waiting to be notified in `Object.wait()` which is wrong. >> The `notify_waiters` has to include all threads waiting to be notified in `Object.wait()`. >> The implementation also includes the threads waiting to re-enter the monitor in `Object.wait()` which is wrong. >> This update makes it right. >> >> The implementation of the JDWP command `ObjectReference.MonitorInfo (5)` is based on the JVM TI `GetObjectMonitorUsage()`. This update has a tweak to keep the existing behavior of this command. >> >> The following JVMTI vmTestbase tests are fixed to adapt to the correct `GetObjectMonitorUsage()` behavior: >> >> jvmti/GetObjectMonitorUsage/objmonusage001 >> jvmti/GetObjectMonitorUsage/objmonusage003 >> >> >> The following JVMTI JCK tests have to be fixed to adapt to the correct behavior: >> >> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00101/gomu00101.html >> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00101/gomu00101a.html >> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00102/gomu00102.html >> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00102/gomu00102a.html >> >> >> >> A JCK bug will be filed and the tests have to be added into the JCK problem list located in the closed repository.
>> >> Also, please see and review the related CSR: >> [8324677](https://bugs.openjdk.org/browse/JDK-8324677): incorrect implementation of JVM TI GetObjectMonitorUsage >> >> The Release-Note is: >> [8325314](https://bugs.openjdk.org/browse/JDK-8325314): Release Note: incorrect implementation of JVM TI GetObjectMonitorUsage >> >> Testing: >> - tested with mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > rename after merge: jvmti_common.h to jvmti_common.hpp > The notify_waiters has to include all threads waiting to be notified in Object.wait(). > The implementation also includes the threads waiting to re-enter the monitor in Object.wait() which is wrong. This is not in fact the case so can you please remove this from the problem statement/description above. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17680#issuecomment-1975561224 From dholmes at openjdk.org Mon Mar 4 03:54:54 2024 From: dholmes at openjdk.org (David Holmes) Date: Mon, 4 Mar 2024 03:54:54 GMT Subject: RFR: 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage [v21] In-Reply-To: References: Message-ID: On Fri, 1 Mar 2024 11:50:05 GMT, Serguei Spitsyn wrote: >> The implementation of the JVM TI `GetObjectMonitorUsage` does not match the spec. 
>> The function returns the following structure: >> >> >> typedef struct { >> jthread owner; >> jint entry_count; >> jint waiter_count; >> jthread* waiters; >> jint notify_waiter_count; >> jthread* notify_waiters; >> } jvmtiMonitorUsage; >> >> >> The following four fields are defined this way: >> >> waiter_count [jint] The number of threads waiting to own this monitor >> waiters [jthread*] The waiter_count waiting threads >> notify_waiter_count [jint] The number of threads waiting to be notified by this monitor >> notify_waiters [jthread*] The notify_waiter_count threads waiting to be notified >> >> The `waiters` has to include all threads waiting to enter the monitor or to re-enter it in `Object.wait()`. >> The implementation also includes the threads waiting to be notified in `Object.wait()` which is wrong. >> The `notify_waiters` has to include all threads waiting to be notified in `Object.wait()`. >> The implementation also includes the threads waiting to re-enter the monitor in `Object.wait()` which is wrong. >> This update makes it right. >> >> The implementation of the JDWP command `ObjectReference.MonitorInfo (5)` is based on the JVM TI `GetObjectMonitorUsage()`. This update has a tweak to keep the existing behavior of this command. >> >> The following JVMTI vmTestbase tests are fixed to adapt to the correct `GetObjectMonitorUsage()` behavior: >> >> jvmti/GetObjectMonitorUsage/objmonusage001 >> jvmti/GetObjectMonitorUsage/objmonusage003 >> >> >> The following JVMTI JCK tests have to be fixed to adapt to the correct behavior: >> >> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00101/gomu00101.html >> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00101/gomu00101a.html >> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00102/gomu00102.html >> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00102/gomu00102a.html >> >> >> >> A JCK bug will be filed and the tests have to be added into the JCK problem list located in the closed repository.
>> >> Also, please see and review the related CSR: >> [8324677](https://bugs.openjdk.org/browse/JDK-8324677): incorrect implementation of JVM TI GetObjectMonitorUsage >> >> The Release-Note is: >> [8325314](https://bugs.openjdk.org/browse/JDK-8325314): Release Note: incorrect implementation of JVM TI GetObjectMonitorUsage >> >> Testing: >> - tested with mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > rename after merge: jvmti_common.h to jvmti_common.hpp Thanks Serguei, this is generally looking good, but I have one main query with the logic, a number of suggestions regarding comments, and a few issues in the new test code. src/hotspot/share/prims/jvmtiEnvBase.cpp line 1489: > 1487: assert(mon != nullptr, "must have monitor"); > 1488: // this object has a heavyweight monitor > 1489: nWant = mon->contentions(); // # of threads contending for monitor Please clarify the comment as // # of threads contending for monitor entry, but not re-entry src/hotspot/share/prims/jvmtiEnvBase.cpp line 1490: > 1488: // this object has a heavyweight monitor > 1489: nWant = mon->contentions(); // # of threads contending for monitor > 1490: nWait = mon->waiters(); // # of threads in Object.wait() Please clarify the comment as // # of threads waiting for notification, or to re-enter monitor, in Object.wait() src/hotspot/share/prims/jvmtiEnvBase.cpp line 1491: > 1489: nWant = mon->contentions(); // # of threads contending for monitor > 1490: nWait = mon->waiters(); // # of threads in Object.wait() > 1491: wantList = Threads::get_pending_threads(tlh.list(), nWant + nWait, (address)mon); Please add a comment // Get the actual set of threads trying to enter, or re-enter, the monitor. src/hotspot/share/prims/jvmtiEnvBase.cpp line 1505: > 1503: // the monitor threads after being notified. Here we are correcting the actual > 1504: // number of the waiting threads by excluding those re-entering the monitor.
> 1505: nWait = i; Sorry I can't follow the logic here. Why is the whole block not simply: int count = 0; if (mon != nullptr) { // The nWait count may include threads trying to re-enter the monitor, so // walk the actual set of waiters to count those actually waiting. for (ObjectWaiter* waiter = mon->first_waiter(); waiter != nullptr; waiter = mon->next_waiter(waiter)) { count++; } } ret.notify_waiter_count = count; src/hotspot/share/prims/jvmtiEnvBase.cpp line 1552: > 1550: // If the thread was found on the ObjectWaiter list, then > 1551: // it has not been notified. This thread can't change the > 1552: // state of the monitor so it doesn't need to be suspended. Not sure why suspension is being mentioned here. test/hotspot/jtreg/serviceability/jvmti/ObjectMonitorUsage/ObjectMonitorUsage.java line 31: > 29: * DESCRIPTION > 30: * The test checks if the JVMTI function GetObjectMonitorUsage returns > 31: * the expected values for the owner, entry_count, water_count s/water_count/waiter_count/ test/hotspot/jtreg/serviceability/jvmti/ObjectMonitorUsage/ObjectMonitorUsage.java line 40: > 38: * - owned object with N waitings to be notified > 39: * - owned object with N waitings to enter, from 0 to N waitings to re-enter, > 40: * from N to 0 waitings to be notified "waitings" is not grammatically valid. Suggestion: s/waitings/threads waiting/ test/hotspot/jtreg/serviceability/jvmti/ObjectMonitorUsage/ObjectMonitorUsage.java line 42: > 40: * from N to 0 waitings to be notified > 41: * - all the above scenarios are executed with platform and virtual threads > 42: * @requires vm.continuations Do we really require this?
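The waiter-counting idea raised above for jvmtiEnvBase.cpp line 1505 can be sketched in isolation. This is only an illustration under stated assumptions: `Waiter`, `Monitor` and `count_waiters` are hypothetical stand-ins, not the HotSpot `ObjectWaiter`/`ObjectMonitor` classes, and the wait set is assumed to be a simple null-terminated list, which may not match how the real list is linked.

```cpp
// Hypothetical stand-ins for illustration only; these are not HotSpot types.
struct Waiter {
  Waiter* next;   // next node in the wait set, or nullptr at the end (assumed)
};

struct Monitor {
  Waiter* first;  // head of the wait set, or nullptr when nobody is waiting
};

// Count the nodes currently on the monitor's wait set.
// `count` is declared before the null check so that it is still in scope
// when the result is returned to the caller.
int count_waiters(const Monitor* mon) {
  int count = 0;
  if (mon != nullptr) {
    for (const Waiter* w = mon->first; w != nullptr; w = w->next) {
      count++;
    }
  }
  return count;
}
```

A caller would assign the result to something like `notify_waiter_count`; a null monitor simply yields zero.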
test/hotspot/jtreg/serviceability/jvmti/ObjectMonitorUsage/ObjectMonitorUsage.java line 132: > 130: // count of threads waiting to re-enter: 0 > 131: // count of threads waiting to be notified: NUMBER_OF_WAITING_THREADS > 132: check(lockCheck, null, 0, // no owner thread AFAICS you are racing here, as the waiting threads can set `ready` but not yet have called `wait()`. To know for sure they are in `wait()` you need to acquire the monitor before checking (and release it again so that you test the no-owner scenario as intended). But this should probably be done inside `startWaitingThreads`. test/hotspot/jtreg/serviceability/jvmti/ObjectMonitorUsage/ObjectMonitorUsage.java line 319: > 317: try { > 318: ready = true; > 319: lockCheck.wait(); Spurious wakeups will break this test. They shouldn't happen but ... test/hotspot/jtreg/serviceability/jvmti/ObjectMonitorUsage/libObjectMonitorUsage.cpp line 65: > 63: > 64: > 65: jint Agent_Initialize(JavaVM *jvm, char *options, void *reserved) { Nit: there's a double space after jint test/hotspot/jtreg/serviceability/jvmti/ObjectMonitorUsage/libObjectMonitorUsage.cpp line 219: > 217: } > 218: > 219: } // exnern "C" typo: exnern -> extern ------------- PR Review: https://git.openjdk.org/jdk/pull/17680#pullrequestreview-1913304471 PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1510543479 PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1510540981 PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1510544636 PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1510548391 PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1510549323 PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1510550756 PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1510551339 PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1510551516 PR Review Comment: 
https://git.openjdk.org/jdk/pull/17680#discussion_r1510556424 PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1510553511 PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1510557225 PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1510557802 From gli at openjdk.org Mon Mar 4 04:54:51 2024 From: gli at openjdk.org (Guoxiong Li) Date: Mon, 4 Mar 2024 04:54:51 GMT Subject: RFR: 8327125: SpinYield.report should report microseconds In-Reply-To: <_Knh7nkXBGOfvvN9d7Mdgp3Q6zZE-ph2QBLdH5ayqU8=.97374b43-605e-4fdb-8726-8b55810e0646@github.com> References: <_Knh7nkXBGOfvvN9d7Mdgp3Q6zZE-ph2QBLdH5ayqU8=.97374b43-605e-4fdb-8726-8b55810e0646@github.com> Message-ID: On Sun, 3 Mar 2024 20:29:11 GMT, Elif Aslan wrote: > Changes to report microseconds instead of milliseconds. > GHA tested and also ran tier 2 tests. LGTM. But I am not a formal reviewer, so please wait a formal reviewer to approve this patch. ------------- Marked as reviewed by gli (Committer). PR Review: https://git.openjdk.org/jdk/pull/18099#pullrequestreview-1913370150 From dholmes at openjdk.org Mon Mar 4 06:00:42 2024 From: dholmes at openjdk.org (David Holmes) Date: Mon, 4 Mar 2024 06:00:42 GMT Subject: RFR: 8327125: SpinYield.report should report microseconds In-Reply-To: <_Knh7nkXBGOfvvN9d7Mdgp3Q6zZE-ph2QBLdH5ayqU8=.97374b43-605e-4fdb-8726-8b55810e0646@github.com> References: <_Knh7nkXBGOfvvN9d7Mdgp3Q6zZE-ph2QBLdH5ayqU8=.97374b43-605e-4fdb-8726-8b55810e0646@github.com> Message-ID: On Sun, 3 Mar 2024 20:29:11 GMT, Elif Aslan wrote: > Changes to report microseconds instead of milliseconds. > GHA tested and also ran tier 2 tests. Looks good. Thanks ------------- Marked as reviewed by dholmes (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18099#pullrequestreview-1913428015 From dholmes at openjdk.org Mon Mar 4 07:13:55 2024 From: dholmes at openjdk.org (David Holmes) Date: Mon, 4 Mar 2024 07:13:55 GMT Subject: RFR: 8256314: JVM TI GetCurrentContendedMonitor is implemented incorrectly [v4] In-Reply-To: References: <-Yf2Qj_kbyZAA-LDK96IIOcHWMh8__QT1XfnD9jR8kU=.507d5503-1544-44fb-9e81-438ba3458dcd@github.com> Message-ID: On Fri, 1 Mar 2024 11:19:21 GMT, Serguei Spitsyn wrote: >> The implementation of the JVM TI `GetCurrentContendedMonitor()` does not match the spec. It can sometimes return an incorrect information about the contended monitor. Such a behavior does not match the function spec. >> With this update the `GetCurrentContendedMonitor()` is returning the monitor only when the specified thread is waiting to enter or re-enter the monitor, and the monitor is not returned when the specified thread is waiting in the `java.lang.Object.wait` to be notified. >> >> The implementations of the JDWP `ThreadReference.CurrentContendedMonitor` command and JDI `ThreadReference.currentContendedMonitor()` method are based and depend on this JVMTI function. The JDWP command and the JDI method were specified incorrectly and had incorrect behavior. The fix slightly corrects both the JDWP and JDI specs. The JDWP and JDI implementations have been also fixed because they use this JVM TI update. Please, see and review the related CSR and Release-Note. >> >> CSR: [8326024](https://bugs.openjdk.org/browse/JDK-8326024): JVM TI GetCurrentContendedMonitor is implemented incorrectly >> RN: [8326038](https://bugs.openjdk.org/browse/JDK-8326038): Release Note: JVM TI GetCurrentContendedMonitor is implemented incorrectly >> >> Testing: >> - tested with the mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains four additional commits since the last revision: > > - Merge > - Merge > - fix specs: JDWP ThreadReference.CurrentContendedMonitor command, JDI ThreadReference.currentContendedMonitor method > - 8256314: JVM TI GetCurrentContendedMonitor is implemented incorrectly The main functional fix seems fine. The JDWP and JDI spec changes seem fine - just a minor nit on wording. I don't know enough about how the tests work to comment on those. src/hotspot/share/prims/jvmtiEnvBase.cpp line 941: > 939: bool is_virtual = java_lang_VirtualThread::is_instance(thread_oop); > 940: jint state = is_virtual ? JvmtiEnvBase::get_vthread_state(thread_oop, java_thread) > 941: : JvmtiEnvBase::get_thread_state(thread_oop, java_thread); I've seen this in a couple of your PRs now. It would be nice to have a utility function to get the JVMTI thread state of any java.lang.Thread instance. src/java.se/share/data/jdwp/jdwp.spec line 1985: > 1983: "thread may be waiting to enter the object's monitor, or in " > 1984: "java.lang.Object.wait waiting to re-enter the monitor after being " > 1985: "notified, interrupted, or timeout." `timed-out` would be the correct sense here. src/jdk.jdi/share/classes/com/sun/jdi/ThreadReference.java line 314: > 312: * synchronized method, the synchronized statement, or > 313: * {@link Object#wait} waiting to re-enter the monitor > 314: * after being notified, interrupted, or timeout. Again `timed-out`. ------------- Marked as reviewed by dholmes (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/17944#pullrequestreview-1913493587 PR Review Comment: https://git.openjdk.org/jdk/pull/17944#discussion_r1510673055 PR Review Comment: https://git.openjdk.org/jdk/pull/17944#discussion_r1510665704 PR Review Comment: https://git.openjdk.org/jdk/pull/17944#discussion_r1510666362 From aboldtch at openjdk.org Mon Mar 4 07:15:09 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Mon, 4 Mar 2024 07:15:09 GMT Subject: RFR: 8319901: Recursive lightweight locking: ppc64le implementation [v2] In-Reply-To: <3LduAyBU7SzwJawH73jO5cAjVr5I0HJzs8y_jO9r3N4=.42388751-25b0-43fe-97c3-c4f95334bcfc@github.com> References: <3LduAyBU7SzwJawH73jO5cAjVr5I0HJzs8y_jO9r3N4=.42388751-25b0-43fe-97c3-c4f95334bcfc@github.com> Message-ID: > Draft implementation of recursive lightweight locking (JDK-8319796) for ppc64le. > > Porters, feel free to discard this and implement something from scratch, take this initial draft and run with it, or send patches to me if you want me to drive the PR. > > Some notes: > ppc64le deviates from the other ports: it shares its C2 locking code with the native call wrapper. This draft unifies the behavior across the ports by using the C1/interpreter locking for the native call wrapper. If it is important to have a fast path for the inflated monitor case for the native call wrapper, it could be done without having it share its implementation with C2. > It would also be possible to change this behavior on all ports (share the C2 code), as I believe the C2 assumptions always hold for the native call wrapper: the monitor exit and monitor enter will be balanced and structured.
Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: Apply suggestions from code review Co-authored-by: Richard Reingruber ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16611/files - new: https://git.openjdk.org/jdk/pull/16611/files/6eec9a2e..68ca8b65 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16611&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16611&range=00-01 Stats: 4 lines in 1 file changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/16611.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16611/head:pull/16611 PR: https://git.openjdk.org/jdk/pull/16611 From aboldtch at openjdk.org Mon Mar 4 07:22:54 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Mon, 4 Mar 2024 07:22:54 GMT Subject: RFR: 8319901: Recursive lightweight locking: ppc64le implementation In-Reply-To: References: <3LduAyBU7SzwJawH73jO5cAjVr5I0HJzs8y_jO9r3N4=.42388751-25b0-43fe-97c3-c4f95334bcfc@github.com> Message-ID: On Wed, 21 Feb 2024 08:44:48 GMT, Martin Doerr wrote: >> Draft implementation of recursive lightweight locking (JDK-8319796) for ppc64le. >> >> Porters, feel free to discard this and implement something from scratch, take this initial draft and run with it, or send patches to me if you want me to drive the PR. >> >> Some notes: >> ppc64le deviates from the other ports: it shares its C2 locking code with the native call wrapper. This draft unifies the behavior across the ports by using the C1/interpreter locking for the native call wrapper. If it is important to have a fast path for the inflated monitor case for the native call wrapper, it could be done without having it share its implementation with C2. >> It would also be possible to change this behavior on all ports (share the C2 code), as I believe the C2 assumptions always hold for the native call wrapper: the monitor exit and monitor enter will be balanced and structured. > >> > Thanks a lot! LGTM.
>> > Regarding aarch64: Should we file a JBS issue for the usage of `box`? >> > https://github.com/openjdk/jdk/blob/d31fd78d963d5d103b1b1bf66ae0bdbe4be2b790/src/hotspot/cpu/aarch64/aarch64.ad#L16470 >> > >> > has no kill effect for `box`, but `C2_MacroAssembler::fast_lock_lightweight` kills it. It may never be used, but it's a trap someone might run into. x86_64 uses `USE_KILL box`. >> > I think that it currently works, but I'd prefer to use a temp register and hopefully remove the `box` completely at some point. It could also be cleaned up together with other changes if you prefer that. >> >> This came up in the review as well. The issue with `USE_KILL` and `KILL` is that they require a bound register to use. So we must restrict the box to a specific real register. (Works on x86 because its box is bound to `RAX`/`EAX`). >> >> We should create a JBS issue for this. And maybe it is too dangerous to leave it as is until we can create a C2 Node that does not require the box. >> >> The only potential fix I see is to actually bind the register, i.e. >> >> ```diff >> diff --git a/src/hotspot/cpu/aarch64/aarch64.ad b/src/hotspot/cpu/aarch64/aarch64.ad >> index a07ff041c48..97d7898ceba 100644 >> --- a/src/hotspot/cpu/aarch64/aarch64.ad >> +++ b/src/hotspot/cpu/aarch64/aarch64.ad >> @@ -16463,14 +16463,14 @@ instruct cmpFastUnlock(rFlagsReg cr, iRegP object, iRegP box, iRegPNoSp tmp, iRe >> ins_pipe(pipe_serial); >> %} >> >> -instruct cmpFastLockLightweight(rFlagsReg cr, iRegP object, iRegP box, iRegPNoSp tmp, iRegPNoSp tmp2) >> +instruct cmpFastLockLightweight(rFlagsReg cr, iRegP object, iRegP_R0 box, iRegPNoSp tmp, iRegPNoSp tmp2) >> %{ >> predicate(LockingMode == LM_LIGHTWEIGHT); >> match(Set cr (FastLock object box)); >> - effect(TEMP tmp, TEMP tmp2); >> + effect(TEMP tmp, TEMP tmp2, USE_KILL box); >> >> ins_cost(5 * INSN_COST); >> - format %{ "fastlock $object,$box\t! kills $tmp,$tmp2" %} >> + format %{ "fastlock $object,$box\t!
kills $tmp,$tmp2,$box" %} >> >> ins_encode %{ >> __ fast_lock_lightweight($object$$Register, $box$$Register, $tmp$$Register, $tmp2$$Register); >> @@ -16479,14 +16479,14 @@ instruct cmpFastLockLightweight(rFlagsReg cr, iRegP object, iRegP box, iRegPNoSp >> ins_pipe(pipe_serial); >> %} >> >> -instruct cmpFastUnlockLightweight(rFlagsReg cr, iRegP object, iRegP box, iRegPNoSp tmp, iRegPNoSp tmp2) >> +instruct cmpFastUnlockLightweight(rFlagsReg cr, iRegP object, iRegP_R0 box, iRegPNoSp tmp, iRegPNoSp tmp2) >> %{... @TheRealMDoerr and @reinrich I will delegate the integration. As the required reviews are in, I will leave any final testing to you. Feel free to integrate if everything looks green. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16611#issuecomment-1975879325 From azeller at openjdk.org Mon Mar 4 07:30:52 2024 From: azeller at openjdk.org (Arno Zeller) Date: Mon, 4 Mar 2024 07:30:52 GMT Subject: RFR: JDK-8327071: [Testbug] g-tests for cgroup leave files in /tmp on linux In-Reply-To: References: Message-ID: On Sat, 2 Mar 2024 08:28:31 GMT, Thomas Stuefe wrote: >> After each run of the g-tests for cgroups on Linux there are three new files left in /tmp. >> >> cgroups-test-jdk.pid.cgroupTest.SubSystemFileLineContentsSingleLine >> cgroups-test-jdk.pid.cgroupTest.SubSystemFileLineContentsMultipleLinesSuccessCases >> cgroups-test-jdk.pid.cgroupTest.SubSystemFileLineContentsMultipleLinesErrorCases > > Hi @ArnoZeller, I wonder why these files have to live in tmp. Why can't they not live in the current directory, that way if we run gtest via jtreg tests, they would be cleaned automatically, and be subject to `-retain` if needed. @jdksjolen do you know why ? > > Other than that, if they have to live in temp, the patch is of course fine. Hi @tstuefe, I must admit that I do not know what is normally expected from GTests. - Should they cleanup in case of no error? (I guess yes) - Where should GTests normally write temporary files? 
- Does everybody run GTests only with JTReg or are they normally executed manually? (I myself always use JTReg, but only because I am used to JTReg). Probably we could move the test to write to the current directory but I am not sure if this is a good idea when the test is not executed by JTReg. Perhaps someone with more GTests knowledge could assist here? ------------- PR Comment: https://git.openjdk.org/jdk/pull/18077#issuecomment-1975890045 From fyang at openjdk.org Mon Mar 4 08:02:48 2024 From: fyang at openjdk.org (Fei Yang) Date: Mon, 4 Mar 2024 08:02:48 GMT Subject: RFR: 8320646: RISC-V: C2 VectorCastHF2F [v3] In-Reply-To: References: Message-ID: On Wed, 28 Feb 2024 14:19:06 GMT, Hamlin Li wrote: >> Hi, >> Can you review this patch to add intrinsics for VectorCastHF2F/VectorCastF2HF? >> Thanks! >> >> ## Test >> >> test/jdk/java/lang/Float/Binary16ConversionNaN.java >> test/jdk/java/lang/Float/Binary16Conversion.java >> >> hotspot/jtreg/compiler/intrinsics/float16 >> hotspot/jtreg/compiler/vectorization/TestFloatConversionsVector.java >> hotspot/jtreg/compiler/vectorization/TestFloatConversionsVectorNaN.java > > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > make UseZvfh experimental; clean code Several more comments. Could you please move the newly-added assembler functions after the scalar versions, that is `C2_MacroAssembler::float16_to_float` and `C2_MacroAssembler::float_to_float16`? Then it will be more convenient for people to compare the two versions. src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1979: > 1977: > 1978: // widen and sign-extend src data. > 1979: __ vwadd_vx(dst, src, zr, Assembler::v0_t); Can we use a vector integer extension instruction here instead of `vwadd_vx`? Like this one from the RVV spec: `vsext.vf2 vd, vs2, vm # Sign-extend SEW/2 source to SEW destination`.
This seems more readable and we can do the lines L1980-L1981 on entry, which has the benefit of not relying on the caller's vsetvli for operations here like `vwadd_vx`. src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1995: > 1993: auto stub = C2CodeStub::make > 1994: (dst, src, length, 24, float16_to_float_v_slow_path); > 1995: I think we should add an assertion here that `dst` and `src` are different vector registers. src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1996: > 1994: (dst, src, length, 24, float16_to_float_v_slow_path); > 1995: > 1996: // in riscv, NaN needs a special process as vfwcvt_f_f_v does not work in that case. Suggestion: s/in riscv/On riscv/ src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1997: > 1995: > 1996: // in riscv, NaN needs a special process as vfwcvt_f_f_v does not work in that case. > 1997: // in riscv, Inf does not need a special process as vfwcvt_f_f_v can handle it correctly. Suggestion: s/in riscv/On riscv/ src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 2052: > 2050: void C2_MacroAssembler::float_to_float16_v(VectorRegister dst, VectorRegister src, VectorRegister tmp, uint length) { > 2051: auto stub = C2CodeStub::make > 2052: (dst, src, tmp, length, 36, float_to_float16_v_slow_path); Similar here: I think we should add an assertion here that `dst` and `src` are different vector registers. src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 2054: > 2052: (dst, src, tmp, length, 36, float_to_float16_v_slow_path); > 2053: > 2054: // in riscv, NaN needs a special process as vfncvt_f_f_w does not work in that case.
Suggestion: s/in riscv/On riscv/ ------------- PR Review: https://git.openjdk.org/jdk/pull/17698#pullrequestreview-1913566853 PR Review Comment: https://git.openjdk.org/jdk/pull/17698#discussion_r1510711417 PR Review Comment: https://git.openjdk.org/jdk/pull/17698#discussion_r1510713401 PR Review Comment: https://git.openjdk.org/jdk/pull/17698#discussion_r1510715794 PR Review Comment: https://git.openjdk.org/jdk/pull/17698#discussion_r1510715892 PR Review Comment: https://git.openjdk.org/jdk/pull/17698#discussion_r1510714094 PR Review Comment: https://git.openjdk.org/jdk/pull/17698#discussion_r1510716096 From mbaesken at openjdk.org Mon Mar 4 08:09:52 2024 From: mbaesken at openjdk.org (Matthias Baesken) Date: Mon, 4 Mar 2024 08:09:52 GMT Subject: RFR: JDK-8327071: [Testbug] g-tests for cgroup leave files in /tmp on linux In-Reply-To: References: Message-ID: On Fri, 1 Mar 2024 11:00:01 GMT, Arno Zeller wrote: > After each run of the g-tests for cgroups on Linux there are three new files left in /tmp. > > cgroups-test-jdk.pid.cgroupTest.SubSystemFileLineContentsSingleLine > cgroups-test-jdk.pid.cgroupTest.SubSystemFileLineContentsMultipleLinesSuccessCases > cgroups-test-jdk.pid.cgroupTest.SubSystemFileLineContentsMultipleLinesErrorCases looks like a couple of other gtests use the tmp dir too grep -nH -r get_temp_directory * logging/logTestFixture.cpp:40: os::get_temp_directory(), os::file_separator(), os::current_process_id(), logging/logTestUtils.inline.hpp:93:static const char* tmp_dir = os::get_temp_directory(); logging/test_logFileOutput.cpp:173: ss.print("%s%s%s%d", os::get_temp_directory(), os::file_separator(), "tmplogdir", os::current_process_id()); os/linux/test_cgroupSubsystem_linux.cpp:45: path.print_raw(os::get_temp_directory()); but seems the others at least clean up the created files as far as I can see. 
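For readers following the cleanup discussion: the files accumulate because each run creates them and nothing deletes them. One way a test can guarantee cleanup is a small RAII guard, sketched below. This is an illustration only and not the change made in this PR; `ScopedTestFile`, `file_exists`, and `cleanup_roundtrip` are hypothetical names, and the sketch uses only the C++ standard library rather than HotSpot's `os::` helpers.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical helper, not part of HotSpot: creates a file on construction
// and deletes it again on destruction.
class ScopedTestFile {
 public:
  explicit ScopedTestFile(const std::string& path) : _path(path) {
    std::ofstream out(_path);    // create (or truncate) the file
    out << "test content\n";     // stream is closed when 'out' goes out of scope
  }
  ~ScopedTestFile() {
    std::remove(_path.c_str());  // best-effort cleanup; the result is ignored
  }
  ScopedTestFile(const ScopedTestFile&) = delete;
  ScopedTestFile& operator=(const ScopedTestFile&) = delete;

  const std::string& path() const { return _path; }

 private:
  std::string _path;
};

// Returns true if the file can be opened for reading.
inline bool file_exists(const std::string& path) {
  std::ifstream in(path);
  return in.good();
}

// Returns true if the file exists while the guard is alive and is gone
// once the guard has been destroyed.
inline bool cleanup_roundtrip(const std::string& path) {
  bool existed_inside = false;
  {
    ScopedTestFile f(path);
    existed_inside = file_exists(f.path());
  }
  return existed_inside && !file_exists(path);
}
```

A test body would construct the guard with its own path, for example one built from the temp directory and the process id as in the grep output above, and the file is removed when the guard goes out of scope, including on an early return.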
------------- PR Comment: https://git.openjdk.org/jdk/pull/18077#issuecomment-1975945801 From azafari at openjdk.org Mon Mar 4 08:20:55 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Mon, 4 Mar 2024 08:20:55 GMT Subject: RFR: 8324829: Uniform use of synchronizations in NMT In-Reply-To: <4Gk2RIvpPrEvCJjp9iv4REP2kKFka8Tkcz4irVpoK2Y=.b2c9a1e2-ae2f-40e6-823b-f7d616c42931@github.com> References: <4Gk2RIvpPrEvCJjp9iv4REP2kKFka8Tkcz4irVpoK2Y=.b2c9a1e2-ae2f-40e6-823b-f7d616c42931@github.com> Message-ID: <2XT8n07lkMBu_H2BjdAQiP29PxeYjvgCyP2i50v-1sw=.a65e09ad-7fd4-4c4d-8711-094c7f8ca277@github.com> On Fri, 1 Mar 2024 14:56:35 GMT, Thomas Stuefe wrote: >> In NMT, both `ThreadCritical` and `Tracker` objects are used to do synchronizations. `Tracker` class uses `ThreadCritical` internally and used only for uncommit and release memory operations. In this change, `Tracker` class is replaced with explicit use of `ThreadCritical` to be the same as the other instances of sync'ing NMT operations. >> >> tiers1-5 tests passed. > > Sure. Ship it. Thank you @tstuefe and @jdksjolen for your comments. Thank you @kevinrushforth and @vnkozlov for your catch and help on fixing the issue number. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18000#issuecomment-1975969142 From azafari at openjdk.org Mon Mar 4 08:20:55 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Mon, 4 Mar 2024 08:20:55 GMT Subject: Integrated: 8324829: Uniform use of synchronizations in NMT In-Reply-To: References: Message-ID: On Mon, 26 Feb 2024 07:41:45 GMT, Afshin Zafari wrote: > In NMT, both `ThreadCritical` and `Tracker` objects are used to do synchronizations. `Tracker` class uses `ThreadCritical` internally and used only for uncommit and release memory operations. In this change, `Tracker` class is replaced with explicit use of `ThreadCritical` to be the same as the other instances of sync'ing NMT operations. > > tiers1-5 tests passed. This pull request has now been integrated. 
Changeset: 7c71f188 Author: Afshin Zafari URL: https://git.openjdk.org/jdk/commit/7c71f188a3a98e47ee363c532bd75937e69869a7 Stats: 73 lines in 7 files changed: 16 ins; 40 del; 17 mod 8324829: Uniform use of synchronizations in NMT Reviewed-by: stuefe, jsjolen ------------- PR: https://git.openjdk.org/jdk/pull/18000 From shade at openjdk.org Mon Mar 4 08:30:43 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 4 Mar 2024 08:30:43 GMT Subject: RFR: 8327125: SpinYield.report should report microseconds In-Reply-To: <_Knh7nkXBGOfvvN9d7Mdgp3Q6zZE-ph2QBLdH5ayqU8=.97374b43-605e-4fdb-8726-8b55810e0646@github.com> References: <_Knh7nkXBGOfvvN9d7Mdgp3Q6zZE-ph2QBLdH5ayqU8=.97374b43-605e-4fdb-8726-8b55810e0646@github.com> Message-ID: On Sun, 3 Mar 2024 20:29:11 GMT, Elif Aslan wrote: > Changes to report microseconds instead of milliseconds. > GHA tested and also ran tier 2 tests. Looks good and trivial. ------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18099#pullrequestreview-1913663447 From stuefe at openjdk.org Mon Mar 4 08:31:54 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Mon, 4 Mar 2024 08:31:54 GMT Subject: RFR: JDK-8327071: [Testbug] g-tests for cgroup leave files in /tmp on linux In-Reply-To: References: Message-ID: On Fri, 1 Mar 2024 11:00:01 GMT, Arno Zeller wrote: > After each run of the g-tests for cgroups on Linux there are three new files left in /tmp. > > cgroups-test-jdk.pid.cgroupTest.SubSystemFileLineContentsSingleLine > cgroups-test-jdk.pid.cgroupTest.SubSystemFileLineContentsMultipleLinesSuccessCases > cgroups-test-jdk.pid.cgroupTest.SubSystemFileLineContentsMultipleLinesErrorCases Marked as reviewed by stuefe (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/18077#pullrequestreview-1913667336 From stuefe at openjdk.org Mon Mar 4 08:31:54 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Mon, 4 Mar 2024 08:31:54 GMT Subject: RFR: JDK-8327071: [Testbug] g-tests for cgroup leave files in /tmp on linux In-Reply-To: References: Message-ID: <7zj-PUPTlLO6sM1mxp0ScKb01xriUYM3CfDOcSI7XSw=.4b3285bf-04ed-4a9f-b532-718b6856be21@github.com> On Mon, 4 Mar 2024 08:07:14 GMT, Matthias Baesken wrote: > looks like a couple of other gtests use the tmp dir too > > ``` > grep -nH -r get_temp_directory * > logging/logTestFixture.cpp:40: os::get_temp_directory(), os::file_separator(), os::current_process_id(), > logging/logTestUtils.inline.hpp:93:static const char* tmp_dir = os::get_temp_directory(); > logging/test_logFileOutput.cpp:173: ss.print("%s%s%s%d", os::get_temp_directory(), os::file_separator(), "tmplogdir", os::current_process_id()); > os/linux/test_cgroupSubsystem_linux.cpp:45: path.print_raw(os::get_temp_directory()); > ``` > > but seems the others at least clean up the created files as far as I can see. Okay, if it is established practice, let's do it here too. When in Rome... What I like most about the current-directory solution is the ability to keep files with -retain, for post-test analysis. But if we change this, we should change it for all tests, and that is beyond the scope of this RFR. Cheers, Thomas ------------- PR Comment: https://git.openjdk.org/jdk/pull/18077#issuecomment-1975994439 From azeller at openjdk.org Mon Mar 4 08:38:54 2024 From: azeller at openjdk.org (Arno Zeller) Date: Mon, 4 Mar 2024 08:38:54 GMT Subject: RFR: JDK-8327071: [Testbug] g-tests for cgroup leave files in /tmp on linux In-Reply-To: References: Message-ID: On Fri, 1 Mar 2024 11:00:01 GMT, Arno Zeller wrote: > After each run of the g-tests for cgroups on Linux there are three new files left in /tmp. 
> > cgroups-test-jdk.pid.cgroupTest.SubSystemFileLineContentsSingleLine > cgroups-test-jdk.pid.cgroupTest.SubSystemFileLineContentsMultipleLinesSuccessCases > cgroups-test-jdk.pid.cgroupTest.SubSystemFileLineContentsMultipleLinesErrorCases Thanks to all for reviewing! @MBaesken: Would you please sponsor it? ------------- PR Comment: https://git.openjdk.org/jdk/pull/18077#issuecomment-1976008923 From azeller at openjdk.org Mon Mar 4 08:43:55 2024 From: azeller at openjdk.org (Arno Zeller) Date: Mon, 4 Mar 2024 08:43:55 GMT Subject: Integrated: JDK-8327071: [Testbug] g-tests for cgroup leave files in /tmp on linux In-Reply-To: References: Message-ID: On Fri, 1 Mar 2024 11:00:01 GMT, Arno Zeller wrote: > After each run of the g-tests for cgroups on Linux there are three new files left in /tmp. > > cgroups-test-jdk.pid.cgroupTest.SubSystemFileLineContentsSingleLine > cgroups-test-jdk.pid.cgroupTest.SubSystemFileLineContentsMultipleLinesSuccessCases > cgroups-test-jdk.pid.cgroupTest.SubSystemFileLineContentsMultipleLinesErrorCases This pull request has now been integrated. Changeset: e889b460 Author: Arno Zeller Committer: Matthias Baesken URL: https://git.openjdk.org/jdk/commit/e889b460c03b3887ec5477fa734c430d3c3a41c8 Stats: 10 lines in 1 file changed: 9 ins; 0 del; 1 mod 8327071: [Testbug] g-tests for cgroup leave files in /tmp on linux Reviewed-by: mbaesken, gli, stuefe ------------- PR: https://git.openjdk.org/jdk/pull/18077 From kbarrett at openjdk.org Mon Mar 4 08:46:12 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 4 Mar 2024 08:46:12 GMT Subject: RFR: 8327173: HotSpot Style Guide needs update regarding nullptr vs NULL Message-ID: <0bjOXsg3qgtMkCB8yupUrv_8T4ooW5I-c4J74M1WZpM=.cd47c740-e9b7-4bcc-b047-25e0e2025884@github.com> Please review this change to update the HotSpot Style Guide's discussion of nullptr and its use. I suggest this is an editorial rather than substantive change to the style guide. 
As such, the normal HotSpot PR process can be used for this change. ------------- Commit messages: - update nullptr usage Changes: https://git.openjdk.org/jdk/pull/18101/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18101&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8327173 Stats: 19 lines in 2 files changed: 10 ins; 0 del; 9 mod Patch: https://git.openjdk.org/jdk/pull/18101.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18101/head:pull/18101 PR: https://git.openjdk.org/jdk/pull/18101 From kbarrett at openjdk.org Mon Mar 4 08:46:12 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 4 Mar 2024 08:46:12 GMT Subject: RFR: 8327173: HotSpot Style Guide needs update regarding nullptr vs NULL In-Reply-To: <0bjOXsg3qgtMkCB8yupUrv_8T4ooW5I-c4J74M1WZpM=.cd47c740-e9b7-4bcc-b047-25e0e2025884@github.com> References: <0bjOXsg3qgtMkCB8yupUrv_8T4ooW5I-c4J74M1WZpM=.cd47c740-e9b7-4bcc-b047-25e0e2025884@github.com> Message-ID: On Mon, 4 Mar 2024 08:38:09 GMT, Kim Barrett wrote: > Please review this change to update the HotSpot Style Guide's discussion of > nullptr and its use. > > I suggest this is an editorial rather than substantive change to the style > guide. As such, the normal HotSpot PR process can be used for this change. doc/hotspot-style.md line 730: > 728: Use `nullptr` > 729: ([n2431](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2431.pdf)) > 730: rather than `NULL`. Don't use (constant expression or literal) 0 for pointers. Should there be any discussion here of why we want to avoid `NULL`? Specifically the varying definitions and the pitfalls of `NULL` in varargs context. Also template and overload issues. 
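The overload and template pitfalls mentioned here can be sketched in a few lines. The example below is an illustration only, not text proposed for the style guide; `which`, `is_null`, and `forward_is_null` are hypothetical names. It shows that a literal `0` selects an integer overload while `nullptr` selects the pointer overload, and that `nullptr` keeps working when passed through a template.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Two overloads that differ only in taking an integer vs. a pointer.
inline int which(int)         { return 1; }
inline int which(const char*) { return 2; }

// A function that really wants a pointer.
inline bool is_null(const void* p) { return p == nullptr; }

// Generic wrapper: T is deduced from the argument, so the argument's own
// type decides whether the inner call compiles.
template <typename T>
bool forward_is_null(T arg) { return is_null(arg); }

// nullptr has its own type, distinct from every integer type.
static_assert(std::is_same<decltype(nullptr), std::nullptr_t>::value,
              "nullptr is a std::nullptr_t");
static_assert(!std::is_integral<std::nullptr_t>::value,
              "std::nullptr_t is not an integer type");

// which(0)       selects the int overload (exact match on int).
// which(nullptr) selects the pointer overload.
// which(NULL)    depends on how the platform defines NULL: it may select the
//                int overload or be rejected as ambiguous.
// forward_is_null(0) does not compile: T is deduced as int, and a
// non-constant int does not convert to const void*.
```

The varargs pitfall is deliberately not shown: if `NULL` expands to a plain `int` zero and the callee reads a pointer, the behavior is undefined, so there is no portable way to demonstrate it.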
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18101#discussion_r1510773801 From gcao at openjdk.org Mon Mar 4 08:59:54 2024 From: gcao at openjdk.org (Gui Cao) Date: Mon, 4 Mar 2024 08:59:54 GMT Subject: RFR: 8319900: Recursive lightweight locking: riscv64 implementation [v4] In-Reply-To: <0FqyJDzVzUmzIMVTWE5_QLPVt-QknkZlKGxKwiOcH5E=.d915ec2a-75ba-4832-b9e2-a812f87ca73c@github.com> References: <0FqyJDzVzUmzIMVTWE5_QLPVt-QknkZlKGxKwiOcH5E=.d915ec2a-75ba-4832-b9e2-a812f87ca73c@github.com> Message-ID: On Mon, 26 Feb 2024 05:10:55 GMT, Fei Yang wrote: >> Gui Cao has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains seven additional commits since the last revision: >> >> - Merge remote-tracking branch 'upstream/master' into JDK-8319900 >> - Merge remote-tracking branch 'upstream/master' into JDK-8319900 >> - fix comment typo >> - Merge branch 'master' into JDK-8319900 >> - improve code >> - Polish code comment >> - 8319900: Recursive lightweight locking: riscv64 implementation > > This looks pretty good. Thanks. @RealFYang : Thanks for your review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17554#issuecomment-1976047709 From sroy at openjdk.org Mon Mar 4 09:06:03 2024 From: sroy at openjdk.org (Suchismith Roy) Date: Mon, 4 Mar 2024 09:06:03 GMT Subject: RFR: JDK-8324682 Remove redefinition of NULL for XLC compiler [v3] In-Reply-To: References: Message-ID: > Remove redefinition of NULL for XLC compiler > In globalDefinitions_xlc.hpp there is a redefinition of NULL to work around an issue with one of the AIX headers (). > Once all uses of NULL in HotSpot have been replaced with nullptr, there is no longer any need for this, and it can be removed. 
> > > JBS ISSUE : [JDK-8324682](https://bugs.openjdk.org/browse/JDK-8324682) Suchismith Roy has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: remove AIX check for NULL ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18064/files - new: https://git.openjdk.org/jdk/pull/18064/files/eb8faf93..21fd7a2c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18064&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18064&range=01-02 Stats: 226 lines in 1 file changed: 226 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18064.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18064/head:pull/18064 PR: https://git.openjdk.org/jdk/pull/18064 From galder at openjdk.org Mon Mar 4 09:09:03 2024 From: galder at openjdk.org (Galder =?UTF-8?B?WmFtYXJyZcOxbw==?=) Date: Mon, 4 Mar 2024 09:09:03 GMT Subject: RFR: 8139457: Relax alignment of array elements [v69] In-Reply-To: References: Message-ID: <04QTriAjxYzeREOBd1Kka9Rq2yZpYZTweZ7qMkpHjfg=.4b4a0181-13d6-4e11-a384-cfe2e05ebe72@github.com> On Thu, 22 Feb 2024 16:08:33 GMT, Roman Kennke wrote: >> See [JDK-8139457](https://bugs.openjdk.org/browse/JDK-8139457) for details. >> >> Basically, when running with -XX:-UseCompressedClassPointers, arrays will have a gap between the length field and the first array element, because array elements will only start at word-aligned offsets. This is not necessary for smaller-than-word elements. >> >> Also, while it is not very important now, it will become very important with Lilliput, which eliminates the Klass field and would always put the length field at offset 8, and leave a gap between offset 12 and 16. 
>> >> Testing: >> - [x] runtime/FieldLayout/ArrayBaseOffsets.java (x86_64, x86_32, aarch64, arm, riscv, s390) >> - [x] bootcycle (x86_64, x86_32, aarch64, arm, riscv, s390) >> - [x] tier1 (x86_64, x86_32, aarch64, riscv) >> - [x] tier2 (x86_64, aarch64, riscv) >> - [x] tier3 (x86_64, riscv) > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Improve comment src/hotspot/cpu/aarch64/c1_MacroAssembler_aarch64.hpp line 100: > 98: // len : array length in number of elements > 99: // t : scratch register - contents destroyed > 100: // header_size: size of object header in words @rkennke Should this be updated to `base_offset_in_bytes` too? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1510812835 From galder at openjdk.org Mon Mar 4 09:09:03 2024 From: galder at openjdk.org (Galder Zamarreño) Date: Mon, 4 Mar 2024 09:09:03 GMT Subject: RFR: 8139457: Relax alignment of array elements [v69] In-Reply-To: <04QTriAjxYzeREOBd1Kka9Rq2yZpYZTweZ7qMkpHjfg=.4b4a0181-13d6-4e11-a384-cfe2e05ebe72@github.com> References: <04QTriAjxYzeREOBd1Kka9Rq2yZpYZTweZ7qMkpHjfg=.4b4a0181-13d6-4e11-a384-cfe2e05ebe72@github.com> Message-ID: On Mon, 4 Mar 2024 09:04:34 GMT, Galder Zamarreño wrote: >> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: >> >> Improve comment > > src/hotspot/cpu/aarch64/c1_MacroAssembler_aarch64.hpp line 100: > >> 98: // len : array length in number of elements >> 99: // t : scratch register - contents destroyed >> 100: // header_size: size of object header in words > > @rkennke Should this be updated to `base_offset_in_bytes` too? And description updated?
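The gap that this PR removes can be worked through with a little arithmetic. The following is a sketch under stated assumptions, not HotSpot code: `align_up` and `first_element_offset` are hypothetical helpers, the length field is taken to be 4 bytes, and the 16-byte header (8-byte mark word plus 8-byte uncompressed class pointer) is an assumed 64-bit layout. The offset-8 case is the Lilliput layout mentioned in the PR description.

```cpp
#include <cassert>
#include <cstddef>

// Round value up to the next multiple of alignment (alignment > 0).
constexpr std::size_t align_up(std::size_t value, std::size_t alignment) {
  return (value + alignment - 1) / alignment * alignment;
}

// Offset of element 0, given where the 4-byte length field starts and the
// alignment required for the first element. Both inputs are assumptions
// chosen for illustration.
constexpr std::size_t first_element_offset(std::size_t length_offset,
                                           std::size_t alignment) {
  return align_up(length_offset + 4, alignment);
}

// Assumed layout without compressed class pointers: length field at offset 16.
static_assert(first_element_offset(16, 8) == 24, "word-aligned: 4-byte gap");
static_assert(first_element_offset(16, 4) == 20, "int-aligned: no gap");

// Lilliput-style layout from the description: length field at offset 8.
static_assert(first_element_offset(8, 8) == 16, "word-aligned: gap at 12..16");
static_assert(first_element_offset(8, 4) == 12, "int-aligned: no gap");
```

In both layouts an 8-byte element type still needs the word-aligned offset, so only smaller element types benefit. A base offset such as 12 or 20 is not a whole number of 8-byte words, which may be why the review asks about naming the parameter in bytes.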
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1510813503 From galder at openjdk.org Mon Mar 4 09:09:03 2024 From: galder at openjdk.org (Galder Zamarreño) Date: Mon, 4 Mar 2024 09:09:03 GMT Subject: RFR: 8139457: Relax alignment of array elements [v69] In-Reply-To: References: <04QTriAjxYzeREOBd1Kka9Rq2yZpYZTweZ7qMkpHjfg=.4b4a0181-13d6-4e11-a384-cfe2e05ebe72@github.com> Message-ID: <0uHUk2SuaffnE1qPZbX27sRU1c3IgXRTG6RKdHbrvac=.9294bf2f-fee9-4a62-9d48-ea882c481982@github.com> On Mon, 4 Mar 2024 09:04:57 GMT, Galder Zamarreño wrote: >> src/hotspot/cpu/aarch64/c1_MacroAssembler_aarch64.hpp line 100: >> >>> 98: // len : array length in number of elements >>> 99: // t : scratch register - contents destroyed >>> 100: // header_size: size of object header in words >> >> @rkennke Should this be updated to `base_offset_in_bytes` too? > > And description updated? Same for x86. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1510816287 From sspitsyn at openjdk.org Mon Mar 4 09:53:55 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Mon, 4 Mar 2024 09:53:55 GMT Subject: RFR: 8256314: JVM TI GetCurrentContendedMonitor is implemented incorrectly [v4] In-Reply-To: References: <-Yf2Qj_kbyZAA-LDK96IIOcHWMh8__QT1XfnD9jR8kU=.507d5503-1544-44fb-9e81-438ba3458dcd@github.com> Message-ID: On Mon, 4 Mar 2024 07:06:31 GMT, David Holmes wrote: >> Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains four additional commits since the last revision: >> >> - Merge >> - Merge >> - fix specs: JDWP ThreadReference.CurrentContendedMonitor command, JDI ThreadReference.currentContendedMonitor method >> - 8256314: JVM TI GetCurrentContendedMonitor is implemented incorrectly > > src/hotspot/share/prims/jvmtiEnvBase.cpp line 941: > >> 939: bool is_virtual = java_lang_VirtualThread::is_instance(thread_oop); >> 940: jint state = is_virtual ? JvmtiEnvBase::get_vthread_state(thread_oop, java_thread) >> 941: : JvmtiEnvBase::get_thread_state(thread_oop, java_thread); > > I've seen this in a couple of your PRs now. It would be nice to have a utility function to get the JVMTI thread state of any java.lang.Thread instance. Good suggestion, thanks. Will add. > src/java.se/share/data/jdwp/jdwp.spec line 1985: > >> 1983: "thread may be waiting to enter the object's monitor, or in " >> 1984: "java.lang.Object.wait waiting to re-enter the monitor after being " >> 1985: "notified, interrupted, or timeout." > > `timed-out` would be the correct sense here. Thank you, David. I was thinking about it but was not sure. Will update it. > src/jdk.jdi/share/classes/com/sun/jdi/ThreadReference.java line 314: > >> 312: * synchronized method, the synchronized statement, or >> 313: * {@link Object#wait} waiting to re-enter the monitor >> 314: * after being notified, interrupted, or timeout. > > Again `timed-out`. Will fix, thanks. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17944#discussion_r1510890499 PR Review Comment: https://git.openjdk.org/jdk/pull/17944#discussion_r1510887118 PR Review Comment: https://git.openjdk.org/jdk/pull/17944#discussion_r1510887956 From shade at openjdk.org Mon Mar 4 09:54:52 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 4 Mar 2024 09:54:52 GMT Subject: RFR: 8327173: HotSpot Style Guide needs update regarding nullptr vs NULL In-Reply-To: <0bjOXsg3qgtMkCB8yupUrv_8T4ooW5I-c4J74M1WZpM=.cd47c740-e9b7-4bcc-b047-25e0e2025884@github.com> References: <0bjOXsg3qgtMkCB8yupUrv_8T4ooW5I-c4J74M1WZpM=.cd47c740-e9b7-4bcc-b047-25e0e2025884@github.com> Message-ID: On Mon, 4 Mar 2024 08:38:09 GMT, Kim Barrett wrote: > Please review this change to update the HotSpot Style Guide's discussion of > nullptr and its use. > > I suggest this is an editorial rather than substantive change to the style > guide. As such, the normal HotSpot PR process can be used for this change. Looks okay, but questions/suggestions: doc/hotspot-style.md line 738: > 736: expressions with value zero. C++14 replaced that with integer literals with > 737: value zero. Some compilers continue to treat a zero-valued integral constant > 738: expressions as a _null pointer constant_ even in C++14 mode. It is not very clear why should this concern any Hotspot developers? 
------------- PR Review: https://git.openjdk.org/jdk/pull/18101#pullrequestreview-1913874746 PR Review Comment: https://git.openjdk.org/jdk/pull/18101#discussion_r1510891459 From shade at openjdk.org Mon Mar 4 09:54:53 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 4 Mar 2024 09:54:53 GMT Subject: RFR: 8327173: HotSpot Style Guide needs update regarding nullptr vs NULL In-Reply-To: References: <0bjOXsg3qgtMkCB8yupUrv_8T4ooW5I-c4J74M1WZpM=.cd47c740-e9b7-4bcc-b047-25e0e2025884@github.com> Message-ID: On Mon, 4 Mar 2024 08:41:46 GMT, Kim Barrett wrote: >> Please review this change to update the HotSpot Style Guide's discussion of >> nullptr and its use. >> >> I suggest this is an editorial rather than substantive change to the style >> guide. As such, the normal HotSpot PR process can be used for this change. > > doc/hotspot-style.md line 730: > >> 728: Use `nullptr` >> 729: ([n2431](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2431.pdf)) >> 730: rather than `NULL`. Don't use (constant expression or literal) 0 for pointers. > > Should there be any discussion here of why we want to avoid `NULL`? Specifically the varying > definitions and the pitfalls of `NULL` in varargs context. Also template and overload issues. I think it would be enough to write 1..2 sentences about this, and then defer to N2431 already linked here for more details. 
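As a minimal sketch of the overload and template pitfalls mentioned above (the `describe` and `deduced_as_pointer` names are hypothetical, chosen only for illustration, and are not taken from HotSpot or from N2431), a literal `0` and `nullptr` can select different functions:

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical overload set, for illustration only.
std::string describe(int)   { return "int"; }
std::string describe(void*) { return "pointer"; }

// A literal 0 is an exact match for the int overload.
std::string with_literal_zero() { return describe(0); }

// nullptr has type std::nullptr_t, which converts to void* but not to int.
std::string with_nullptr() { return describe(nullptr); }

// Template deduction: 0 deduces T = int, nullptr deduces T = std::nullptr_t.
template <typename T>
bool deduced_as_pointer(T) {
  return std::is_pointer<T>::value ||
         std::is_same<T, std::nullptr_t>::value;
}
```

`NULL` is left out of the calls on purpose: its definition varies by compiler, so `describe(NULL)` may pick the `int` overload, be ambiguous, or only produce a warning. In a varargs call, a `NULL` defined as plain `0` may be passed as an `int` where the callee reads a pointer, whereas a `nullptr` argument is converted to `void*`.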
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18101#discussion_r1510890143 From sspitsyn at openjdk.org Mon Mar 4 09:58:56 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Mon, 4 Mar 2024 09:58:56 GMT Subject: RFR: 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage [v21] In-Reply-To: References: Message-ID: On Mon, 4 Mar 2024 03:07:58 GMT, David Holmes wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> rename after merge: jvmti_common.h to jvmti_common.hpp > > src/hotspot/share/prims/jvmtiEnvBase.cpp line 1489: > >> 1487: assert(mon != nullptr, "must have monitor"); >> 1488: // this object has a heavyweight monitor >> 1489: nWant = mon->contentions(); // # of threads contending for monitor > > Please clarify comment as > > // # of threads contending for monitor entry, but not re-entry > > (Fixed typos - sorry) Will fix, thanks. > src/hotspot/share/prims/jvmtiEnvBase.cpp line 1490: > >> 1488: // this object has a heavyweight monitor >> 1489: nWant = mon->contentions(); // # of threads contending for monitor >> 1490: nWait = mon->waiters(); // # of threads in Object.wait() > > Please clarify the comment as > > // # of threads waiting for notification, or to re-enter monitor, in Object.wait() Good catch, thanks. Will fix. > src/hotspot/share/prims/jvmtiEnvBase.cpp line 1491: > >> 1489: nWant = mon->contentions(); // # of threads contending for monitor >> 1490: nWait = mon->waiters(); // # of threads in Object.wait() >> 1491: wantList = Threads::get_pending_threads(tlh.list(), nWant + nWait, (address)mon); > > Please add a comment > > // Get the actual set of threads trying to enter, or re-enter, the monitor. Will add, thanks. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1510895824 PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1510894066 PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1510896692 From sspitsyn at openjdk.org Mon Mar 4 10:05:47 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Mon, 4 Mar 2024 10:05:47 GMT Subject: RFR: 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage [v21] In-Reply-To: References: Message-ID: On Mon, 4 Mar 2024 03:20:41 GMT, David Holmes wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> rename after merge: jvmti_common.h to jvmti_common.hpp > > src/hotspot/share/prims/jvmtiEnvBase.cpp line 1505: > >> 1503: // the monitor threads after being notified. Here we are correcting the actual >> 1504: // number of the waiting threads by excluding those re-entering the monitor. >> 1505: nWait = i; > > Sorry I can't follow the logic here. Why is the whole block not simply: > > if (mon != nullptr) { > // The nWait count may include threads trying to re-enter the monitor, so > // walk the actual set of waiters to count those actually waiting. > int count = 0; > for (ObjectMonitor * waiter = mon->first_waiter(); waiter != nullptr; waiter = mon->next_waiter(waiter)) { > count++; > } > } > ret.notify_waiter_count = count; Thank you for the suggestion. Let me check this. > src/hotspot/share/prims/jvmtiEnvBase.cpp line 1552: > >> 1550: // If the thread was found on the ObjectWaiter list, then >> 1551: // it has not been notified. This thread can't change the >> 1552: // state of the monitor so it doesn't need to be suspended. > > Not sure why suspension is being mentioned here. This is the original comment. I also do not quite understand it. It is confusing for sure. I can remove it if Dan does not object. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1510906350 PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1510904664 From sspitsyn at openjdk.org Mon Mar 4 10:12:01 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Mon, 4 Mar 2024 10:12:01 GMT Subject: RFR: 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage [v21] In-Reply-To: References: Message-ID: On Mon, 4 Mar 2024 03:41:55 GMT, David Holmes wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> rename after merge: jvmti_common.h to jvmti_common.hpp > > test/hotspot/jtreg/serviceability/jvmti/ObjectMonitorUsage/ObjectMonitorUsage.java line 132: > >> 130: // count of threads waiting to re-enter: 0 >> 131: // count of threads waiting to be notified: NUMBER_OF_WAITING_THREADS >> 132: check(lockCheck, null, 0, // no owner thread > > AFAICS you are racing here, as the waiting threads can set `ready` but not yet have called `wait()`. To know for sure they are in `wait()` you need to acquire the monitor before checking (and release it again so that you test the no-owner scenario as intended). But this should probably be done inside `startWaitingThreads`. Thank you for the comment. Will implement similar approach as in the `startEnteringThreads()`. > test/hotspot/jtreg/serviceability/jvmti/ObjectMonitorUsage/ObjectMonitorUsage.java line 319: > >> 317: try { >> 318: ready = true; >> 319: lockCheck.wait(); > > Spurious wakeups will break this test. They shouldn't happen but ... Thanks, will check how this can be fixed. 
> test/hotspot/jtreg/serviceability/jvmti/ObjectMonitorUsage/libObjectMonitorUsage.cpp line 65: > >> 63: >> 64: >> 65: jint Agent_Initialize(JavaVM *jvm, char *options, void *reserved) { > > Nit: there's a double space after jint Will fix, thanks, > test/hotspot/jtreg/serviceability/jvmti/ObjectMonitorUsage/libObjectMonitorUsage.cpp line 219: > >> 217: } >> 218: >> 219: } // exnern "C" > > typo: exnern -> extern Will fix, thanks. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1510912362 PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1510913702 PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1510909857 PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1510909512 From sspitsyn at openjdk.org Mon Mar 4 10:20:56 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Mon, 4 Mar 2024 10:20:56 GMT Subject: RFR: 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage [v21] In-Reply-To: References: Message-ID: On Mon, 4 Mar 2024 03:29:01 GMT, David Holmes wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> rename after merge: jvmti_common.h to jvmti_common.hpp > > test/hotspot/jtreg/serviceability/jvmti/ObjectMonitorUsage/ObjectMonitorUsage.java line 42: > >> 40: * from N to 0 waitings to be notified >> 41: * - all the above scenarios are executed with platform and virtual threads >> 42: * @requires vm.continuations > > Do we really require this? Yes, can be removed. Thanks. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1510925088 From fyang at openjdk.org Mon Mar 4 10:24:53 2024 From: fyang at openjdk.org (Fei Yang) Date: Mon, 4 Mar 2024 10:24:53 GMT Subject: RFR: 8320646: RISC-V: C2 VectorCastHF2F [v3] In-Reply-To: References: Message-ID: <3beY-tiq5c4FkyFlwWJa2JD1N9VjSTvZjxTZTVii0kE=.0c0f2671-4083-4eb0-bae1-dc2db97cd60a@github.com> On Wed, 28 Feb 2024 14:19:06 GMT, Hamlin Li wrote: >> Hi, >> Can you review this patch to add intrinsics for VectorCastHF2F/VectorCastF2HF? >> Thanks! >> >> ## Test >> >> test/jdk/java/lang/Float/Binary16ConversionNaN.java >> test/jdk/java/lang/Float/Binary16Conversion.java >> >> hotspot/jtreg/compiler/intrinsics/float16 >> hotspot/jtreg/compiler/vectorization/TestFloatConversionsVector.java >> hotspot/jtreg/compiler/vectorization/TestFloatConversionsVectorNaN.java > > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > make UseZvfh experimatal; clean code src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 2037: > 2035: __ mv(t0, 26); > 2036: // preserve the sign bit. > 2037: __ vnsra_wx(tmp, src, t0, Assembler::v0_t); Will this simpler version do here? 
`__ vnsra_wi(tmp, src, 26, Assembler::v0_t);` ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17698#discussion_r1510930483 From sspitsyn at openjdk.org Mon Mar 4 10:38:55 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Mon, 4 Mar 2024 10:38:55 GMT Subject: RFR: 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage [v21] In-Reply-To: References: Message-ID: On Mon, 4 Mar 2024 03:26:58 GMT, David Holmes wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> rename after merge: jvmti_common.h to jvmti_common.hpp > > test/hotspot/jtreg/serviceability/jvmti/ObjectMonitorUsage/ObjectMonitorUsage.java line 31: > >> 29: * DESCRIPTION >> 30: * The test checks if the JVMTI function GetObjectMonitorUsage returns >> 31: * the expected values for the owner, entry_count, water_count > > s/water_count/waiter_count/ Thanks, fixed now. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1510947433 From sspitsyn at openjdk.org Mon Mar 4 11:01:46 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Mon, 4 Mar 2024 11:01:46 GMT Subject: RFR: 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage [v21] In-Reply-To: References: Message-ID: On Mon, 4 Mar 2024 03:28:28 GMT, David Holmes wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> rename after merge: jvmti_common.h to jvmti_common.hpp > > test/hotspot/jtreg/serviceability/jvmti/ObjectMonitorUsage/ObjectMonitorUsage.java line 40: > >> 38: * - owned object with N waitings to be notified >> 39: * - owned object with N waitings to enter, from 0 to N waitings to re-enter, >> 40: * from N to 0 waitings to be notified > > "waitings" is not grammatically valid. Suggestion: s/waitings/threads waiting/ Thanks, fixed now. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1510976777 From mli at openjdk.org Mon Mar 4 11:55:55 2024 From: mli at openjdk.org (Hamlin Li) Date: Mon, 4 Mar 2024 11:55:55 GMT Subject: RFR: 8320646: RISC-V: C2 VectorCastHF2F [v3] In-Reply-To: References: Message-ID: On Mon, 4 Mar 2024 07:50:53 GMT, Fei Yang wrote: >> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: >> >> make UseZvfh experimatal; clean code > > src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1979: > >> 1977: >> 1978: // widen and sign-extend src data. >> 1979: __ vwadd_vx(dst, src, zr, Assembler::v0_t); > > Can we use vector integer extension instruction here instead of `vwadd_vx`? Like this one from the RVV spec: > `vsext.vf2 vd, vs2, vm # Sign-extend SEW/2 source to SEW destination`. > This seems more readable and we can do the lines L1980-l1981 on entry, which has the benefit of not relying on the caller's vsetvli for operations here like `vwadd_vx`. Yes, modified. Thanks! > src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1995: > >> 1993: auto stub = C2CodeStub::make >> 1994: (dst, src, length, 24, float16_to_float_v_slow_path); >> 1995: > > I think we should add one assertion here asserting that `dst` and `src` are differenct vector registers. we could, but it's not necessary, as in `riscv_v.ad` it already has: instruct vconvHF2F(vReg dst, vReg src, vRegMask_V0 v0) %{ ... effect(TEMP_DEF dst, TEMP v0); > src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 2003: > >> 2001: // i.e. non-NaN and non-Inf cases. >> 2002: >> 2003: vsetvli_helper(BasicType::T_SHORT, length, Assembler::mf2); > > Will this `vsetvli` setting work when we check for NaNs in L2005-L2010? The `Assembler::mf2` param doesn't seem correct to me. Should we use `Assembler::m1` here instead? I think it will work. And please note that we set it to `m1` in `float16_to_float_v_slow_path` in case of any NaN. 
Or maybe I miss something? > src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 2037: > >> 2035: __ mv(t0, 26); >> 2036: // preserve the sign bit. >> 2037: __ vnsra_wx(tmp, src, t0, Assembler::v0_t); > > Will this more simpler version do here? `__ vnsra_wi(tmp, src, 26, Assembler::v0_t);` Seems not, as by vector spec, `vector-immediate | imm[4:0]` and `The value is sign-extended to SEW bits`, so maximum of imm should be 16-1. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17698#discussion_r1511037729 PR Review Comment: https://git.openjdk.org/jdk/pull/17698#discussion_r1511037873 PR Review Comment: https://git.openjdk.org/jdk/pull/17698#discussion_r1511037578 PR Review Comment: https://git.openjdk.org/jdk/pull/17698#discussion_r1511038094 From mli at openjdk.org Mon Mar 4 12:01:27 2024 From: mli at openjdk.org (Hamlin Li) Date: Mon, 4 Mar 2024 12:01:27 GMT Subject: RFR: 8320646: RISC-V: C2 VectorCastHF2F [v4] In-Reply-To: References: Message-ID: <5NLcNb1LAJ6HVNHRSbdWALWxoz6kGwX3VqeaQ0mpnas=.ed8c236f-0e4b-4f25-9fac-639151299b21@github.com> > Hi, > Can you review this patch to add instrinsics for VectorCastHF2F/VectorCastF2HF? > Thanks! 
> > ## Test > > test/jdk/java/lang/Float/Binary16ConversionNaN.java > test/jdk/java/lang/Float/Binary16Conversion.java > > hotspot/jtreg/compiler/intrinsics/float16 > hotspot/jtreg/compiler/vectorization/TestFloatConversionsVectorjava > hotspot/jtreg/compiler/vectorization/TestFloatConversionsVectorNaN.java Hamlin Li has updated the pull request incrementally with three additional commits since the last revision: - refine code - refine code according to review comments - move code ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17698/files - new: https://git.openjdk.org/jdk/pull/17698/files/0dce1573..08dce191 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17698&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17698&range=02-03 Stats: 196 lines in 1 file changed: 96 ins; 93 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/17698.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17698/head:pull/17698 PR: https://git.openjdk.org/jdk/pull/17698 From fbredberg at openjdk.org Mon Mar 4 12:39:12 2024 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Mon, 4 Mar 2024 12:39:12 GMT Subject: RFR: 8314508: Improve how relativized pointers are printed by frame::describe Message-ID: The output from frame::describe has been improved so that it now prints the actual derelativized pointer value, regardless if it's on the stack or on the heap. It also clearly shows which members of the fixed stackframe are relativized. 
See sample output of a riscv64 stackframe below: [6.693s][trace][continuations] 0x000000409e14cc88: 0x00000000eeceaf58 locals for #10 [6.693s][trace][continuations] local 0 [6.693s][trace][continuations] oop for #10 [6.693s][trace][continuations] 0x000000409e14cc80: 0x00000000eec34548 local 1 [6.693s][trace][continuations] oop for #10 [6.693s][trace][continuations] 0x000000409e14cc78: 0x00000000eeceac58 local 2 [6.693s][trace][continuations] oop for #10 [6.693s][trace][continuations] 0x000000409e14cc70: 0x00000000eee03eb0 #10 method java.lang.invoke.LambdaForm$DMH/0x0000000050001000.invokeVirtual(Ljava/lang/Object;Ljava/lang/Object;)V @ 10 [6.693s][trace][continuations] - 3 locals 3 max stack [6.693s][trace][continuations] - codelet: return entry points [6.693s][trace][continuations] saved fp [6.693s][trace][continuations] 0x000000409e14cc68: 0x00000040137dc790 return address [6.693s][trace][continuations] 0x000000409e14cc60: 0x000000409e14ccf0 [6.693s][trace][continuations] 0x000000409e14cc58: 0x000000409e14cc60 interpreter_frame_sender_sp [6.693s][trace][continuations] 0x000000409e14cc50: 0x000000409e14cc00 interpreter_frame_last_sp (relativized: fp-14) [6.693s][trace][continuations] 0x000000409e14cc48: 0x0000004050401a88 interpreter_frame_method [6.693s][trace][continuations] 0x000000409e14cc40: 0x0000000000000000 interpreter_frame_mdp [6.693s][trace][continuations] 0x000000409e14cc38: 0x000000409e14cbe0 interpreter_frame_extended_sp (relativized: fp-18) [6.693s][trace][continuations] 0x000000409e14cc30: 0x00000000ef093110 interpreter_frame_mirror [6.693s][trace][continuations] oop for #10 [6.693s][trace][continuations] 0x000000409e14cc28: 0x0000004050401c68 interpreter_frame_cache [6.693s][trace][continuations] 0x000000409e14cc20: 0x000000409e14cc88 interpreter_frame_locals (relativized: fp+3) [6.693s][trace][continuations] 0x000000409e14cc18: 0x0000004050401a7a interpreter_frame_bcp [6.693s][trace][continuations] 0x000000409e14cc10: 0x000000409e14cc10 
interpreter_frame_initial_sp (relativized: fp-12) [6.693s][trace][continuations] 0x000000409e14cc08: 0x00000000eec34548 [6.693s][trace][continuations] 0x000000409e14cc00: 0x0000000000000000 locals for #9 [6.693s][trace][continuations] local 0 [6.693s][trace][continuations] oop for #9 [6.693s][trace][continuations] 0x000000409e14cbf8: 0x00000000eecf1370 local 1 [6.693s][trace][continuations] oop for #9 [6.693s][trace][continuations] 0x000000409e14cbf0: 0x00000000eecf1380 local 2 [6.693s][trace][continuations] oop for #9 [6.693s][trace][continuations] 0x000000409e14cbe8: 0x0000000000000000 local 3 [6.693s][trace][continuations] 0x000000409e14cbe0: 0x0000000000000000 local 4 [6.693s][trace][continuations] unextended_sp for #10 The output was created by running `test/jdk/jdk/internal/vm/Continuation/Basic.java` with `-Xlog:continuations=trace:frbr_log.txt`. Output has been reviewed for x64, aarch64, ppc64le and riscv64. The changes also pass tier1-tier5. ------------- Commit messages: - 8314508: Improve how relativized pointers are printed by frame::describe Changes: https://git.openjdk.org/jdk/pull/18102/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18102&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8314508 Stats: 10 lines in 1 file changed: 6 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/18102.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18102/head:pull/18102 PR: https://git.openjdk.org/jdk/pull/18102 From mdoerr at openjdk.org Mon Mar 4 16:04:01 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 4 Mar 2024 16:04:01 GMT Subject: RFR: 8319901: Recursive lightweight locking: ppc64le implementation [v2] In-Reply-To: References: <3LduAyBU7SzwJawH73jO5cAjVr5I0HJzs8y_jO9r3N4=.42388751-25b0-43fe-97c3-c4f95334bcfc@github.com> Message-ID: On Mon, 4 Mar 2024 07:15:09 GMT, Axel Boldt-Christmas wrote: >> Draft implementation of recursive lightweight JDK-8319796 for ppc64le. 
>> >> Porters feel free to discard this and implement something from scratch, take this initial draft and run with it, or send patches to me and if you want me to drive the PR. >> >> Some notes: >> ppc64le deviates from the other ports, it shares it C2 locking code with the native call wrapper. This draft unifies the behavior across the ports by using the C1/interpreter locking for native call wrapper. If it is important to have a fast path for the inflated monitor case for native call wrapper, it could be done without having it share its implementation with C2. >> It would also be possible to change this behavior on all ports (share the C2 code), as I believe the C2 assumptions always hold for the native call wrapper, the monitor exit and monitor enter will be balanced and structured. > > Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: > > Apply suggestions from code review > > Co-authored-by: Richard Reingruber Thanks! All tests are green. Let's ship it. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16611#issuecomment-1976915091 From aboldtch at openjdk.org Mon Mar 4 16:04:01 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Mon, 4 Mar 2024 16:04:01 GMT Subject: Integrated: 8319901: Recursive lightweight locking: ppc64le implementation In-Reply-To: <3LduAyBU7SzwJawH73jO5cAjVr5I0HJzs8y_jO9r3N4=.42388751-25b0-43fe-97c3-c4f95334bcfc@github.com> References: <3LduAyBU7SzwJawH73jO5cAjVr5I0HJzs8y_jO9r3N4=.42388751-25b0-43fe-97c3-c4f95334bcfc@github.com> Message-ID: On Fri, 10 Nov 2023 13:21:58 GMT, Axel Boldt-Christmas wrote: > Draft implementation of recursive lightweight JDK-8319796 for ppc64le. > > Porters feel free to discard this and implement something from scratch, take this initial draft and run with it, or send patches to me and if you want me to drive the PR. 
> > Some notes: > ppc64le deviates from the other ports, it shares it C2 locking code with the native call wrapper. This draft unifies the behavior across the ports by using the C1/interpreter locking for native call wrapper. If it is important to have a fast path for the inflated monitor case for native call wrapper, it could be done without having it share its implementation with C2. > It would also be possible to change this behavior on all ports (share the C2 code), as I believe the C2 assumptions always hold for the native call wrapper, the monitor exit and monitor enter will be balanced and structured. This pull request has now been integrated. Changeset: b5cd7efc Author: Axel Boldt-Christmas Committer: Martin Doerr URL: https://git.openjdk.org/jdk/commit/b5cd7efcebe0daaf8a85f0f32b65a3bd446674ef Stats: 509 lines in 9 files changed: 409 ins; 65 del; 35 mod 8319901: Recursive lightweight locking: ppc64le implementation Co-authored-by: Martin Doerr Reviewed-by: mdoerr, rrich ------------- PR: https://git.openjdk.org/jdk/pull/16611 From dchuyko at openjdk.org Mon Mar 4 17:36:19 2024 From: dchuyko at openjdk.org (Dmitry Chuyko) Date: Mon, 4 Mar 2024 17:36:19 GMT Subject: RFR: 8309271: A way to align already compiled methods with compiler directives [v28] In-Reply-To: References: Message-ID: > Compiler Control (https://openjdk.org/jeps/165) provides method-context dependent control of the JVM compilers (C1 and C2). The active directive stack is built from the directive files passed with the `-XX:CompilerDirectivesFile` diagnostic command-line option and the Compiler.add_directives diagnostic command. It is also possible to clear all directives or remove the top from the stack. > > A matching directive will be applied at method compilation time when such compilation is started. If directives are added or changed, but compilation does not start, then the state of compiled methods doesn't correspond to the rules. 
This is not an error, and it happens in long-running applications when directives are added or removed after compilation of methods that could be matched. For example, the user decides that C2 compilation needs to be disabled for some method due to a compiler bug, issues such a directive but this does not affect the application behavior. In such a case, the target application needs to be restarted, and such an operation can have high costs and risks. Another goal is testing/debugging compilers. > > It would be convenient to optionally reconcile at least existing matching nmethods to the current stack of compiler directives (so bypass inlined methods). > > A natural way to eliminate the discrepancy between the result of compilation and the broken rule is to discard the compilation result, i.e. deoptimization. Prior to that we can try to re-compile the method, letting the compile broker perform it taking the new directives stack into account. Re-compilation helps to prevent hot methods from execution in the interpreter. > > A new flag `-r` has been introduced for some directives related to compile commands: `Compiler.add_directives`, `Compiler.remove_directives`, `Compiler.clear_directives`. The default behavior has not changed (no flag). If the new flag is present, the command scans already compiled methods and puts methods that have any active non-default matching compiler directives to re-compilation if possible, otherwise marks them for deoptimization. There is currently no distinction which directives are found. In particular, this means that if there are rules for inlining into some method, it will be refreshed. On the other hand, if there are rules for a method and it was inlined, top-level methods won't be refreshed, but this can be achieved by having rules for them. > > In addition, a new diagnostic command `Compiler.replace_directives`, has been added for ... Dmitry Chuyko has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 46 commits: - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Deopt osr, cleanups - ... and 36 more: https://git.openjdk.org/jdk/compare/59529a92...a4578277 ------------- Changes: https://git.openjdk.org/jdk/pull/14111/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14111&range=27 Stats: 381 lines in 15 files changed: 348 ins; 3 del; 30 mod Patch: https://git.openjdk.org/jdk/pull/14111.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14111/head:pull/14111 PR: https://git.openjdk.org/jdk/pull/14111 From duke at openjdk.org Mon Mar 4 17:53:57 2024 From: duke at openjdk.org (Elif Aslan) Date: Mon, 4 Mar 2024 17:53:57 GMT Subject: Integrated: 8327125: SpinYield.report should report microseconds In-Reply-To: <_Knh7nkXBGOfvvN9d7Mdgp3Q6zZE-ph2QBLdH5ayqU8=.97374b43-605e-4fdb-8726-8b55810e0646@github.com> References: <_Knh7nkXBGOfvvN9d7Mdgp3Q6zZE-ph2QBLdH5ayqU8=.97374b43-605e-4fdb-8726-8b55810e0646@github.com> Message-ID: <06tokXeeALe7wF4NsJo_5w0S2UYMxJQiHk8RsLUaPrk=.7bcfa880-83c8-4dcb-abcb-7f86d0936366@github.com> On Sun, 3 Mar 2024 20:29:11 GMT, Elif Aslan wrote: > Changes to report microseconds instead of milliseconds. > GHA tested and also ran tier 2 tests. This pull request has now been integrated. 
Changeset: 8cfacebd Author: Elif Aslan Committer: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/8cfacebd06da3a45d119b5378ce0c2dd591d2442 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod 8327125: SpinYield.report should report microseconds Reviewed-by: gli, dholmes, shade ------------- PR: https://git.openjdk.org/jdk/pull/18099 From kvn at openjdk.org Mon Mar 4 18:03:53 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 4 Mar 2024 18:03:53 GMT Subject: RFR: 8327173: HotSpot Style Guide needs update regarding nullptr vs NULL In-Reply-To: References: <0bjOXsg3qgtMkCB8yupUrv_8T4ooW5I-c4J74M1WZpM=.cd47c740-e9b7-4bcc-b047-25e0e2025884@github.com> Message-ID: On Mon, 4 Mar 2024 09:51:16 GMT, Aleksey Shipilev wrote: >> doc/hotspot-style.md line 730: >> >>> 728: Use `nullptr` >>> 729: ([n2431](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2431.pdf)) >>> 730: rather than `NULL`. Don't use (constant expression or literal) 0 for pointers. >> >> Should there be any discussion here of why we want to avoid `NULL`? Specifically the varying >> definitions and the pitfalls of `NULL` in varargs context. Also template and overload issues. > > I think it would be enough to write 1..2 sentences about this, and then defer to N2431 already linked here for more details. I agree with Aleksey. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18101#discussion_r1511561200 From sviswanathan at openjdk.org Mon Mar 4 20:24:56 2024 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Mon, 4 Mar 2024 20:24:56 GMT Subject: RFR: 8325991: Accelerate Poly1305 on x86_64 using AVX2 instructions [v10] In-Reply-To: References: <9_AS9fKLjyVQzybg1915BX3xVVFwJRKjsqghjZ99hr0=.13a55d69-fc86-4109-a3be-934d1b8634d0@github.com> Message-ID: On Fri, 1 Mar 2024 06:09:30 GMT, Jatin Bhateja wrote: >> Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: >> >> Update description of Poly1305 algo > > src/hotspot/cpu/x86/assembler_x86.cpp line 9115: > >> 9113: >> 9114: void Assembler::vpunpcklqdq(XMMRegister dst, XMMRegister nds, XMMRegister src, int vector_len) { >> 9115: assert(UseAVX > 0, "requires some form of AVX"); > > Add appropriate AVX512VL assertion The VL assertion is already being done as part of vex_prefix_and_encode() and vex_prefix() so no need to add it here. That's why we don't have this assertion in any of the AVX instructions which are promotable to EVEX e.g. vpadd, vpsub, etc. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17881#discussion_r1511747533 From hohensee at amazon.com Mon Mar 4 21:15:19 2024 From: hohensee at amazon.com (Hohensee, Paul) Date: Mon, 4 Mar 2024 21:15:19 +0000 Subject: Re: CFV: New HotSpot Group Member: Johan Sjölén Message-ID: Vote: yes On 2/22/24, 4:15 AM, "hotspot-dev on behalf of Jesper Wilhelmsson" <jesper.wilhelmsson at oracle.com> wrote: I hereby nominate Johan Sjölén (jsjolen) to Membership in the HotSpot Group. Johan is a Reviewer in the JDK project, and a member of the Oracle JVM Runtime team. Johan has done major work in the Unified Logging framework and is today the de facto owner of that area. 
He has done major cleanups and bug fixes throughout the JVM, and does a lot of work in the Native Memory Tracking area. Votes are due by Thursday March 7. Only current Members of the HotSpot Group [1] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. For Lazy Consensus voting instructions, see [2]. Thanks, /Jesper [1] https://openjdk.org/census [2] https://openjdk.org/groups/#member-vote From duke at openjdk.org Mon Mar 4 21:40:04 2024 From: duke at openjdk.org (Srinivas Vamsi Parasa) Date: Mon, 4 Mar 2024 21:40:04 GMT Subject: RFR: 8325991: Accelerate Poly1305 on x86_64 using AVX2 instructions [v11] In-Reply-To: References: Message-ID: > The goal of this PR is to accelerate the Poly1305 algorithm using AVX2 instructions (including IFMA) for x86_64 CPUs. > > This implementation is directly based on the AVX2 Poly1305 hash computation as implemented in Intel(R) Multi-Buffer Crypto for IPsec Library (url: https://github.com/intel/intel-ipsec-mb/blob/main/lib/avx2_t3/poly_fma_avx2.asm) > > This PR shows up to 19x speedup on buffer sizes of 1MB. 
Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: update asserts for vpmadd52l/hq ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17881/files - new: https://git.openjdk.org/jdk/pull/17881/files/b869d874..4a74a773 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17881&range=10 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17881&range=09-10 Stats: 8 lines in 1 file changed: 4 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/17881.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17881/head:pull/17881 PR: https://git.openjdk.org/jdk/pull/17881 From duke at openjdk.org Mon Mar 4 21:40:05 2024 From: duke at openjdk.org (Srinivas Vamsi Parasa) Date: Mon, 4 Mar 2024 21:40:05 GMT Subject: RFR: 8325991: Accelerate Poly1305 on x86_64 using AVX2 instructions [v10] In-Reply-To: References: <9_AS9fKLjyVQzybg1915BX3xVVFwJRKjsqghjZ99hr0=.13a55d69-fc86-4109-a3be-934d1b8634d0@github.com> Message-ID: <_6dorzq67KAZsTBHBvbQRDi_xW70bFhJudnxbG88m6I=.33e06bd5-d5fc-4ba8-b740-437155d567cf@github.com> On Fri, 1 Mar 2024 17:02:35 GMT, Sandhya Viswanathan wrote: >> src/hotspot/cpu/x86/assembler_x86.cpp line 5148: >> >>> 5146: assert(vector_len == AVX_512bit ? VM_Version::supports_avx512ifma() : VM_Version::supports_avxifma(), ""); >>> 5147: InstructionMark im(this); >>> 5148: InstructionAttr attributes(vector_len, /* rex_w */ true, /* legacy_mode */ false, /* no_mask_reg */ false, /* uses_vl */ true); >> >> uses_vl should be false here. >> >> BTW, this assertion looks very fuzzy, you are checking for two target features in one instruction, apparently, instruction is meant to use AVX512_IFMA only for 512 bit vector length, and for narrower vectors its needs AVX_IFMA. 
>> >> Lets either keep this strictly for AVX_IFMA for AVX512_IFMA we already have evpmadd52[l/h]uq, if you truly want to make this generic one then split the assertion >> >> `assert ( (avx_ifma && vector_len <= 256) || (avx512_ifma && (vector_len == 512 || VM_Version::support_vl())); >> ` >> >> And then you may pass uses_vl at true. > > It would be good to make this instruction generic. Please see the updated assert as suggested for vpmadd52[l/h]uq in the latest commit. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17881#discussion_r1511830438 From duke at openjdk.org Mon Mar 4 21:40:05 2024 From: duke at openjdk.org (Srinivas Vamsi Parasa) Date: Mon, 4 Mar 2024 21:40:05 GMT Subject: RFR: 8325991: Accelerate Poly1305 on x86_64 using AVX2 instructions [v10] In-Reply-To: References: <9_AS9fKLjyVQzybg1915BX3xVVFwJRKjsqghjZ99hr0=.13a55d69-fc86-4109-a3be-934d1b8634d0@github.com> Message-ID: On Fri, 1 Mar 2024 08:16:38 GMT, Jatin Bhateja wrote: >> Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: >> >> Update description of Poly1305 algo > > src/hotspot/cpu/x86/assembler_x86.cpp line 5157: > >> 5155: void Assembler::vpmadd52luq(XMMRegister dst, XMMRegister src1, XMMRegister src2, int vector_len) { >> 5156: assert(vector_len == AVX_512bit ? VM_Version::supports_avx512ifma() : VM_Version::supports_avxifma(), ""); >> 5157: InstructionAttr attributes(vector_len, /* rex_w */ true, /* legacy_mode */ false, /* no_mask_reg */ false, /* uses_vl */ true); > > uses_vl should be false. Please see the updated assert as suggested for vpmadd52[l/h]uq in the latest commit. > src/hotspot/cpu/x86/assembler_x86.cpp line 5183: > >> 5181: assert(vector_len == AVX_512bit ? 
VM_Version::supports_avx512ifma() : VM_Version::supports_avxifma(), ""); >> 5182: InstructionMark im(this); >> 5183: InstructionAttr attributes(vector_len, /* rex_w */ true, /* legacy_mode */ false, /* no_mask_reg */ false, /* uses_vl */ true); > > uses_vl should be false. Please see the updated assert as suggested for vpmadd52[l/h]uq in the latest commit. > src/hotspot/cpu/x86/assembler_x86.cpp line 5191: > >> 5189: >> 5190: void Assembler::vpmadd52huq(XMMRegister dst, XMMRegister src1, XMMRegister src2, int vector_len) { >> 5191: assert(vector_len == AVX_512bit ? VM_Version::supports_avx512ifma() : VM_Version::supports_avxifma(), ""); > > Same as above. Please see the updated assert as suggested for vpmadd52[l/h]uq in the latest commit. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17881#discussion_r1511830567 PR Review Comment: https://git.openjdk.org/jdk/pull/17881#discussion_r1511830720 PR Review Comment: https://git.openjdk.org/jdk/pull/17881#discussion_r1511830942 From matsaave at openjdk.org Mon Mar 4 21:46:20 2024 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Mon, 4 Mar 2024 21:46:20 GMT Subject: RFR: 8327093: Add slice function to BitMap API Message-ID: The BitMap does not currently have a "slice" method which can return a subset of the bitmap defined by a start and end bit. This utility can be used to analyze portions of a bitmap or to overwrite the existing map with a shorter, modified map. This implementation has two core methods: `copy_of_range` allocates and returns a new word array as used internally by BitMap `truncate` overwrites the internal array with the one generated by `copy_of_range` so none of the internal workings are exposed to callers outside of BitMap. 
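As a rough illustration of the `copy_of_range` idea described above, here is a minimal standalone sketch. It is not the code from the pull request: the `Word` alias, `kBitsPerWord`, `bit_at`, and the use of `std::vector` are assumptions made only for this example, and HotSpot's `BitMap` uses its own word type and allocators.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a bitmap's word storage; not HotSpot's BitMap class.
using Word = std::uint64_t;
constexpr std::size_t kBitsPerWord = 64;

// Read bit `i` from a word array.
inline bool bit_at(const std::vector<Word>& words, std::size_t i) {
  return ((words[i / kBitsPerWord] >> (i % kBitsPerWord)) & Word{1}) != 0;
}

// Return a freshly allocated word array holding bits [start, end) of `src`,
// shifted so that bit `start` of the source becomes bit 0 of the result.
std::vector<Word> copy_of_range(const std::vector<Word>& src,
                                std::size_t start, std::size_t end) {
  assert(start <= end && end <= src.size() * kBitsPerWord);
  const std::size_t nbits = end - start;
  std::vector<Word> dst((nbits + kBitsPerWord - 1) / kBitsPerWord, 0);
  const std::size_t word_off = start / kBitsPerWord;
  const std::size_t bit_off = start % kBitsPerWord;
  for (std::size_t i = 0; i < dst.size(); ++i) {
    // Low part comes from the current source word, high part from the next one.
    Word lo = src[word_off + i] >> bit_off;
    Word hi = 0;
    if (bit_off != 0 && word_off + i + 1 < src.size()) {
      hi = src[word_off + i + 1] << (kBitsPerWord - bit_off);
    }
    dst[i] = lo | hi;
  }
  // Clear any bits beyond the requested range in the last word.
  const std::size_t tail = nbits % kBitsPerWord;
  if (tail != 0) {
    dst.back() &= (Word{1} << tail) - 1;
  }
  return dst;
}
```

A `truncate`-style operation could then replace the bitmap's own storage with the result, which keeps the word array hidden from callers; how the real patch manages allocation for CHeap and Arena bitmaps is not shown here.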
------------- Commit messages: - Adjusted assert - Added tests for CHeap and Arena - Improved test and corrected copy_of_range - 8327093: Add slice function to BitMap API Changes: https://git.openjdk.org/jdk/pull/18092/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18092&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8327093 Stats: 293 lines in 3 files changed: 291 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/18092.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18092/head:pull/18092 PR: https://git.openjdk.org/jdk/pull/18092 From jzhu at openjdk.org Tue Mar 5 00:15:34 2024 From: jzhu at openjdk.org (Joshua Zhu) Date: Tue, 5 Mar 2024 00:15:34 GMT Subject: RFR: 8326541: [AArch64] ZGC C2 load barrier stub considers the length of live registers when spilling registers In-Reply-To: References: Message-ID: On Wed, 28 Feb 2024 14:49:24 GMT, Stuart Monteith wrote: >> Currently ZGC C2 load barrier stub saves the whole live register regardless of what size of register is live on aarch64. >> Considering the size of SVE register is an implementation-defined multiple of 128 bits, up to 2048 bits, >> even the use of a floating point may cause the maximum 2048 bits stack occupied. >> Hence I would like to introduce this change on aarch64: take the length of live registers into consideration in ZGC C2 load barrier stub. >> >> In a floating point case on 2048 bits SVE machine, the following ZLoadBarrierStubC2 >> >> >> ...... 
>> 0x0000ffff684cfad8: stp x15, x18, [sp, #80] >> 0x0000ffff684cfadc: sub sp, sp, #0x100 >> 0x0000ffff684cfae0: str z16, [sp] >> 0x0000ffff684cfae4: add x1, x13, #0x10 >> 0x0000ffff684cfae8: mov x0, x16 >> ;; 0xFFFF803F5414 >> 0x0000ffff684cfaec: mov x8, #0x5414 // #21524 >> 0x0000ffff684cfaf0: movk x8, #0x803f, lsl #16 >> 0x0000ffff684cfaf4: movk x8, #0xffff, lsl #32 >> 0x0000ffff684cfaf8: blr x8 >> 0x0000ffff684cfafc: mov x16, x0 >> 0x0000ffff684cfb00: ldr z16, [sp] >> 0x0000ffff684cfb04: add sp, sp, #0x100 >> 0x0000ffff684cfb08: ptrue p7.b >> 0x0000ffff684cfb0c: ldp x4, x5, [sp, #16] >> ...... >> >> >> could be optimized into: >> >> >> ...... >> 0x0000ffff684cfa50: stp x15, x18, [sp, #80] >> 0x0000ffff684cfa54: str d16, [sp, #-16]! // extra 8 bytes to align 16 bytes in push_fp() >> 0x0000ffff684cfa58: add x1, x13, #0x10 >> 0x0000ffff684cfa5c: mov x0, x16 >> ;; 0xFFFF7FA942A8 >> 0x0000ffff684cfa60: mov x8, #0x42a8 // #17064 >> 0x0000ffff684cfa64: movk x8, #0x7fa9, lsl #16 >> 0x0000ffff684cfa68: movk x8, #0xffff, lsl #32 >> 0x0000ffff684cfa6c: blr x8 >> 0x0000ffff684cfa70: mov x16, x0 >> 0x0000ffff684cfa74: ldr d16, [sp], #16 >> 0x0000ffff684cfa78: ptrue p7.b >> 0x0000ffff684cfa7c: ldp x4, x5, [sp, #16] >> ...... >> >> >> Besides the above benefit, when we know what size of register is live, >> we could remove the unnecessary caller save in ZGC C2 load barrier stub when we meet C-ABI SOE fp registers. >> >> Passed jtreg with option "-XX:+UseZGC -XX:+ZGenerational" with no failures introduced. > > That is a welcome change - in light of possibly very large registers, we don't want to save more than is necessary. > It looks OK, but have you run a specific test that demonstrates it working? @stooart-mon Thanks for your review! > That is a welcome change - in light of possibly very large registers, we don't want to save more than is necessary. It looks OK, but have you run a specific test that demonstrates it working? Yes. 
I wrote several cases to test against this commit to ensure the quality. [TestZGCSpillingAtLoadBarrierStub.java](https://github.com/JoshuaZhuwj/openjdk_cases/blob/master/8326541/TestZGCSpillingAtLoadBarrierStub.java) Assembly and OptoAssembly outputs before the change: https://github.com/JoshuaZhuwj/openjdk_cases/blob/master/8326541/output_before_change.log Outputs after the change: https://github.com/JoshuaZhuwj/openjdk_cases/blob/master/8326541/output_after_change.log ------------- PR Comment: https://git.openjdk.org/jdk/pull/17977#issuecomment-1970924457 From sspitsyn at openjdk.org Tue Mar 5 01:48:48 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 5 Mar 2024 01:48:48 GMT Subject: RFR: 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage [v21] In-Reply-To: References: Message-ID: On Mon, 4 Mar 2024 10:02:03 GMT, Serguei Spitsyn wrote: >> src/hotspot/share/prims/jvmtiEnvBase.cpp line 1552: >> >>> 1550: // If the thread was found on the ObjectWaiter list, then >>> 1551: // it has not been notified. This thread can't change the >>> 1552: // state of the monitor so it doesn't need to be suspended. >> >> Not sure why suspension is being mentioned here. > > This is the original comment. I also do not quite understand it. It is confusing for sure. I can remove it if Dan does not object. Removed this statement from the comment. Will restore in a case of Dan's objection. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1512020517 From gcao at openjdk.org Tue Mar 5 02:06:49 2024 From: gcao at openjdk.org (Gui Cao) Date: Tue, 5 Mar 2024 02:06:49 GMT Subject: Integrated: 8319900: Recursive lightweight locking: riscv64 implementation In-Reply-To: References: Message-ID: On Wed, 24 Jan 2024 14:09:21 GMT, Gui Cao wrote: > Implementation of recursive lightweight JDK-8319796 for linux-riscv64. 
>
> ### Correctness testing:
>
> - [x] Run tier1-3, hotspot:tier4 tests with -XX:LockingMode=1 & -XX:LockingMode=2 on LicheePI 4A (release)
> - [x] Run tier1-3 tests with -XX:LockingMode=1 & -XX:LockingMode=2 on qemu 8.1.0 (fastdebug)
>
> ### JMH tested on LicheePI 4A:
>
> Before (LockingMode = 2):
>
> Benchmark                                (innerCount)  Mode  Cnt       Score      Error  Units
> LockUnlock.testContendedLock                      100  avgt   12     281.603 ±    2.626  ns/op
> LockUnlock.testRecursiveLockUnlock                100  avgt   12  196373.917 ± 4669.241  ns/op
> LockUnlock.testRecursiveSynchronization           100  avgt   12     274.492 ±    6.049  ns/op
> LockUnlock.testSerialLockUnlock                   100  avgt   12   11516.092 ±   69.964  ns/op
> LockUnlock.testSimpleLockUnlock                   100  avgt   12   11490.789 ±   58.833  ns/op
>
> After (LockingMode = 2):
>
> Benchmark                                (innerCount)  Mode  Cnt       Score      Error  Units
> LockUnlock.testContendedLock                      100  avgt   12     307.013 ±    3.259  ns/op
> LockUnlock.testRecursiveLockUnlock                100  avgt   12   59047.222 ±   94.748  ns/op
> LockUnlock.testRecursiveSynchronization           100  avgt   12     259.789 ±    1.698  ns/op
> LockUnlock.testSerialLockUnlock                   100  avgt   12    6055.775 ±   13.518  ns/op
> LockUnlock.testSimpleLockUnlock                   100  avgt   12    6090.986 ±   67.179  ns/op
>
> Before (LockingMode = 1):
>
> Benchmark                                (innerCount)  Mode  Cnt       Score      Error  Units
> LockUnlock.testContendedLock                      100  avgt   12     286.144 ±    2.490  ns/op
> LockUnlock.testRecursiveLockUnlock                100  avgt   12   68058.259 ±  286.262  ns/op
> LockUnlock.testRecursiveSynchronization           100  avgt   12     263.132 ±    4.075  ns/op
> LockUnlock.testSerialLockUnlock                   100  avgt   12    7521.777 ±   35.193  ns/op
> LockUnlock.testSimpleLockUnlock                   100  avgt   12    7522.480 ±   26.310  ns/op
>
> After (LockingMode = 1):
>
> Benchmark                                (innerCount)  Mode  Cnt       Score      Error  Units
> LockUnlock.testContendedLock                      100  avgt   12     289.034 ±    5.821  ns/op
> LockUnlock.testRecursiveLockUnlock                100  avgt   12   68474.370 ±  495.135  ns/op
> LockUnlock.testRecursiveSynchronization           100  avgt   12     261.052 ±    4.560  ns/op
> Loc...

This pull request has now been integrated.
Changeset: e1b661f8 Author: Gui Cao Committer: Fei Yang URL: https://git.openjdk.org/jdk/commit/e1b661f8c1df780cce28fe76d257b44e2fe44058 Stats: 613 lines in 9 files changed: 411 ins; 114 del; 88 mod 8319900: Recursive lightweight locking: riscv64 implementation Co-authored-by: Axel Boldt-Christmas Reviewed-by: fyang ------------- PR: https://git.openjdk.org/jdk/pull/17554 From fyang at openjdk.org Tue Mar 5 03:33:46 2024 From: fyang at openjdk.org (Fei Yang) Date: Tue, 5 Mar 2024 03:33:46 GMT Subject: RFR: 8320646: RISC-V: C2 VectorCastHF2F [v3] In-Reply-To: References: Message-ID: On Mon, 4 Mar 2024 11:52:54 GMT, Hamlin Li wrote: >> src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 2037: >> >>> 2035: __ mv(t0, 26); >>> 2036: // preserve the sign bit. >>> 2037: __ vnsra_wx(tmp, src, t0, Assembler::v0_t); >> >> Will this more simpler version do here? `__ vnsra_wi(tmp, src, 26, Assembler::v0_t);` > > Seems not, as by vector spec, `vector-immediate | imm[4:0]` and `The value is sign-extended to SEW bits`, so maximum of imm should be 16-1. Here is what I read from the RVV 1.0 spec [1]: The narrowing right shifts extract a smaller field from a wider operand and have both zero-extending (srl) and sign-extending (sra) forms. The shift amount can come from a vector register group, or a scalar x register, or a zero-extended 5-bit immediate. A zero-extended 5-bit immediate will have a maximum of 31. [1] https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#vector-narrowing-integer-right-shift-instructions ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17698#discussion_r1512091563 From sspitsyn at openjdk.org Tue Mar 5 03:40:04 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 5 Mar 2024 03:40:04 GMT Subject: RFR: 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage [v22] In-Reply-To: References: Message-ID: > The implementation of the JVM TI `GetObjectMonitorUsage` does not match the spec. 
> The function returns the following structure: > > > typedef struct { > jthread owner; > jint entry_count; > jint waiter_count; > jthread* waiters; > jint notify_waiter_count; > jthread* notify_waiters; > } jvmtiMonitorUsage; > > > The following four fields are defined this way: > > waiter_count [jint] The number of threads waiting to own this monitor > waiters [jthread*] The waiter_count waiting threads > notify_waiter_count [jint] The number of threads waiting to be notified by this monitor > notify_waiters [jthread*] The notify_waiter_count threads waiting to be notified > > The `waiters` has to include all threads waiting to enter the monitor or to re-enter it in `Object.wait()`. > The implementation also includes the threads waiting to be notified in `Object.wait()` which is wrong. > The `notify_waiters` has to include all threads waiting to be notified in `Object.wait()`. > The implementation also includes the threads waiting to re-enter the monitor in `Object.wait()` which is wrong. > This update makes it right. > > The implementation of the JDWP command `ObjectReference.MonitorInfo (5)` is based on the JVM TI `GetObjectMonitorInfo()`. This update has a tweak to keep the existing behavior of this command. > > The following JVMTI vmTestbase tests are fixed to adapt to the correct `GetObjectMonitorUsage()` behavior: > > jvmti/GetObjectMonitorUsage/objmonusage001 > jvmti/GetObjectMonitorUsage/objmonusage003 > > > The following JVMTI JCK tests have to be fixed to adapt to the correct behavior: > > vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00101/gomu00101.html > vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00101/gomu00101a.html > vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00102/gomu00102.html > vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00102/gomu00102a.html > > > > A JCK bug will be filed and the tests have to be added into the JCK problem list located in the closed repository.
> > Also, please see and review the related CSR: > [8324677](https://bugs.openjdk.org/browse/JDK-8324677): incorrect implementation of JVM TI GetObjectMonitorUsage > > The Release-Note is: > [8325314](https://bugs.openjdk.org/browse/JDK-8325314): Release Note: incorrect implementation of JVM TI GetObjectMonitorUsage > > Testing: > - tested with mach5 tiers 1-6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: addressed more comments on the fix and new test ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17680/files - new: https://git.openjdk.org/jdk/pull/17680/files/b449f04a..ccf9484f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17680&range=21 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17680&range=20-21 Stats: 69 lines in 3 files changed: 39 ins; 4 del; 26 mod Patch: https://git.openjdk.org/jdk/pull/17680.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17680/head:pull/17680 PR: https://git.openjdk.org/jdk/pull/17680 From fyang at openjdk.org Tue Mar 5 03:45:46 2024 From: fyang at openjdk.org (Fei Yang) Date: Tue, 5 Mar 2024 03:45:46 GMT Subject: RFR: 8320646: RISC-V: C2 VectorCastHF2F [v3] In-Reply-To: References: Message-ID: On Mon, 4 Mar 2024 11:52:40 GMT, Hamlin Li wrote: >> src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1995: >> >>> 1993: auto stub = C2CodeStub::make >>> 1994: (dst, src, length, 24, float16_to_float_v_slow_path); >>> 1995: >> >> I think we should add one assertion here asserting that `dst` and `src` are different vector registers. > > we could, but it's not necessary, as in `riscv_v.ad` it already has: > > instruct vconvHF2F(vReg dst, vReg src, vRegMask_V0 v0) %{ > ... > effect(TEMP_DEF dst, TEMP v0); We always do that in practice. An explicit assertion will help people better understand the code when they are solely looking at these assembler functions.
Otherwise they will have to check the caller side about the constraint on register usage. Also it could happen in the future that the same code might be reused or called from other newly-added code. It will help avoid errors on passing registers when people see such an assertion on entry. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17698#discussion_r1512096823 From sspitsyn at openjdk.org Tue Mar 5 04:17:57 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 5 Mar 2024 04:17:57 GMT Subject: RFR: 8256314: JVM TI GetCurrentContendedMonitor is implemented incorrectly [v5] In-Reply-To: <-Yf2Qj_kbyZAA-LDK96IIOcHWMh8__QT1XfnD9jR8kU=.507d5503-1544-44fb-9e81-438ba3458dcd@github.com> References: <-Yf2Qj_kbyZAA-LDK96IIOcHWMh8__QT1XfnD9jR8kU=.507d5503-1544-44fb-9e81-438ba3458dcd@github.com> Message-ID: > The implementation of the JVM TI `GetCurrentContendedMonitor()` does not match the spec. It can sometimes return an incorrect information about the contended monitor. Such a behavior does not match the function spec. > With this update the `GetCurrentContendedMonitor()` is returning the monitor only when the specified thread is waiting to enter or re-enter the monitor, and the monitor is not returned when the specified thread is waiting in the `java.lang.Object.wait` to be notified. > > The implementations of the JDWP `ThreadReference.CurrentContendedMonitor` command and JDI `ThreadReference.currentContendedMonitor()` method are based and depend on this JVMTI function. The JDWP command and the JDI method were specified incorrectly and had incorrect behavior. The fix slightly corrects both the JDWP and JDI specs. The JDWP and JDI implementations have been also fixed because they use this JVM TI update. Please, see and review the related CSR and Release-Note. 
> > CSR: [8326024](https://bugs.openjdk.org/browse/JDK-8326024): JVM TI GetCurrentContendedMonitor is implemented incorrectly > RN: [8326038](https://bugs.openjdk.org/browse/JDK-8326038): Release Note: JVM TI GetCurrentContendedMonitor is implemented incorrectly > > Testing: > - tested with the mach5 tiers 1-6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: replaced timed with timed-out in JDWP and JDI spec ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17944/files - new: https://git.openjdk.org/jdk/pull/17944/files/1a3d1c19..c30655a3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17944&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17944&range=03-04 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/17944.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17944/head:pull/17944 PR: https://git.openjdk.org/jdk/pull/17944 From sspitsyn at openjdk.org Tue Mar 5 04:17:58 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 5 Mar 2024 04:17:58 GMT Subject: RFR: 8256314: JVM TI GetCurrentContendedMonitor is implemented incorrectly [v4] In-Reply-To: References: <-Yf2Qj_kbyZAA-LDK96IIOcHWMh8__QT1XfnD9jR8kU=.507d5503-1544-44fb-9e81-438ba3458dcd@github.com> Message-ID: <3Y6fAWEDaxZrANXhwysEVXghraID8dLL5rA9-QlC_7M=.7e9d75d0-0a97-407f-9329-802416b53c0e@github.com> On Fri, 1 Mar 2024 11:19:21 GMT, Serguei Spitsyn wrote: >> The implementation of the JVM TI `GetCurrentContendedMonitor()` does not match the spec. It can sometimes return an incorrect information about the contended monitor. Such a behavior does not match the function spec. >> With this update the `GetCurrentContendedMonitor()` is returning the monitor only when the specified thread is waiting to enter or re-enter the monitor, and the monitor is not returned when the specified thread is waiting in the `java.lang.Object.wait` to be notified. 
>> >> The implementations of the JDWP `ThreadReference.CurrentContendedMonitor` command and JDI `ThreadReference.currentContendedMonitor()` method are based and depend on this JVMTI function. The JDWP command and the JDI method were specified incorrectly and had incorrect behavior. The fix slightly corrects both the JDWP and JDI specs. The JDWP and JDI implementations have been also fixed because they use this JVM TI update. Please, see and review the related CSR and Release-Note. >> >> CSR: [8326024](https://bugs.openjdk.org/browse/JDK-8326024): JVM TI GetCurrentContendedMonitor is implemented incorrectly >> RN: [8326038](https://bugs.openjdk.org/browse/JDK-8326038): Release Note: JVM TI GetCurrentContendedMonitor is implemented incorrectly >> >> Testing: >> - tested with the mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: > > - Merge > - Merge > - fix specs: JDWP ThreadReference.CurrentContendedMonitor command, JDI ThreadReference.currentContendedMonitor method > - 8256314: JVM TI GetCurrentContendedMonitor is implemented incorrectly David, thank you for review! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17944#issuecomment-1977937253 From sspitsyn at openjdk.org Tue Mar 5 04:17:59 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 5 Mar 2024 04:17:59 GMT Subject: RFR: 8256314: JVM TI GetCurrentContendedMonitor is implemented incorrectly [v4] In-Reply-To: References: <-Yf2Qj_kbyZAA-LDK96IIOcHWMh8__QT1XfnD9jR8kU=.507d5503-1544-44fb-9e81-438ba3458dcd@github.com> Message-ID: On Mon, 4 Mar 2024 09:49:07 GMT, Serguei Spitsyn wrote: >> src/java.se/share/data/jdwp/jdwp.spec line 1985: >> >>> 1983: "thread may be waiting to enter the object's monitor, or in " >>> 1984: "java.lang.Object.wait waiting to re-enter the monitor after being " >>> 1985: "notified, interrupted, or timeout." >> >> `timed-out` would be the correct sense here. > > Thank you, David. I was thinking about it but was not sure. Will update it. Fixed now. Also, updated CSR and RN. >> src/jdk.jdi/share/classes/com/sun/jdi/ThreadReference.java line 314: >> >>> 312: * synchronized method, the synchronized statement, or >>> 313: * {@link Object#wait} waiting to re-enter the monitor >>> 314: * after being notified, interrupted, or timeout. >> >> Again `timed-out`. > > Will fix, thanks. Fixed now. Also, updated CSR and RN. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17944#discussion_r1512112147 PR Review Comment: https://git.openjdk.org/jdk/pull/17944#discussion_r1512112210 From sspitsyn at openjdk.org Tue Mar 5 06:18:56 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 5 Mar 2024 06:18:56 GMT Subject: RFR: 8256314: JVM TI GetCurrentContendedMonitor is implemented incorrectly [v6] In-Reply-To: <-Yf2Qj_kbyZAA-LDK96IIOcHWMh8__QT1XfnD9jR8kU=.507d5503-1544-44fb-9e81-438ba3458dcd@github.com> References: <-Yf2Qj_kbyZAA-LDK96IIOcHWMh8__QT1XfnD9jR8kU=.507d5503-1544-44fb-9e81-438ba3458dcd@github.com> Message-ID: > The implementation of the JVM TI `GetCurrentContendedMonitor()` does not match the spec. 
It can sometimes return an incorrect information about the contended monitor. Such a behavior does not match the function spec. > With this update the `GetCurrentContendedMonitor()` is returning the monitor only when the specified thread is waiting to enter or re-enter the monitor, and the monitor is not returned when the specified thread is waiting in the `java.lang.Object.wait` to be notified. > > The implementations of the JDWP `ThreadReference.CurrentContendedMonitor` command and JDI `ThreadReference.currentContendedMonitor()` method are based and depend on this JVMTI function. The JDWP command and the JDI method were specified incorrectly and had incorrect behavior. The fix slightly corrects both the JDWP and JDI specs. The JDWP and JDI implementations have been also fixed because they use this JVM TI update. Please, see and review the related CSR and Release-Note. > > CSR: [8326024](https://bugs.openjdk.org/browse/JDK-8326024): JVM TI GetCurrentContendedMonitor is implemented incorrectly > RN: [8326038](https://bugs.openjdk.org/browse/JDK-8326038): Release Note: JVM TI GetCurrentContendedMonitor is implemented incorrectly > > Testing: > - tested with the mach5 tiers 1-6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: added new internal function JvmtiEnvBase::get_thread_or_vthread_state ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17944/files - new: https://git.openjdk.org/jdk/pull/17944/files/c30655a3..9b69af50 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17944&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17944&range=04-05 Stats: 24 lines in 3 files changed: 14 ins; 7 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/17944.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17944/head:pull/17944 PR: https://git.openjdk.org/jdk/pull/17944 From epeter at openjdk.org Tue Mar 5 06:37:59 2024 From: epeter at openjdk.org (Emanuel Peter) Date: 
Tue, 5 Mar 2024 06:37:59 GMT Subject: RFR: 8325095: C2: bailout message broken: ResourceArea allocated string used after free [v2] In-Reply-To: <5TgeG3ZAKwf86zhRI10h3kGIO7-Dl_HTiHQjFe33iP4=.d47f443f-14db-458e-a449-d6775f7ef2dd@github.com> References: <8Bl6JSxb-fd1xAsKOMzBakRFuruoqJykP7JEn57k6qY=.717f89af-2aa7-439b-afd1-768dd39f51da@github.com> <22-O7o2m6ZAJx3YmbAuY2dYcN47gTdeAFUu0m7SKy5Q=.7909c9ed-872f-4d95-9e78-d725f5deb002@github.com> <5TgeG3ZAKwf86zhRI10h3kGIO7-Dl_HTiHQjFe33iP4=.d47f443f-14db-458e-a449-d6775f7ef2dd@github.com> Message-ID: <9t3xVKKL5hI7wYHLaC8uCruftAN4Vtkaob4UM3OTKiI=.ac8f882b-cd58-4f12-8021-c98eab8a0c82@github.com> On Fri, 23 Feb 2024 18:31:44 GMT, Vladimir Kozlov wrote: >> @vnkozlov @tstuefe thanks for the comments and suggestions! >> >> What I did: >> - Moved `utilities/cHeapStringHolder.hpp/cpp` -> `compiler/cHeapStringHolder.hpp/cpp`. >> - Directly use `mtCompiler` for it, rather than constructor argument with `MEMFLAGS`. >> - Fixed 2 issues (bugs and code style) >> >> What I did not yet do: >> Any changes to logging. I think we want to do that in a separate RFE anyway, and not backport it together with this bugfix. >> Is there anything that you still need me to change for this bugfix here? >> >> About Logging, in a future RFE: >> @vnkozlov suggested to always UL failures immediately. The question is where I should do that. At `ciEnv::record_method_not_compilable` only? Or also in `Compile::record_method_not_compilable`? >> And @vnkozlov also suggested only to report one (a single) failure reason at the very end, presumably at the `CompileTask` level. Does that mean you would still overwrite the less important failures with more important ones in `ciEnv`, like we do currently? > >>About Logging, in a future RFE: > @vnkozlov suggested to always UL failures immediately. The question is where I should do that. At ciEnv::record_method_not_compilable only? Or also in Compile::record_method_not_compilable? 
> > `Compile::record_method_not_compilable()` calls `ciEnv::record_method_not_compilable()` so you need only add it to ciEnv. > >>And @vnkozlov also suggested only to report one (a single) failure reason at the very end, presumably at the CompileTask level. Does that mean you would still overwrite the less important failures with more important ones in ciEnv, like we do currently? > > I suggest to leave the normal report code as it is now (one failure). And if we need additional information we will use UL when we implement it. > > Most ciEnv failures are recorded (`record_failure`) during publishing compiled code in `ciEnv::register_method()` after compilation is done and it did not have failures. Thanks @vnkozlov @dlunde @tstuefe @dholmes-ora for reviewing / your suggestions! Thanks @jdksjolen for all the memory allocation conversations! ------------- PR Comment: https://git.openjdk.org/jdk/pull/17710#issuecomment-1978056891 From epeter at openjdk.org Tue Mar 5 06:37:59 2024 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 5 Mar 2024 06:37:59 GMT Subject: Integrated: 8325095: C2: bailout message broken: ResourceArea allocated string used after free In-Reply-To: <8Bl6JSxb-fd1xAsKOMzBakRFuruoqJykP7JEn57k6qY=.717f89af-2aa7-439b-afd1-768dd39f51da@github.com> References: <8Bl6JSxb-fd1xAsKOMzBakRFuruoqJykP7JEn57k6qY=.717f89af-2aa7-439b-afd1-768dd39f51da@github.com> Message-ID: <2IYxRdaw0y2QbYYKjTpWEJlC4wfOXtOBvoKHEseDBd8=.dec855f6-0714-40b3-83dc-4793b994da30@github.com> On Mon, 5 Feb 2024 16:02:57 GMT, Emanuel Peter wrote: > **Problem** > Bailout `failure_reason` strings used to be either `nullptr` or a static string. But with [JDK-8303951](https://bugs.openjdk.org/browse/JDK-8303951) I had introduced dynamic strings (e.g. formatted using `stringStream::as_string()`). These dynamic strings were ResourceArea allocated, and have a very limited life-span, i.e. within the most recent ResourceMark scope.
The pointer is passed to `Compile::_failure_reason`, which already might leave that ResourceMark scope, and the pointer is invalid. And then the pointer is further passed to `ciEnv::_failure_reason`. And finally, it even gets passed up to `CompileBroker::invoke_compiler_on_method`. When the string is finally printed for the bailout message, we already left the original `ResourceMark` scope, and also the `Compile` and `ciEnv` scopes. The pointer now points to basically random data. > > **Solution** > Whenever a string is passed to an outer scope, I make a CHeap copy, and the outer scope is the owner of that copy. > This way every scope is the owner of its own copy, and can allocate and deallocate according to its own strategy safely. > > I introduced a utility class `CHeapStringHolder`: > - `set`: make local copy on CHeap. > - `clear`: free local copy. We call it before `set`, and in the destructor. Thus, when the holder goes out of scope, the memory is automatically freed. > > We have these 4 scopes: > - `ResourceMark`: It allocates from `ResourceArea`, and deallocation is taken care of at the end of the ResourceMark scope. > - `Compile`: We turn the `_failure_reason` from a `char*` into a `CHeapStringHolder`. We set `_failure_reason` while the `ResourceMark` is live. > - `ciEnv`: We turn the `_failure_reason` from a `char*` into a `CHeapStringHolder`. We set `_failure_reason` while `Compile` is live. > - `CompileTask`: We used to just set `failure_reason`, which assumes that the string is static or elsewhere managed. Now I use a pattern available in other places of `CompileBroker::invoke_compiler_on_method`: duplicate the string with `os::strdup` from some other scope, and set `reason_on_C_heap = true`, which means that we are the owner of that copy, and are responsible for freeing it later. Not sure if that is a nice pattern, but I don't want to refactor that code and it does what I need it to do. This pull request has now been integrated.
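The ownership scheme in the description above can be sketched in standalone C++ roughly as follows. This is only an illustration under stated assumptions: the class name `StringHolder` and its `get` accessor are hypothetical, and plain `malloc`/`free` stand in for HotSpot's C-heap allocation with `mtCompiler`; it is not the actual `CHeapStringHolder` source.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical holder that owns a malloc'ed copy of a C string.
// Modeled on the set/clear behavior described above; not the HotSpot class.
class StringHolder {
 public:
  StringHolder() = default;
  StringHolder(const StringHolder&) = delete;             // single owner per copy
  StringHolder& operator=(const StringHolder&) = delete;
  ~StringHolder() { clear(); }                            // freed when the scope ends

  // Replace the held string with a private heap copy of `s` (or nothing if null).
  void set(const char* s) {
    char* copy = nullptr;
    if (s != nullptr) {
      const std::size_t len = std::strlen(s) + 1;
      copy = static_cast<char*>(std::malloc(len));
      if (copy == nullptr) std::abort();
      std::memcpy(copy, s, len);
    }
    clear();           // free the old copy only after duplicating, so set(get()) is safe
    _string = copy;
  }

  // Free the held copy, if any.
  void clear() {
    std::free(_string);
    _string = nullptr;
  }

  const char* get() const { return _string; }

 private:
  char* _string = nullptr;
};
```

With a holder like this in each outer scope, a short-lived buffer can be reused or released right after `set` returns, because the holder no longer points into it.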
Changeset: c5895558 Author: Emanuel Peter URL: https://git.openjdk.org/jdk/commit/c589555845e61cdde5340aaa76fcc36b2753240d Stats: 128 lines in 9 files changed: 109 ins; 5 del; 14 mod 8325095: C2: bailout message broken: ResourceArea allocated string used after free Reviewed-by: kvn, dlong ------------- PR: https://git.openjdk.org/jdk/pull/17710 From dholmes at openjdk.org Tue Mar 5 06:48:52 2024 From: dholmes at openjdk.org (David Holmes) Date: Tue, 5 Mar 2024 06:48:52 GMT Subject: RFR: 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage [v22] In-Reply-To: References: Message-ID: <4XAkV4CDJUTXFeKCCkAQlD_HntnLWwd_AqcTEMowS4k=.f5d2572f-3061-4a56-bf9d-e97bda57e3a0@github.com> On Tue, 5 Mar 2024 03:40:04 GMT, Serguei Spitsyn wrote: >> The implementation of the JVM TI `GetObjectMonitorUsage` does not match the spec. >> The function returns the following structure: >> >> >> typedef struct { >> jthread owner; >> jint entry_count; >> jint waiter_count; >> jthread* waiters; >> jint notify_waiter_count; >> jthread* notify_waiters; >> } jvmtiMonitorUsage; >> >> >> The following four fields are defined this way: >> >> waiter_count [jint] The number of threads waiting to own this monitor >> waiters [jthread*] The waiter_count waiting threads >> notify_waiter_count [jint] The number of threads waiting to be notified by this monitor >> notify_waiters [jthread*] The notify_waiter_count threads waiting to be notified >> >> The `waiters` has to include all threads waiting to enter the monitor or to re-enter it in `Object.wait()`. >> The implementation also includes the threads waiting to be notified in `Object.wait()` which is wrong. >> The `notify_waiters` has to include all threads waiting to be notified in `Object.wait()`. >> The implementation also includes the threads waiting to re-enter the monitor in `Object.wait()` which is wrong. >> This update makes it right. 
>> >> The implementation of the JDWP command `ObjectReference.MonitorInfo (5)` is based on the JVM TI `GetObjectMonitorInfo()`. This update has a tweak to keep the existing behavior of this command. >> >> The following JVMTI vmTestbase tests are fixed to adapt to the `GetObjectMonitorUsage()` correct behavior: >> >> jvmti/GetObjectMonitorUsage/objmonusage001 >> jvmti/GetObjectMonitorUsage/objmonusage003 >> >> >> The following JVMTI JCK tests have to be fixed to adapt to correct behavior: >> >> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00101/gomu00101.html >> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00101/gomu00101a.html >> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00102/gomu00102.html >> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00102/gomu00102a.html >> >> >> >> A JCK bug will be filed and the tests have to be added into the JCK problem list located in the closed repository. >> >> Also, please see and review the related CSR: >> [8324677](https://bugs.openjdk.org/browse/JDK-8324677): incorrect implementation of JVM TI GetObjectMonitorUsage >> >> The Release-Note is: >> [8325314](https://bugs.openjdk.org/browse/JDK-8325314): Release Note: incorrect implementation of JVM TI GetObjectMonitorUsage >> >> Testing: >> - tested with mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: addressed more comments on the fix and new test Updates look good. A couple of minor comments. src/hotspot/share/prims/jvmtiEnvBase.cpp line 1507: > 1505: nWait = 0; > 1506: for (ObjectWaiter* waiter = mon->first_waiter(); > 1507: waiter != nullptr && (nWait == 0 || waiter != mon->first_waiter()); Sorry I do not understand the logic of this line. `waiters` is just a linked-list of `ObjectWaiter`s so all we need to do is traverse it and count the number of elements.
test/hotspot/jtreg/serviceability/jvmti/ObjectMonitorUsage/ObjectMonitorUsage.java line 36: > 34: * - unowned object without any waiting threads > 35: * - unowned object with threads waiting to be notified > 36: * - owned object without any waiting threads You could say "threads waiting" in all cases - it looks odd to reverse it for two cases. ------------- PR Review: https://git.openjdk.org/jdk/pull/17680#pullrequestreview-1915963487 PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1512195706 PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1512197846 From kbarrett at openjdk.org Tue Mar 5 07:12:09 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 5 Mar 2024 07:12:09 GMT Subject: RFR: 8327173: HotSpot Style Guide needs update regarding nullptr vs NULL [v2] In-Reply-To: <0bjOXsg3qgtMkCB8yupUrv_8T4ooW5I-c4J74M1WZpM=.cd47c740-e9b7-4bcc-b047-25e0e2025884@github.com> References: <0bjOXsg3qgtMkCB8yupUrv_8T4ooW5I-c4J74M1WZpM=.cd47c740-e9b7-4bcc-b047-25e0e2025884@github.com> Message-ID: > Please review this change to update the HotSpot Style Guide's discussion of > nullptr and its use. > > I suggest this is an editorial rather than substantive change to the style > guide. As such, the normal HotSpot PR process can be used for this change. 
Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: respond to shipilev comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18101/files - new: https://git.openjdk.org/jdk/pull/18101/files/fdfcdc58..13dc01ac Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18101&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18101&range=00-01 Stats: 17 lines in 2 files changed: 0 ins; 5 del; 12 mod Patch: https://git.openjdk.org/jdk/pull/18101.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18101/head:pull/18101 PR: https://git.openjdk.org/jdk/pull/18101 From kbarrett at openjdk.org Tue Mar 5 07:12:09 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 5 Mar 2024 07:12:09 GMT Subject: RFR: 8327173: HotSpot Style Guide needs update regarding nullptr vs NULL [v2] In-Reply-To: References: <0bjOXsg3qgtMkCB8yupUrv_8T4ooW5I-c4J74M1WZpM=.cd47c740-e9b7-4bcc-b047-25e0e2025884@github.com> Message-ID: On Mon, 4 Mar 2024 18:01:35 GMT, Vladimir Kozlov wrote: >> I think it would be enough to write 1..2 sentences about this, and then defer to N2431 already linked here for more details. > > I agree with Aleksey. Good point. I decided just referring to the paper for rationale is sufficient. 
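For readers who do not have the referenced paper at hand, one well-known reason to prefer `nullptr` over a literal `0` can be shown in a few lines. This is a generic C++ sketch, not text from the style guide; the `Kind` enum and the `which` overloads are made up for illustration.

```cpp
// Hypothetical overload pair: one takes an integer, the other a pointer.
enum class Kind { Integer, Pointer };

inline Kind which(int)   { return Kind::Integer; }
inline Kind which(char*) { return Kind::Pointer; }

// A literal 0 is an int first, so the integer overload is selected
// even if the caller meant "null pointer".
inline Kind call_with_zero()    { return which(0); }

// nullptr has type std::nullptr_t, which converts to pointer types
// but not to int, so the pointer overload is selected.
inline Kind call_with_nullptr() { return which(nullptr); }
```

What `which(NULL)` does depends on how the platform defines `NULL`, which is why it is left out of the sketch.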
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18101#discussion_r1512223990 From kbarrett at openjdk.org Tue Mar 5 07:12:09 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 5 Mar 2024 07:12:09 GMT Subject: RFR: 8327173: HotSpot Style Guide needs update regarding nullptr vs NULL [v2] In-Reply-To: References: <0bjOXsg3qgtMkCB8yupUrv_8T4ooW5I-c4J74M1WZpM=.cd47c740-e9b7-4bcc-b047-25e0e2025884@github.com> Message-ID: On Mon, 4 Mar 2024 09:52:15 GMT, Aleksey Shipilev wrote: >> Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: >> >> respond to shipilev comments > > doc/hotspot-style.md line 738: > >> 736: expressions with value zero. C++14 replaced that with integer literals with >> 737: value zero. Some compilers continue to treat a zero-valued integral constant >> 738: expressions as a _null pointer constant_ even in C++14 mode. > > It is not very clear why should this concern any Hotspot developers? If one is familiar with C++14 (or later) or were to look up _null pointer constant_ in the C++14 standard, one might wonder why this style guide admonishes against using non-literal constant expressions for such. But I've rewritten the part about 0 as a pointer to (I think) make things clearer. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18101#discussion_r1512225238 From dholmes at openjdk.org Tue Mar 5 07:21:46 2024 From: dholmes at openjdk.org (David Holmes) Date: Tue, 5 Mar 2024 07:21:46 GMT Subject: RFR: 8327173: HotSpot Style Guide needs update regarding nullptr vs NULL [v2] In-Reply-To: References: <0bjOXsg3qgtMkCB8yupUrv_8T4ooW5I-c4J74M1WZpM=.cd47c740-e9b7-4bcc-b047-25e0e2025884@github.com> Message-ID: On Tue, 5 Mar 2024 07:12:09 GMT, Kim Barrett wrote: >> Please review this change to update the HotSpot Style Guide's discussion of >> nullptr and its use. >> >> I suggest this is an editorial rather than substantive change to the style >> guide. 
As such, the normal HotSpot PR process can be used for this change. > > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > respond to shipilev comments Current revised text looks good to me. Thanks ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18101#pullrequestreview-1916033052 From sspitsyn at openjdk.org Tue Mar 5 07:43:46 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 5 Mar 2024 07:43:46 GMT Subject: RFR: 8256314: JVM TI GetCurrentContendedMonitor is implemented incorrectly [v4] In-Reply-To: References: <-Yf2Qj_kbyZAA-LDK96IIOcHWMh8__QT1XfnD9jR8kU=.507d5503-1544-44fb-9e81-438ba3458dcd@github.com> Message-ID: On Mon, 4 Mar 2024 09:51:32 GMT, Serguei Spitsyn wrote: >> src/hotspot/share/prims/jvmtiEnvBase.cpp line 941: >> >>> 939: bool is_virtual = java_lang_VirtualThread::is_instance(thread_oop); >>> 940: jint state = is_virtual ? JvmtiEnvBase::get_vthread_state(thread_oop, java_thread) >>> 941: : JvmtiEnvBase::get_thread_state(thread_oop, java_thread); >> >> I've seen this in a couple of your PRs now. It would be nice to have a utility function to get the JVMTI thread state of any java.lang.Thread instance. > > Good suggestion, thanks. Will add. Added now. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17944#discussion_r1512266627 From mdoerr at openjdk.org Tue Mar 5 08:51:00 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 5 Mar 2024 08:51:00 GMT Subject: RFR: JDK-8320005 : Allow loading of shared objects with .a extension on AIX [v28] In-Reply-To: References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> Message-ID: On Tue, 27 Feb 2024 08:31:15 GMT, Suchismith Roy wrote: >> J2SE agent does not start and throws error when it tries to find the shared library ibm_16_am. 
>> After searching for ibm_16_am.so, the JVM agent throws an error as dll_load fails. It fails to identify the shared library ibm_16_am.a shared archive file on AIX. >> Hence we are providing a function which will additionally search for .a file on AIX, when the search for .so file fails. > > Suchismith Roy has updated the pull request incrementally with one additional commit since the last revision: > > indendt jdk21u is only open for critical backports and requires special approval. Please backport it to jdk21u-dev. What about jdk22u? ------------- PR Comment: https://git.openjdk.org/jdk/pull/16604#issuecomment-1978239031 From sspitsyn at openjdk.org Tue Mar 5 08:58:52 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 5 Mar 2024 08:58:52 GMT Subject: RFR: 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage [v22] In-Reply-To: <4XAkV4CDJUTXFeKCCkAQlD_HntnLWwd_AqcTEMowS4k=.f5d2572f-3061-4a56-bf9d-e97bda57e3a0@github.com> References: <4XAkV4CDJUTXFeKCCkAQlD_HntnLWwd_AqcTEMowS4k=.f5d2572f-3061-4a56-bf9d-e97bda57e3a0@github.com> Message-ID: On Tue, 5 Mar 2024 06:30:20 GMT, David Holmes wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> review: addressed more comments on the fix and new test > > src/hotspot/share/prims/jvmtiEnvBase.cpp line 1507: > >> 1505: nWait = 0; >> 1506: for (ObjectWaiter* waiter = mon->first_waiter(); >> 1507: waiter != nullptr && (nWait == 0 || waiter != mon->first_waiter()); > > Sorry I do not understand the logic of this line. `waiters` is just a linked-list of `ObjectWaiter`s so all we need to do is traverse it and count the number of elements. The loop is endless without this extra condition, so we are getting a test execution timeout.
The `waiters` seems to be `circular doubly linked list` as we can see below: inline void ObjectMonitor::AddWaiter(ObjectWaiter* node) { assert(node != nullptr, "should not add null node"); assert(node->_prev == nullptr, "node already in list"); assert(node->_next == nullptr, "node already in list"); // put node at end of queue (circular doubly linked list) if (_WaitSet == nullptr) { _WaitSet = node; node->_prev = node; node->_next = node; } else { ObjectWaiter* head = _WaitSet; ObjectWaiter* tail = head->_prev; assert(tail->_next == head, "invariant check"); tail->_next = node; head->_prev = node; node->_next = head; node->_prev = tail; } } I'll make sure nothing is missed and check the old code again. > test/hotspot/jtreg/serviceability/jvmti/ObjectMonitorUsage/ObjectMonitorUsage.java line 36: > >> 34: * - unowned object without any waiting threads >> 35: * - unowned object with threads waiting to be notified >> 36: * - owned object without any waiting threads > > You could say "threads waiting" in all cases - it looks odd to reverse it for two cases. Thanks, updated. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1512386413 PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1512391923 From sspitsyn at openjdk.org Tue Mar 5 09:06:55 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 5 Mar 2024 09:06:55 GMT Subject: RFR: 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage [v22] In-Reply-To: References: <4XAkV4CDJUTXFeKCCkAQlD_HntnLWwd_AqcTEMowS4k=.f5d2572f-3061-4a56-bf9d-e97bda57e3a0@github.com> Message-ID: <5lsZBFSbWmDQVJe9tGr2Rll00SWkbU7HBuRpEC_-Duo=.167e62b6-ca47-4a93-b8ad-7c22b379221d@github.com> On Tue, 5 Mar 2024 08:53:42 GMT, Serguei Spitsyn wrote: >> src/hotspot/share/prims/jvmtiEnvBase.cpp line 1507: >> >>> 1505: nWait = 0; >>> 1506: for (ObjectWaiter* waiter = mon->first_waiter(); >>> 1507: waiter != nullptr && (nWait == 0 || waiter != mon->first_waiter()); >> >> Sorry I do not understand the logic of this line. `waiters` is just a linked-list of `ObjectWaiter`s so all we need to do is traverse it and count the number of elements. > > The loop is endless without this extra condition, so we are getting a test execution timeout. > The `waiters` seems to be `circular doubly linked list` as we can see below: > > inline void ObjectMonitor::AddWaiter(ObjectWaiter* node) { > assert(node != nullptr, "should not add null node"); > assert(node->_prev == nullptr, "node already in list"); > assert(node->_next == nullptr, "node already in list"); > // put node at end of queue (circular doubly linked list) > if (_WaitSet == nullptr) { > _WaitSet = node; > node->_prev = node; > node->_next = node; > } else { > ObjectWaiter* head = _WaitSet; > ObjectWaiter* tail = head->_prev; > assert(tail->_next == head, "invariant check"); > tail->_next = node; > head->_prev = node; > node->_next = head; > node->_prev = tail; > } > } > > I'll make sure nothing is missed and check the old code again. Okay. Now I see why the old code did not have endless loops. 
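The point about the circular list can be illustrated with a small standalone sketch. The `Node` type and the helper names are hypothetical (HotSpot's `ObjectWaiter` and `ObjectMonitor` carry much more state); `add_node` mirrors the shape of the quoted `AddWaiter`, and `count_nodes` shows why the walk has to stop on returning to the head rather than on a null `next`.

```cpp
// Hypothetical node type standing in for ObjectWaiter.
struct Node {
  Node* prev = nullptr;
  Node* next = nullptr;
};

// Append node at the tail of a circular doubly linked list,
// following the same steps as the quoted AddWaiter.
inline void add_node(Node*& head, Node* node) {
  if (head == nullptr) {
    head = node;
    node->prev = node;
    node->next = node;
  } else {
    Node* tail = head->prev;
    tail->next = node;
    head->prev = node;
    node->next = head;
    node->prev = tail;
  }
}

// Count the nodes. In a circular list next is never null,
// so the loop ends when the walk comes back to the head.
inline int count_nodes(const Node* head) {
  if (head == nullptr) {
    return 0;
  }
  int n = 0;
  const Node* cur = head;
  do {
    n++;
    cur = cur->next;
  } while (cur != head);
  return n;
}
```

A loop that only tests `cur != nullptr` would never terminate on such a list, which matches the timeout described above.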
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1512409266 From aboldtch at openjdk.org Tue Mar 5 09:14:48 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Tue, 5 Mar 2024 09:14:48 GMT Subject: RFR: 8327093: Add slice function to BitMap API In-Reply-To: References: Message-ID: On Fri, 1 Mar 2024 23:14:18 GMT, Matias Saavedra Silva wrote: > The BitMap does not currently have a "slice" method which can return a subset of the bitmap defined by a start and end bit. This utility can be used to analyze portions of a bitmap or to overwrite the existing map with a shorter, modified map. > > This implementation has two core methods: > `copy_of_range` allocates and returns a new word array as used internally by BitMap > `truncate` overwrites the internal array with the one generated by `copy_of_range` so none of the internal workings are exposed to callers outside of BitMap. > > This change is verified with an included test. The parameter here `bool clear = true` does nothing and it is unclear to me why it is added. That `truncate(n)` is `[n:]` and not `[:n]` does not feel intuitive to me. Maybe having a `truncate_start` as well as `truncate_end` which covers both the explicit cases. The `copy_of_range` logic seems broken. It is caught in the 32-bit gtests. Maybe it should be extended to cover more edge cases. Maybe add some different striped bitmaps. src/hotspot/share/utilities/bitMap.cpp line 111: > 109: idx_t start_word = to_words_align_down(start_bit); > 110: idx_t end_word = to_words_align_up(end_bit); > 111: bm_word_t* const old_map = map(); I would rather all variables be const than just this one. src/hotspot/share/utilities/bitMap.cpp line 123: > 121: > 122: // Iterate the map backwards as the shift will result in carry-out bits > 123: for (idx_t i = end_word; i --> start_word;) { I guess that mutating `i` inside the conditional is to deal with `idx_t` being unsigned. 
But the `i --> start_word` rather than `i-- > start_word` looks strange to me. src/hotspot/share/utilities/bitMap.cpp line 126: > 124: // First iteration is a special case: > 125: // There may be left over bits in the last word that we want to keep while discarding the rest > 126: if (i == end_word - 1 && cutoff > 0) { Is it important that the msb are cleared (beyond the `size()` bit). Could always just do `new_map[i-start_word] = old_map[i] >> shift;` src/hotspot/share/utilities/bitMap.cpp line 135: > 133: > 134: // A full shift by BitsPerWord could be sign extended > 135: carry = old_map[i] << (BitsPerWord - shift) % BitsPerWord; This contains both UB (shift too large) and it is unclear to me what this does. Should this whole loop not simply be: ```C++ for (idx_t i = end_word; i-- > start_word;) { new_map[i-start_word] = old_map[i] >> shift; if (shift != 0) { new_map[i-start_word] |= carry; carry = old_map[i] << (BitsPerWord - shift); } } ------------- PR Review: https://git.openjdk.org/jdk/pull/18092#pullrequestreview-1916252315 PR Review Comment: https://git.openjdk.org/jdk/pull/18092#discussion_r1512362746 PR Review Comment: https://git.openjdk.org/jdk/pull/18092#discussion_r1512362335 PR Review Comment: https://git.openjdk.org/jdk/pull/18092#discussion_r1512368106 PR Review Comment: https://git.openjdk.org/jdk/pull/18092#discussion_r1512375862 From kbarrett at openjdk.org Tue Mar 5 09:29:46 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 5 Mar 2024 09:29:46 GMT Subject: RFR: 8327173: HotSpot Style Guide needs update regarding nullptr vs NULL [v2] In-Reply-To: References: <0bjOXsg3qgtMkCB8yupUrv_8T4ooW5I-c4J74M1WZpM=.cd47c740-e9b7-4bcc-b047-25e0e2025884@github.com> Message-ID: <7_6Fcdy3HrKJ_yFbA5uZZ00-Cd25fbSlvtQh2xWzdJ8=.3320344f-64f6-49bc-af58-4f5d3e27c2d6@github.com> On Tue, 5 Mar 2024 07:12:09 GMT, Kim Barrett wrote: >> Please review this change to update the HotSpot Style Guide's discussion of >> nullptr and its use. 
>> >> I suggest this is an editorial rather than substantive change to the style >> guide. As such, the normal HotSpot PR process can be used for this change. > > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > respond to shipilev comments I just discovered gcc `-Wzero-as-null-pointer-constant`. We might be able to make some quick progress on those legacy uses of 0 pointers. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18101#issuecomment-1978311253 From tschatzl at openjdk.org Tue Mar 5 09:46:53 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 5 Mar 2024 09:46:53 GMT Subject: RFR: 8327239: Obsolete unused GCLockerEdenExpansionPercent product option Message-ID: Hi all, please review this change to obsolete the unused flag GCLockerEdenExpansionPercent. Please also review the CSR. Testing: local compilation, grepping sources for occurrences Thanks, Thomas8327239-obsolete-gclockeredenexpansionpercent ------------- Commit messages: - 8327239 Changes: https://git.openjdk.org/jdk/pull/18117/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18117&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8327239 Stats: 6 lines in 2 files changed: 1 ins; 5 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18117.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18117/head:pull/18117 PR: https://git.openjdk.org/jdk/pull/18117 From shade at openjdk.org Tue Mar 5 10:14:51 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 5 Mar 2024 10:14:51 GMT Subject: RFR: 8327173: HotSpot Style Guide needs update regarding nullptr vs NULL [v2] In-Reply-To: References: <0bjOXsg3qgtMkCB8yupUrv_8T4ooW5I-c4J74M1WZpM=.cd47c740-e9b7-4bcc-b047-25e0e2025884@github.com> Message-ID: On Tue, 5 Mar 2024 07:12:09 GMT, Kim Barrett wrote: >> Please review this change to update the HotSpot Style Guide's discussion of >> nullptr and its use. 
>> >> I suggest this is an editorial rather than substantive change to the style >> guide. As such, the normal HotSpot PR process can be used for this change. > > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > respond to shipilev comments Marked as reviewed by shade (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18101#pullrequestreview-1916467258 From ihse at openjdk.org Tue Mar 5 10:55:01 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Tue, 5 Mar 2024 10:55:01 GMT Subject: RFR: JDK-8320005 : Allow loading of shared objects with .a extension on AIX [v28] In-Reply-To: References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> Message-ID: <_1kK6WNrzAFFQItw0g4AMjdSKrGdrKVphWPFwt3Z-MI=.ef30ed38-7918-4243-bd86-3c0dc2f501af@github.com> On Tue, 5 Mar 2024 08:48:15 GMT, Martin Doerr wrote: >> Suchismith Roy has updated the pull request incrementally with one additional commit since the last revision: >> >> indendt > > jdk21u is only open for critical backports and requires special approval. Please backport it to jdk21u-dev. > What about jdk22u? > `/backport` commands should be used in commits, not in closed PRs. > Not sure if it works without Committer privileges. @TheRealMDoerr FYI, `/backport` works in PRs as well as commits since some time ago... In fact, I would recommend it as the better way to create a backport, now that it exists. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16604#issuecomment-1978482994 From rkennke at openjdk.org Tue Mar 5 10:58:03 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 5 Mar 2024 10:58:03 GMT Subject: RFR: 8139457: Relax alignment of array elements [v69] In-Reply-To: <0uHUk2SuaffnE1qPZbX27sRU1c3IgXRTG6RKdHbrvac=.9294bf2f-fee9-4a62-9d48-ea882c481982@github.com> References: <04QTriAjxYzeREOBd1Kka9Rq2yZpYZTweZ7qMkpHjfg=.4b4a0181-13d6-4e11-a384-cfe2e05ebe72@github.com> <0uHUk2SuaffnE1qPZbX27sRU1c3IgXRTG6RKdHbrvac=.9294bf2f-fee9-4a62-9d48-ea882c481982@github.com> Message-ID: On Mon, 4 Mar 2024 09:06:21 GMT, Galder Zamarreño wrote: >> And description updated? > > Same of x86 Hi Galder, unfortunately this PR has already been integrated before your review comments. I've opened [JDK-8327361](https://bugs.openjdk.org/browse/JDK-8327361) to track the additional fixes. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1512625548 From rkennke at openjdk.org Tue Mar 5 11:15:04 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 5 Mar 2024 11:15:04 GMT Subject: RFR: 8139457: Relax alignment of array elements [v69] In-Reply-To: <0uHUk2SuaffnE1qPZbX27sRU1c3IgXRTG6RKdHbrvac=.9294bf2f-fee9-4a62-9d48-ea882c481982@github.com> References: <04QTriAjxYzeREOBd1Kka9Rq2yZpYZTweZ7qMkpHjfg=.4b4a0181-13d6-4e11-a384-cfe2e05ebe72@github.com> <0uHUk2SuaffnE1qPZbX27sRU1c3IgXRTG6RKdHbrvac=.9294bf2f-fee9-4a62-9d48-ea882c481982@github.com> Message-ID: On Mon, 4 Mar 2024 09:06:21 GMT, Galder Zamarreño wrote: >> And description updated? > > Same of x86 @galderz please review #18120, thank you!
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1512647787 From ayang at openjdk.org Tue Mar 5 11:42:47 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 5 Mar 2024 11:42:47 GMT Subject: RFR: 8327239: Obsolete unused GCLockerEdenExpansionPercent product option In-Reply-To: References: Message-ID: On Tue, 5 Mar 2024 09:41:32 GMT, Thomas Schatzl wrote: > Hi all, > > please review this change to obsolete the unused flag GCLockerEdenExpansionPercent. Please also review the CSR. > > Testing: local compilation, grepping sources for occurrences > > Thanks, > Thomas8327239-obsolete-gclockeredenexpansionpercent `TestNewRatioFlag.java` uses this flag. Should tests be updated in this PR as well? ------------- PR Comment: https://git.openjdk.org/jdk/pull/18117#issuecomment-1978565999 From mli at openjdk.org Tue Mar 5 12:07:47 2024 From: mli at openjdk.org (Hamlin Li) Date: Tue, 5 Mar 2024 12:07:47 GMT Subject: RFR: 8320646: RISC-V: C2 VectorCastHF2F [v3] In-Reply-To: References: Message-ID: On Tue, 5 Mar 2024 03:31:28 GMT, Fei Yang wrote: >> Seems not, as by vector spec, `vector-immediate | imm[4:0]` and `The value is sign-extended to SEW bits`, so maximum of imm should be 16-1. > > Here is what I read from the RVV 1.0 spec [1]: > > > The narrowing right shifts extract a smaller field from a wider operand and have both zero-extending (srl) and > sign-extending (sra) forms. The shift amount can come from a vector register group, or a scalar x register, or > a zero-extended 5-bit immediate. > > A zero-extended 5-bit immediate will have a maximum of 31. > > [1] https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#vector-narrowing-integer-right-shift-instructions You're right! But, we have this code for `vnsra_wi` on riscv: `guarantee(is_simm5(imm), "imm is invalid"); `, we need to correct it, I'll do it. 
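The difference between the two readings of the 5-bit immediate is easy to check with a small sketch. The function names below are hypothetical and are not HotSpot's own helpers; they only encode the ranges of a sign-extended and a zero-extended 5-bit field.

```cpp
#include <cstdint>

// Sign-extended 5-bit immediate: representable values are -16..15.
constexpr bool fits_simm5(int64_t v) { return v >= -16 && v <= 15; }

// Zero-extended 5-bit immediate: representable values are 0..31.
constexpr bool fits_uimm5(int64_t v) { return v >= 0 && v <= 31; }

// A shift amount of 16 is a valid zero-extended immediate,
// but a signed-range check rejects it.
static_assert(fits_uimm5(16), "16 fits in an unsigned 5-bit field");
static_assert(!fits_simm5(16), "16 does not fit in a signed 5-bit field");
```

Under the zero-extended reading quoted from the spec, a signed-range guarantee on the shift immediate would refuse legal shift amounts from 16 to 31.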
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17698#discussion_r1512709970 From fbredberg at openjdk.org Tue Mar 5 13:07:02 2024 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Tue, 5 Mar 2024 13:07:02 GMT Subject: RFR: 8314508: Improve how relativized pointers are printed by frame::describe [v2] In-Reply-To: References: Message-ID: > The output from frame::describe has been improved so that it now prints the actual derelativized pointer value, regardless if it's on the stack or on the heap. It also clearly shows which members of the fixed stackframe are relativized. See sample output of a riscv64 stackframe below: > > > [6.693s][trace][continuations] 0x000000409e14cc88: 0x00000000eeceaf58 locals for #10 > [6.693s][trace][continuations] local 0 > [6.693s][trace][continuations] oop for #10 > [6.693s][trace][continuations] 0x000000409e14cc80: 0x00000000eec34548 local 1 > [6.693s][trace][continuations] oop for #10 > [6.693s][trace][continuations] 0x000000409e14cc78: 0x00000000eeceac58 local 2 > [6.693s][trace][continuations] oop for #10 > [6.693s][trace][continuations] 0x000000409e14cc70: 0x00000000eee03eb0 #10 method java.lang.invoke.LambdaForm$DMH/0x0000000050001000.invokeVirtual(Ljava/lang/Object;Ljava/lang/Object;)V @ 10 > [6.693s][trace][continuations] - 3 locals 3 max stack > [6.693s][trace][continuations] - codelet: return entry points > [6.693s][trace][continuations] saved fp > [6.693s][trace][continuations] 0x000000409e14cc68: 0x00000040137dc790 return address > [6.693s][trace][continuations] 0x000000409e14cc60: 0x000000409e14ccf0 > [6.693s][trace][continuations] 0x000000409e14cc58: 0x000000409e14cc60 interpreter_frame_sender_sp > [6.693s][trace][continuations] 0x000000409e14cc50: 0x000000409e14cc00 interpreter_frame_last_sp (relativized: fp-14) > [6.693s][trace][continuations] 0x000000409e14cc48: 0x0000004050401a88 interpreter_frame_method > [6.693s][trace][continuations] 0x000000409e14cc40: 0x0000000000000000 
interpreter_frame_mdp > [6.693s][trace][continuations] 0x000000409e14cc38: 0x000000409e14cbe0 interpreter_frame_extended_sp (relativized: fp-18) > [6.693s][trace][continuations] 0x000000409e14cc30: 0x00000000ef093110 interpreter_frame_mirror > [6.693s][trace][continuations] oop for #10 > [6.693s][trace][continuations] 0x000000409e14cc28: 0x0000004050401c68 interpreter_frame_cache > [6.693s][trace][continuations] 0x000000409e14cc20: 0x000000409e14cc88 interpreter_frame_locals (relativized: fp+3) > [6.693s][trace][continuations] 0x000000409e14cc18: 0x0000004050401a7a interpre... Fredrik Bredberg has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: - Merge branch 'master' into 8314508_relativized_ptr_frame_describe - Small white space fix - 8314508: Improve how relativized pointers are printed by frame::describe ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18102/files - new: https://git.openjdk.org/jdk/pull/18102/files/c5457c7f..4ade221c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18102&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18102&range=00-01 Stats: 13240 lines in 1183 files changed: 6449 ins; 2864 del; 3927 mod Patch: https://git.openjdk.org/jdk/pull/18102.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18102/head:pull/18102 PR: https://git.openjdk.org/jdk/pull/18102 From jbhateja at openjdk.org Tue Mar 5 13:16:48 2024 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 5 Mar 2024 13:16:48 GMT Subject: RFR: 8325991: Accelerate Poly1305 on x86_64 using AVX2 instructions [v11] In-Reply-To: References: Message-ID: On Mon, 4 Mar 2024 21:40:04 GMT, Srinivas Vamsi Parasa wrote: >> The goal of this PR is to accelerate the Poly1305 algorithm using AVX2 instructions (including IFMA) for x86_64 CPUs. 
>> >> This implementation is directly based on the AVX2 Poly1305 hash computation as implemented in Intel(R) Multi-Buffer Crypto for IPsec Library (url: https://github.com/intel/intel-ipsec-mb/blob/main/lib/avx2_t3/poly_fma_avx2.asm) >> >> This PR shows up to 19x speedup on buffer sizes of 1MB. > > Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: > > update asserts for vpmadd52l/hq Marked as reviewed by jbhateja (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/17881#pullrequestreview-1916876320 From jbhateja at openjdk.org Tue Mar 5 14:13:48 2024 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 5 Mar 2024 14:13:48 GMT Subject: RFR: 8325991: Accelerate Poly1305 on x86_64 using AVX2 instructions [v10] In-Reply-To: <_6dorzq67KAZsTBHBvbQRDi_xW70bFhJudnxbG88m6I=.33e06bd5-d5fc-4ba8-b740-437155d567cf@github.com> References: <9_AS9fKLjyVQzybg1915BX3xVVFwJRKjsqghjZ99hr0=.13a55d69-fc86-4109-a3be-934d1b8634d0@github.com> <_6dorzq67KAZsTBHBvbQRDi_xW70bFhJudnxbG88m6I=.33e06bd5-d5fc-4ba8-b740-437155d567cf@github.com> Message-ID: <_5z5emOe-VqjE7REHmk72wtJ-X_MUggxilrkXFUjdPo=.e30bafc3-0fc4-4872-a99c-f22e383301e3@github.com> On Mon, 4 Mar 2024 21:36:36 GMT, Srinivas Vamsi Parasa wrote: >> It would be good to make this instruction generic. > > Please see the updated assert as suggested for vpmadd52[l/h]uq in the latest commit. [poly1305_spr_validation.patch](https://github.com/openjdk/jdk/files/14496404/poly1305_spr_validation.patch) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17881#discussion_r1512889086 From gli at openjdk.org Tue Mar 5 14:29:46 2024 From: gli at openjdk.org (Guoxiong Li) Date: Tue, 5 Mar 2024 14:29:46 GMT Subject: RFR: 8327239: Obsolete unused GCLockerEdenExpansionPercent product option In-Reply-To: References: Message-ID: On Tue, 5 Mar 2024 11:40:26 GMT, Albert Mingkun Yang wrote: > `TestNewRatioFlag.java` uses this flag.
Should tests be updated in this PR as well? `TestNewSizeFlags.java` and `TestSurvivorRatioFlag.java` use this flag too. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18117#issuecomment-1978896757 From mli at openjdk.org Tue Mar 5 15:21:01 2024 From: mli at openjdk.org (Hamlin Li) Date: Tue, 5 Mar 2024 15:21:01 GMT Subject: RFR: 8320646: RISC-V: C2 VectorCastHF2F [v5] In-Reply-To: References: Message-ID: > Hi, > Can you review this patch to add intrinsics for VectorCastHF2F/VectorCastF2HF? > Thanks! > > ## Test > > test/jdk/java/lang/Float/Binary16ConversionNaN.java > test/jdk/java/lang/Float/Binary16Conversion.java > > hotspot/jtreg/compiler/intrinsics/float16 > hotspot/jtreg/compiler/vectorization/TestFloatConversionsVector.java > hotspot/jtreg/compiler/vectorization/TestFloatConversionsVectorNaN.java Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: correct uimm assert in vnsra_wi; use vnsra_wi instead of vnsra_wx; add register asserts ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17698/files - new: https://git.openjdk.org/jdk/pull/17698/files/08dce191..370892ed Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17698&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17698&range=03-04 Stats: 13 lines in 2 files changed: 6 ins; 4 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/17698.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17698/head:pull/17698 PR: https://git.openjdk.org/jdk/pull/17698 From mli at openjdk.org Tue Mar 5 15:21:01 2024 From: mli at openjdk.org (Hamlin Li) Date: Tue, 5 Mar 2024 15:21:01 GMT Subject: RFR: 8320646: RISC-V: C2 VectorCastHF2F [v3] In-Reply-To: References: Message-ID: On Tue, 5 Mar 2024 12:05:16 GMT, Hamlin Li wrote: >> Here is what I read from the RVV 1.0 spec [1]: >> >> >> The narrowing right shifts extract a smaller field from a wider operand and have both zero-extending (srl) and >> sign-extending (sra) forms.
The shift amount can come from a vector register group, or a scalar x register, or >> a zero-extended 5-bit immediate. >> >> A zero-extended 5-bit immediate will have a maximum of 31. >> >> [1] https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#vector-narrowing-integer-right-shift-instructions > > You're right! > But, we have this code for `vnsra_wi` on riscv: `guarantee(is_simm5(imm), "imm is invalid"); `, we need to correct it, I'll do it. Fixed, thanks! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17698#discussion_r1513009326 From tschatzl at openjdk.org Tue Mar 5 15:22:03 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 5 Mar 2024 15:22:03 GMT Subject: RFR: 8327239: Obsolete unused GCLockerEdenExpansionPercent product option [v2] In-Reply-To: References: Message-ID: <51QPiWJhgiAqXjFIUG7EEGJuI-PcOZChqvAGNbweMNE=.61e12d95-f3ca-4c5a-9623-3fad34b67517@github.com> > Hi all, > > please review this change to obsolete the unused flag GCLockerEdenExpansionPercent. Please also review the CSR. 
> > Testing: local compilation, grepping sources for occurrences > > Thanks, > Thomas8327239-obsolete-gclockeredenexpansionpercent Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: Remove use in tests ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18117/files - new: https://git.openjdk.org/jdk/pull/18117/files/f8148ef9..4f676fbe Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18117&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18117&range=00-01 Stats: 3 lines in 3 files changed: 0 ins; 3 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18117.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18117/head:pull/18117 PR: https://git.openjdk.org/jdk/pull/18117 From gli at openjdk.org Tue Mar 5 15:28:51 2024 From: gli at openjdk.org (Guoxiong Li) Date: Tue, 5 Mar 2024 15:28:51 GMT Subject: RFR: 8327239: Obsolete unused GCLockerEdenExpansionPercent product option [v2] In-Reply-To: <51QPiWJhgiAqXjFIUG7EEGJuI-PcOZChqvAGNbweMNE=.61e12d95-f3ca-4c5a-9623-3fad34b67517@github.com> References: <51QPiWJhgiAqXjFIUG7EEGJuI-PcOZChqvAGNbweMNE=.61e12d95-f3ca-4c5a-9623-3fad34b67517@github.com> Message-ID: On Tue, 5 Mar 2024 15:22:03 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this change to obsolete the unused flag GCLockerEdenExpansionPercent. Please also review the CSR. >> >> Testing: local compilation, grepping sources for occurrences >> >> Thanks, >> Thomas8327239-obsolete-gclockeredenexpansionpercent > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > Remove use in tests Looks good. ------------- Marked as reviewed by gli (Committer). 
PR Review: https://git.openjdk.org/jdk/pull/18117#pullrequestreview-1917360660 From ayang at openjdk.org Tue Mar 5 15:28:52 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 5 Mar 2024 15:28:52 GMT Subject: RFR: 8327239: Obsolete unused GCLockerEdenExpansionPercent product option [v2] In-Reply-To: <51QPiWJhgiAqXjFIUG7EEGJuI-PcOZChqvAGNbweMNE=.61e12d95-f3ca-4c5a-9623-3fad34b67517@github.com> References: <51QPiWJhgiAqXjFIUG7EEGJuI-PcOZChqvAGNbweMNE=.61e12d95-f3ca-4c5a-9623-3fad34b67517@github.com> Message-ID: On Tue, 5 Mar 2024 15:22:03 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this change to obsolete the unused flag GCLockerEdenExpansionPercent. Please also review the CSR. >> >> Testing: local compilation, grepping sources for occurrences >> >> Thanks, >> Thomas8327239-obsolete-gclockeredenexpansionpercent > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > Remove use in tests Marked as reviewed by ayang (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18117#pullrequestreview-1917360820 From mli at openjdk.org Tue Mar 5 16:34:13 2024 From: mli at openjdk.org (Hamlin Li) Date: Tue, 5 Mar 2024 16:34:13 GMT Subject: RFR: 8320646: RISC-V: C2 VectorCastHF2F [v6] In-Reply-To: References: Message-ID: <0vM0lVtB3-SUUb31C0jXhWampvXUwYhuqG9GslvGkgE=.ddce2df0-f504-4060-921e-caf8ec507197@github.com> > Hi, > Can you review this patch to add intrinsics for VectorCastHF2F/VectorCastF2HF? > Thanks!
> > ## Test > > test/jdk/java/lang/Float/Binary16ConversionNaN.java > test/jdk/java/lang/Float/Binary16Conversion.java > > hotspot/jtreg/compiler/intrinsics/float16 > hotspot/jtreg/compiler/vectorization/TestFloatConversionsVectorjava > hotspot/jtreg/compiler/vectorization/TestFloatConversionsVectorNaN.java Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: filter tests with zvfh ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17698/files - new: https://git.openjdk.org/jdk/pull/17698/files/370892ed..bf5c1239 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17698&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17698&range=04-05 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/17698.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17698/head:pull/17698 PR: https://git.openjdk.org/jdk/pull/17698 From smonteith at openjdk.org Tue Mar 5 16:54:45 2024 From: smonteith at openjdk.org (Stuart Monteith) Date: Tue, 5 Mar 2024 16:54:45 GMT Subject: RFR: 8326541: [AArch64] ZGC C2 load barrier stub considers the length of live registers when spilling registers In-Reply-To: References: Message-ID: On Fri, 23 Feb 2024 08:11:24 GMT, Joshua Zhu wrote: > Currently ZGC C2 load barrier stub saves the whole live register regardless of what size of register is live on aarch64. > Considering the size of SVE register is an implementation-defined multiple of 128 bits, up to 2048 bits, > even the use of a floating point may cause the maximum 2048 bits stack occupied. > Hence I would like to introduce this change on aarch64: take the length of live registers into consideration in ZGC C2 load barrier stub. > > In a floating point case on 2048 bits SVE machine, the following ZLoadBarrierStubC2 > > > ...... 
> 0x0000ffff684cfad8: stp x15, x18, [sp, #80] > 0x0000ffff684cfadc: sub sp, sp, #0x100 > 0x0000ffff684cfae0: str z16, [sp] > 0x0000ffff684cfae4: add x1, x13, #0x10 > 0x0000ffff684cfae8: mov x0, x16 > ;; 0xFFFF803F5414 > 0x0000ffff684cfaec: mov x8, #0x5414 // #21524 > 0x0000ffff684cfaf0: movk x8, #0x803f, lsl #16 > 0x0000ffff684cfaf4: movk x8, #0xffff, lsl #32 > 0x0000ffff684cfaf8: blr x8 > 0x0000ffff684cfafc: mov x16, x0 > 0x0000ffff684cfb00: ldr z16, [sp] > 0x0000ffff684cfb04: add sp, sp, #0x100 > 0x0000ffff684cfb08: ptrue p7.b > 0x0000ffff684cfb0c: ldp x4, x5, [sp, #16] > ...... > > > could be optimized into: > > > ...... > 0x0000ffff684cfa50: stp x15, x18, [sp, #80] > 0x0000ffff684cfa54: str d16, [sp, #-16]! // extra 8 bytes to align 16 bytes in push_fp() > 0x0000ffff684cfa58: add x1, x13, #0x10 > 0x0000ffff684cfa5c: mov x0, x16 > ;; 0xFFFF7FA942A8 > 0x0000ffff684cfa60: mov x8, #0x42a8 // #17064 > 0x0000ffff684cfa64: movk x8, #0x7fa9, lsl #16 > 0x0000ffff684cfa68: movk x8, #0xffff, lsl #32 > 0x0000ffff684cfa6c: blr x8 > 0x0000ffff684cfa70: mov x16, x0 > 0x0000ffff684cfa74: ldr d16, [sp], #16 > 0x0000ffff684cfa78: ptrue p7.b > 0x0000ffff684cfa7c: ldp x4, x5, [sp, #16] > ...... > > > Besides the above benefit, when we know what size of register is live, > we could remove the unnecessary caller save in ZGC C2 load barrier stub when we meet C-ABI SOE fp registers. > > Passed jtreg with option "-XX:+UseZGC -XX:+ZGenerational" with no failures introduced. Thanks, that helps - I can see you're saving/restoring the correct register lengths. Would it be possible to generate a testcase to test that registers are being saved/restored correctly? 
The following testcase is an example of where this kind of testing is done, although in this PR's case it isn't subroutines but load/store barriers: https://github.com/openjdk/jdk/commit/4cd318756d4a8de64d25fb6512ecba9a008edfa1#diff-949a4a2f889be36be47e9b02b6d6cd1247768953b95a024f649878bac721fa04 ------------- PR Comment: https://git.openjdk.org/jdk/pull/17977#issuecomment-1979212537 From iklam at openjdk.org Tue Mar 5 17:08:46 2024 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 5 Mar 2024 17:08:46 GMT Subject: RFR: 8327093: Add slice function to BitMap API In-Reply-To: References: Message-ID: <61cOLItjj-oAi8BMgoWj3cFY-ga0OZnkcrx-MKtLxKM=.54146205-b97e-4aef-abb2-80d732f801dd@github.com> On Tue, 5 Mar 2024 09:11:11 GMT, Axel Boldt-Christmas wrote: > That `truncate(n)` is `[n:]` and not `[:n]` does not feel intuitive to me. Maybe having a `truncate_start` as well as `truncate_end` which covers both the explicit cases. The current treatment of the 1-parameter case is similar to the Java API of [`String.substring(int start)`](https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#substring-int-), or the JavaScript [`slice()`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/slice) API for arrays. So if we want the current behavior, it might be better to rename the function to `subset()` or `slice()`. However, `subset()` or `slice()` usually mean making a copy of an existing BitMap. Yet, with the current usage of C++ templates in bitMap.hpp, it's very difficult to make a copy of the BitMap object itself. E.g., many of the BitMaps are currently stack-allocated. It's unclear where a "copy" of such a BitMap object should live. (Note: `BitMapWithAllocator::allocate()` allocates a `bm_word_t` array, not a BitMap object). I think the name `truncate()` will align with the existing `resize()` API -- both of them modify the BitMap object in-place.
However, a `truncate(n)` that means `[:n]` would do the exact same thing as `resize(n)`. It would be confusing to have two APIs that do the same thing. But I agree that a `truncate(n)` that means `[n:]` is also non-intuitive. So I think the best compromise is simply to not offer a single-parameter version of `truncate(n)`. It would be fairly easy for the caller to specify `truncate(0, n)` or `truncate(n, bitmap->size_in_bits())` to indicate exactly what is needed. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18092#issuecomment-1979240338 From matsaave at openjdk.org Tue Mar 5 17:33:45 2024 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Tue, 5 Mar 2024 17:33:45 GMT Subject: RFR: 8327093: Add slice function to BitMap API In-Reply-To: References: Message-ID: On Tue, 5 Mar 2024 08:45:03 GMT, Axel Boldt-Christmas wrote: >> The BitMap does not currently have a "slice" method which can return a subset of the bitmap defined by a start and end bit. This utility can be used to analyze portions of a bitmap or to overwrite the existing map with a shorter, modified map. >> >> This implementation has two core methods: >> `copy_of_range` allocates and returns a new word array as used internally by BitMap >> `truncate` overwrites the internal array with the one generated by `copy_of_range` so none of the internal workings are exposed to callers outside of BitMap. >> >> This change is verified with an included test. > > src/hotspot/share/utilities/bitMap.cpp line 126: > >> 124: // First iteration is a special case: >> 125: // There may be left over bits in the last word that we want to keep while discarding the rest >> 126: if (i == end_word - 1 && cutoff > 0) { > > Is it important that the msb are cleared (beyond the `size()` bit)? Could always just do `new_map[i-start_word] = old_map[i] >> shift;` I suppose it isn't necessary to clear the leftover MSB so that detail can be excluded.
The junk bits will still be present however and that may introduce some confusion when debugging. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18092#discussion_r1513224925 From kvn at openjdk.org Tue Mar 5 18:19:50 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 5 Mar 2024 18:19:50 GMT Subject: RFR: 8327173: HotSpot Style Guide needs update regarding nullptr vs NULL [v2] In-Reply-To: References: <0bjOXsg3qgtMkCB8yupUrv_8T4ooW5I-c4J74M1WZpM=.cd47c740-e9b7-4bcc-b047-25e0e2025884@github.com> Message-ID: On Tue, 5 Mar 2024 07:12:09 GMT, Kim Barrett wrote: >> Please review this change to update the HotSpot Style Guide's discussion of >> nullptr and its use. >> >> I suggest this is an editorial rather than substantive change to the style >> guide. As such, the normal HotSpot PR process can be used for this change. > > Kim Barrett has updated the pull request incrementally with one additional commit since the last revision: > > respond to shipilev comments Approved. ------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18101#pullrequestreview-1917796573 From matsaave at openjdk.org Tue Mar 5 18:35:58 2024 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Tue, 5 Mar 2024 18:35:58 GMT Subject: RFR: 8327093: Add slice function to BitMap API [v2] In-Reply-To: References: Message-ID: > The BitMap does not currently have a "slice" method which can return a subset of the bitmap defined by a start and end bit. This utility can be used to analyze portions of a bitmap or to overwrite the existing map with a shorter, modified map. > > This implementation has two core methods: > `copy_of_range` allocates and returns a new word array as used internally by BitMap > `truncate` overwrites the internal array with the one generated by `copy_of_range` so none of the internal workings are exposed to callers outside of BitMap. > > This change is verified with an included test. 
Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: Axel comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18092/files - new: https://git.openjdk.org/jdk/pull/18092/files/f190f78c..95683209 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18092&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18092&range=00-01 Stats: 33 lines in 3 files changed: 2 ins; 15 del; 16 mod Patch: https://git.openjdk.org/jdk/pull/18092.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18092/head:pull/18092 PR: https://git.openjdk.org/jdk/pull/18092 From matsaave at openjdk.org Tue Mar 5 18:35:58 2024 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Tue, 5 Mar 2024 18:35:58 GMT Subject: RFR: 8327093: Add slice function to BitMap API [v2] In-Reply-To: <61cOLItjj-oAi8BMgoWj3cFY-ga0OZnkcrx-MKtLxKM=.54146205-b97e-4aef-abb2-80d732f801dd@github.com> References: <61cOLItjj-oAi8BMgoWj3cFY-ga0OZnkcrx-MKtLxKM=.54146205-b97e-4aef-abb2-80d732f801dd@github.com> Message-ID: On Tue, 5 Mar 2024 17:05:50 GMT, Ioi Lam wrote: > But I agree that a `truncate(n)` that means `[n:]` is also non-intuitive. Looking back at this, I do agree with this and I think it's best for the user to input the entire range. That being said, I think it should be fine to keep `copy_of_range(n)` to signify `[n:]` since it is acting more in line with functions like `slice`. Since `copy_of_range()` is a private function meant to be used within BitMap, I think any of `copy_of_range()`, `subset()`, or `slice()` are appropriate names.
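For readers following the `copy_of_range` / `[start:end]` discussion, here is a minimal standalone sketch of the word-shifting idea. It is not the code from the PR: `copy_of_range_sketch`, `bit_at`, `Word` and `kBitsPerWord` are hypothetical names, a plain `std::vector` stands in for the `bm_word_t` array, and the caller is assumed to pass `start_bit <= end_bit <= size`. It does clear the bits above the new size in the last word, which the thread notes is optional but easier on whoever debugs it later.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Word = std::uint64_t;               // stand-in for bm_word_t (assumption)
constexpr std::size_t kBitsPerWord = 64;

// Hypothetical helper: returns a new word array holding bits
// [start_bit, end_bit) of old_map, shifted down so they start at bit 0.
std::vector<Word> copy_of_range_sketch(const std::vector<Word>& old_map,
                                       std::size_t start_bit,
                                       std::size_t end_bit) {
  const std::size_t new_bits   = end_bit - start_bit;
  const std::size_t new_words  = (new_bits + kBitsPerWord - 1) / kBitsPerWord;
  const std::size_t start_word = start_bit / kBitsPerWord;
  const std::size_t shift      = start_bit % kBitsPerWord;

  std::vector<Word> new_map(new_words, 0);
  for (std::size_t i = 0; i < new_words; i++) {
    // Low part comes from the current source word.
    Word lo = old_map[start_word + i] >> shift;
    // High part is borrowed from the next source word. Shifting by the
    // full word width is undefined, so this only happens when shift > 0.
    Word hi = 0;
    if (shift > 0 && start_word + i + 1 < old_map.size()) {
      hi = old_map[start_word + i + 1] << (kBitsPerWord - shift);
    }
    new_map[i] = lo | hi;
  }

  // Optional: clear the junk bits above new_bits in the last word.
  const std::size_t cutoff = new_bits % kBitsPerWord;
  if (cutoff > 0) {
    new_map.back() &= (Word(1) << cutoff) - 1;
  }
  return new_map;
}

// Reads a single bit; used here only to inspect the result.
bool bit_at(const std::vector<Word>& map, std::size_t i) {
  return ((map[i / kBitsPerWord] >> (i % kBitsPerWord)) & 1) != 0;
}
```

With this shape, a two-argument `truncate(start, end)` would amount to replacing the internal array with the returned one and updating the size, so `truncate(0, n)` and `truncate(n, size)` cover both cases that a one-argument form would leave ambiguous.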
------------- PR Comment: https://git.openjdk.org/jdk/pull/18092#issuecomment-1979402340 From lmesnik at openjdk.org Tue Mar 5 19:07:52 2024 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Tue, 5 Mar 2024 19:07:52 GMT Subject: RFR: 8256314: JVM TI GetCurrentContendedMonitor is implemented incorrectly [v6] In-Reply-To: References: <-Yf2Qj_kbyZAA-LDK96IIOcHWMh8__QT1XfnD9jR8kU=.507d5503-1544-44fb-9e81-438ba3458dcd@github.com> Message-ID: On Tue, 5 Mar 2024 06:18:56 GMT, Serguei Spitsyn wrote: >> The implementation of the JVM TI `GetCurrentContendedMonitor()` does not match the spec. It can sometimes return an incorrect information about the contended monitor. Such a behavior does not match the function spec. >> With this update the `GetCurrentContendedMonitor()` is returning the monitor only when the specified thread is waiting to enter or re-enter the monitor, and the monitor is not returned when the specified thread is waiting in the `java.lang.Object.wait` to be notified. >> >> The implementations of the JDWP `ThreadReference.CurrentContendedMonitor` command and JDI `ThreadReference.currentContendedMonitor()` method are based and depend on this JVMTI function. The JDWP command and the JDI method were specified incorrectly and had incorrect behavior. The fix slightly corrects both the JDWP and JDI specs. The JDWP and JDI implementations have been also fixed because they use this JVM TI update. Please, see and review the related CSR and Release-Note. 
>> >> CSR: [8326024](https://bugs.openjdk.org/browse/JDK-8326024): JVM TI GetCurrentContendedMonitor is implemented incorrectly >> RN: [8326038](https://bugs.openjdk.org/browse/JDK-8326038): Release Note: JVM TI GetCurrentContendedMonitor is implemented incorrectly >> >> Testing: >> - tested with the mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: added new internal function JvmtiEnvBase::get_thread_or_vthread_state The test changes look good; they just need a small doc update. The jvmti test serviceability/jvmti/thread/GetCurrentContendedMonitor/contmon01/contmon01.java is correct and doesn't check wait monitors. I wonder if it makes sense to add a test which verifies that a thread waiting in Object.wait() is not included in the result. test/hotspot/jtreg/vmTestbase/nsk/jdi/ObjectReference/waitingThreads/waitingthreads004a.java line 106: > 104: display("entered and notifyAll: synchronized (lockingObject) {}"); > 105: lockingObject.notifyAll(); > 106: Please update the test documentation in TestDescription. The line "- An object with threads waiting in Object.wait(long) method." should be updated, or another one added. test/hotspot/jtreg/vmTestbase/nsk/jdwp/ThreadReference/CurrentContendedMonitor/curcontmonitor001a.java line 88: > 86: // ensure that tested thread is waiting for monitor object > 87: synchronized (TestedClass.thread.monitor) { > 88: TestedClass.thread.monitor.notifyAll(); You need to update the test documentation in TestDescription.java to explicitly say that the tested thread is not "waiting" but has exited from wait and is waiting for the monitor (contended). ------------- Marked as reviewed by lmesnik (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/17944#pullrequestreview-1917844453 PR Review Comment: https://git.openjdk.org/jdk/pull/17944#discussion_r1513326686 PR Review Comment: https://git.openjdk.org/jdk/pull/17944#discussion_r1513323999 From coleenp at openjdk.org Tue Mar 5 19:50:10 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 5 Mar 2024 19:50:10 GMT Subject: RFR: 8308745: ObjArrayKlass::allocate_objArray_klass may call into java while holding a lock Message-ID: This change creates a new sort of native recursive lock that can be held during JNI and Java calls, which can be used for synchronization while creating objArrayKlasses at runtime. Passes tier1-4. ------------- Commit messages: - 8308745: ObjArrayKlass::allocate_objArray_klass may call into java while holding a lock Changes: https://git.openjdk.org/jdk/pull/17739/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17739&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8308745 Stats: 167 lines in 11 files changed: 87 ins; 47 del; 33 mod Patch: https://git.openjdk.org/jdk/pull/17739.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17739/head:pull/17739 PR: https://git.openjdk.org/jdk/pull/17739 From coleenp at openjdk.org Tue Mar 5 19:50:10 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 5 Mar 2024 19:50:10 GMT Subject: RFR: 8308745: ObjArrayKlass::allocate_objArray_klass may call into java while holding a lock In-Reply-To: References: Message-ID: <4txhdBn1XABNVZaKM87JCepatOS9Hf4DekJiTwazPL4=.2de6834f-dfdc-4342-a311-46a13071cf55@github.com> On Tue, 6 Feb 2024 22:59:04 GMT, Coleen Phillimore wrote: > This change creates a new sort of native recursive lock that can be held during JNI and Java calls, which can be used for synchronization while creating objArrayKlasses at runtime. > > Passes tier1-4. Thanks for looking at this WIP. 
------------- PR Review: https://git.openjdk.org/jdk/pull/17739#pullrequestreview-1876761135 From dholmes at openjdk.org Tue Mar 5 19:50:10 2024 From: dholmes at openjdk.org (David Holmes) Date: Tue, 5 Mar 2024 19:50:10 GMT Subject: RFR: 8308745: ObjArrayKlass::allocate_objArray_klass may call into java while holding a lock In-Reply-To: References: Message-ID: On Tue, 6 Feb 2024 22:59:04 GMT, Coleen Phillimore wrote: > This change creates a new sort of native recursive lock that can be held during JNI and Java calls, which can be used for synchronization while creating objArrayKlasses at runtime. > > Passes tier1-4. src/hotspot/share/classfile/classLoaderData.cpp line 412: > 410: // To call this, one must have the MultiArray_lock held, but the _klasses list still has lock free reads. > 411: assert_locked_or_safepoint(MultiArray_lock); > 412: Do we no longer hold the lock or are you just missing the API to assert it is held? src/hotspot/share/oops/arrayKlass.cpp line 141: > 139: ObjArrayKlass::allocate_objArray_klass(class_loader_data(), dim + 1, this, CHECK_NULL); > 140: // use 'release' to pair with lock-free load > 141: release_set_higher_dimension(ak); Why has this code changed? I only expected to see the lock changed. src/hotspot/share/prims/jvmtiExport.cpp line 3151: > 3149: if (MultiArray_lock->owner() == thread) { > 3150: return false; > 3151: } So the recursive nature of the lock now means we don't have to bail out here - right? src/hotspot/share/prims/jvmtiGetLoadedClasses.cpp line 107: > 105: // To get a consistent list of classes we need MultiArray_lock to ensure > 106: // array classes aren't created. > 107: MutexLocker ma(MultiArray_lock); Unclear why this is no longer the case. src/hotspot/share/runtime/recursiveLock.cpp line 45: > 43: _recursions++; > 44: assert(_recursions == 1, "should be"); > 45: Atomic::release_store(&_owner, current); // necessary? No release necessary. 
The only thing written since the sem.wait is recursions and it is only updated or needed by the owning thread. src/hotspot/share/runtime/recursiveLock.cpp line 54: > 52: _recursions--; > 53: if (_recursions == 0) { > 54: Atomic::release_store(&_owner, (Thread*)nullptr); // necessary? No release necessary. src/hotspot/share/runtime/recursiveLock.hpp line 33: > 31: // There are also no checks that the recursive lock is not held when going to Java or to JNI, like > 32: // other JVM mutexes have. This should be used only for cases where the alternatives with all the > 33: // nice safety features don't work. Mention that it does interact with safepoints properly for JavaThreads src/hotspot/share/runtime/recursiveLock.hpp line 45: > 43: void unlock(Thread* current); > 44: }; > 45: Should expose a `holdsLock` method to allow use in assertions. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17739#discussion_r1483613181 PR Review Comment: https://git.openjdk.org/jdk/pull/17739#discussion_r1483614823 PR Review Comment: https://git.openjdk.org/jdk/pull/17739#discussion_r1483623614 PR Review Comment: https://git.openjdk.org/jdk/pull/17739#discussion_r1483624460 PR Review Comment: https://git.openjdk.org/jdk/pull/17739#discussion_r1483608746 PR Review Comment: https://git.openjdk.org/jdk/pull/17739#discussion_r1483609712 PR Review Comment: https://git.openjdk.org/jdk/pull/17739#discussion_r1483610648 PR Review Comment: https://git.openjdk.org/jdk/pull/17739#discussion_r1483612492 From coleenp at openjdk.org Tue Mar 5 19:50:11 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 5 Mar 2024 19:50:11 GMT Subject: RFR: 8308745: ObjArrayKlass::allocate_objArray_klass may call into java while holding a lock In-Reply-To: References: Message-ID: On Thu, 8 Feb 2024 21:30:48 GMT, David Holmes wrote: >> This change creates a new sort of native recursive lock that can be held during JNI and Java calls, which can be used for synchronization while creating 
objArrayKlasses at runtime. >> >> Passes tier1-4. > > src/hotspot/share/classfile/classLoaderData.cpp line 412: > >> 410: // To call this, one must have the MultiArray_lock held, but the _klasses list still has lock free reads. >> 411: assert_locked_or_safepoint(MultiArray_lock); >> 412: > > Do we no longer hold the lock or are you just missing the API to assert it is held? We no longer hold the lock in the callers to iterate through the CLD::_klasses list. It was never needed. CLD::_klasses has iterators that are lock free (atomics). > src/hotspot/share/oops/arrayKlass.cpp line 141: > >> 139: ObjArrayKlass::allocate_objArray_klass(class_loader_data(), dim + 1, this, CHECK_NULL); >> 140: // use 'release' to pair with lock-free load >> 141: release_set_higher_dimension(ak); > > Why has this code changed? I only expected to see the lock changed. The assert is dumb, leftover from when we didn't have C++ types (only klassOop). Of course it's an objArrayKlass, that's its type! The higher dimension should be set in the constructor of ObjArrayKlass. Every version of this change, I move this assignment there. > src/hotspot/share/prims/jvmtiExport.cpp line 3151: > >> 3149: if (MultiArray_lock->owner() == thread) { >> 3150: return false; >> 3151: } > > So the recursive nature of the lock now means we don't have to bail out here - right? Yes. If it were only the JNI upcall that was broken while holding the MultiArray_lock, I think I could have used this same approach. Unfortunately, it's also throwing OOM for metaspace and for Java heap :( > src/hotspot/share/prims/jvmtiGetLoadedClasses.cpp line 107: > >> 105: // To get a consistent list of classes we need MultiArray_lock to ensure >> 106: // array classes aren't created. >> 107: MutexLocker ma(MultiArray_lock); > > Unclear why this is no longer the case. The CLD::_klasses list is iterated through lock free with atomics.
Older code used to iterate through the system dictionary which only had InstanceKlasses and then used to iterate through the arrays in InstanceKlass::_array_klasses. Which required the MultiArray_lock protecting it. There used to be an array_klasses_do() function. Now all the klasses are in the CLD::_klasses list and traversed atomically. > src/hotspot/share/runtime/recursiveLock.cpp line 45: > >> 43: _recursions++; >> 44: assert(_recursions == 1, "should be"); >> 45: Atomic::release_store(&_owner, current); // necessary? > > No release necessary. The only thing written since the sem.wait is recursions and it is only updated or needed by the owning thread. ok thanks. > src/hotspot/share/runtime/recursiveLock.cpp line 54: > >> 52: _recursions--; >> 53: if (_recursions == 0) { >> 54: Atomic::release_store(&_owner, (Thread*)nullptr); // necessary? > > No release necessary. ok, thanks. > src/hotspot/share/runtime/recursiveLock.hpp line 33: > >> 31: // There are also no checks that the recursive lock is not held when going to Java or to JNI, like >> 32: // other JVM mutexes have. This should be used only for cases where the alternatives with all the >> 33: // nice safety features don't work. > > Mention that it does interact with safepoints properly for JavaThreads Added a comment. > src/hotspot/share/runtime/recursiveLock.hpp line 45: > >> 43: void unlock(Thread* current); >> 44: }; >> 45: > > Should expose a `holdsLock` method to allow use in assertions. Done. 
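To make the review points above concrete (owner and recursion count only matter to the owning thread, no release store needed, and a `holdsLock`-style query for assertions), here is a rough standalone sketch in portable C++. It is not the `RecursiveLock` from the PR: `RecursiveLockSketch` and `holds_lock` are hypothetical names, a `std::mutex` stands in for the semaphore, `std::thread::id` stands in for `Thread*`, and there is no safepoint interaction at all.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical sketch of a recursive lock built on a non-recursive primitive.
class RecursiveLockSketch {
  std::mutex _sem;                                        // stands in for the semaphore
  std::atomic<std::thread::id> _owner{std::thread::id()}; // default id means "no owner"
  int _recursions = 0;                                    // only touched by the owner

 public:
  void lock() {
    const std::thread::id self = std::this_thread::get_id();
    // A thread can only ever observe its own id here if it stored it itself,
    // so a relaxed load is enough for the recursion check.
    if (_owner.load(std::memory_order_relaxed) == self) {
      _recursions++;
      return;
    }
    _sem.lock();      // the underlying primitive provides the ordering
    _recursions = 1;
    _owner.store(self, std::memory_order_relaxed);
  }

  void unlock() {
    assert(holds_lock());
    if (--_recursions == 0) {
      // Clear the owner before handing the primitive to the next thread.
      _owner.store(std::thread::id(), std::memory_order_relaxed);
      _sem.unlock();
    }
  }

  // Intended for assertions: true only when called by the owning thread.
  bool holds_lock() const {
    return _owner.load(std::memory_order_relaxed) == std::this_thread::get_id();
  }
};
```

The relaxed accesses mirror the "No release necessary" remarks: everything other threads need to see is published by the underlying lock/unlock pair, and the owner field only has to answer "is it me?" for the calling thread.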
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17739#discussion_r1487056926 PR Review Comment: https://git.openjdk.org/jdk/pull/17739#discussion_r1487058194 PR Review Comment: https://git.openjdk.org/jdk/pull/17739#discussion_r1487060684 PR Review Comment: https://git.openjdk.org/jdk/pull/17739#discussion_r1487062202 PR Review Comment: https://git.openjdk.org/jdk/pull/17739#discussion_r1487063071 PR Review Comment: https://git.openjdk.org/jdk/pull/17739#discussion_r1487063156 PR Review Comment: https://git.openjdk.org/jdk/pull/17739#discussion_r1492431178 PR Review Comment: https://git.openjdk.org/jdk/pull/17739#discussion_r1492430934 From coleenp at openjdk.org Tue Mar 5 19:50:11 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 5 Mar 2024 19:50:11 GMT Subject: RFR: 8308745: ObjArrayKlass::allocate_objArray_klass may call into java while holding a lock In-Reply-To: References: Message-ID: <2Z8YrflN2HIPhBO8JocpEQz6CEmjROczFOznVQbToeY=.3ba2e364-e48e-46ea-9db6-d9fe9725b553@github.com> On Tue, 13 Feb 2024 02:07:54 GMT, Coleen Phillimore wrote: >> src/hotspot/share/oops/arrayKlass.cpp line 141: >> >>> 139: ObjArrayKlass::allocate_objArray_klass(class_loader_data(), dim + 1, this, CHECK_NULL); >>> 140: // use 'release' to pair with lock-free load >>> 141: release_set_higher_dimension(ak); >> >> Why has this code changed? I only expected to see the lock changed. > > The assert is dumb, leftover from when we didn't have C++ types (only klassOop). Of course it's an objArrayKlass, that's its type! The higher dimension should be set in the constructor of ObjArrayKlass. Every version of this change, I move this assignment there. Changing this like this makes it more similar to the InstanceKlass::array_klass function in structure, and there was unnecessary { } scopes in both and an unnecessary ResourceMark. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17739#discussion_r1487058776 From ccheung at openjdk.org Tue Mar 5 20:43:45 2024 From: ccheung at openjdk.org (Calvin Cheung) Date: Tue, 5 Mar 2024 20:43:45 GMT Subject: RFR: 8327138: Clean up status management in cdsConfig.hpp and CDS.java In-Reply-To: References: Message-ID: On Sat, 2 Mar 2024 01:18:06 GMT, Ioi Lam wrote: > A few clean ups: > > 1. Rename functions like `is_loading_full_module_graph()` to `is_using_full_module_graph()`. The meaning of "loading" is not clear: it might be interpreted as to cover only the period where the artifact is being loaded, but not the period after the artifact is completely loaded. However, the function is meant to cover both periods, so "using" is a more precise term. > > 2. The cumbersome sounding `disable_loading_full_module_graph()` is changed to `stop_using_full_module_graph()`, etc. > > 3. The status of `is_using_optimized_module_handling()` is moved from metaspaceShared.hpp to cdsConfig.hpp, to be consolidated with other types of CDS status. > > 4. The status of CDS was communicated to the Java class `jdk.internal.misc.CDS` by ad-hoc native methods. This is now changed to a single method, `CDS.getCDSConfigStatus()` that returns a bit field. That way we don't need to add a new native method for each type of status. > > 5. `CDS.isDumpingClassList()` was a misnomer. It's changed to `CDS.isLoggingLambdaFormInvokers()`. Looks good. Just one nit below. I assume that it has gone through some testing? src/hotspot/share/cds/cdsConfig.hpp line 94: > 92: static bool is_dumping_full_module_graph() { return CDS_ONLY(_is_dumping_full_module_graph) NOT_CDS(false); } > 93: static void stop_using_full_module_graph(const char* reason = nullptr) NOT_CDS_JAVA_HEAP_RETURN; > 94: static bool is_using_full_module_graph() NOT_CDS_JAVA_HEAP_RETURN_(false); Please align this with line #90. ------------- Marked as reviewed by ccheung (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/18095#pullrequestreview-1918091197 PR Review Comment: https://git.openjdk.org/jdk/pull/18095#discussion_r1513475007 From dlong at openjdk.org Tue Mar 5 23:15:49 2024 From: dlong at openjdk.org (Dean Long) Date: Tue, 5 Mar 2024 23:15:49 GMT Subject: RFR: 8308745: ObjArrayKlass::allocate_objArray_klass may call into java while holding a lock In-Reply-To: References: Message-ID: On Tue, 6 Feb 2024 22:59:04 GMT, Coleen Phillimore wrote: > This change creates a new sort of native recursive lock that can be held during JNI and Java calls, which can be used for synchronization while creating objArrayKlasses at runtime. > > Passes tier1-7. Is the caller always a JavaThread? I'm wondering if your new recursive lock class could use the existing ObjectMonitor. I thought David asked the same question, but I can't find it. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17739#issuecomment-1979796523 From iklam at openjdk.org Tue Mar 5 23:40:46 2024 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 5 Mar 2024 23:40:46 GMT Subject: RFR: 8327138: Clean up status management in cdsConfig.hpp and CDS.java In-Reply-To: References: Message-ID: On Tue, 5 Mar 2024 20:39:10 GMT, Calvin Cheung wrote: >> A few clean ups: >> >> 1. Rename functions like `is_loading_full_module_graph()` to `is_using_full_module_graph()`. The meaning of "loading" is not clear: it might be interpreted as to cover only the period where the artifact is being loaded, but not the period after the artifact is completely loaded. However, the function is meant to cover both periods, so "using" is a more precise term. >> >> 2. The cumbersome sounding `disable_loading_full_module_graph()` is changed to `stop_using_full_module_graph()`, etc. >> >> 3. The status of `is_using_optimized_module_handling()` is moved from metaspaceShared.hpp to cdsConfig.hpp, to be consolidated with other types of CDS status. >> >> 4.
The status of CDS was communicated to the Java class `jdk.internal.misc.CDS` by ad-hoc native methods. This is now changed to a single method, `CDS.getCDSConfigStatus()` that returns a bit field. That way we don't need to add a new native method for each type of status. >> >> 5. `CDS.isDumpingClassList()` was a misnomer. It's changed to `CDS.isLoggingLambdaFormInvokers()`. > > src/hotspot/share/cds/cdsConfig.hpp line 94: > >> 92: static bool is_dumping_full_module_graph() { return CDS_ONLY(_is_dumping_full_module_graph) NOT_CDS(false); } >> 93: static void stop_using_full_module_graph(const char* reason = nullptr) NOT_CDS_JAVA_HEAP_RETURN; >> 94: static bool is_using_full_module_graph() NOT_CDS_JAVA_HEAP_RETURN_(false); > > Please align this with line #90. Do you mean aligning like this: static bool is_dumping_heap() NOT_CDS_JAVA_HEAP_RETURN_(false); static void stop_dumping_full_module_graph(const char* reason = nullptr) NOT_CDS_JAVA_HEAP_RETURN; static bool is_dumping_full_module_graph() { return CDS_ONLY(_is_dumping_full_module_graph) NOT_CDS(false); } static void stop_using_full_module_graph(const char* reason = nullptr) NOT_CDS_JAVA_HEAP_RETURN; static bool is_using_full_module_graph() NOT_CDS_JAVA_HEAP_RETURN_(false); I think this is more messy, and you can't easily tell that these four functions are all related to the same feature (full_module_graph). 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18095#discussion_r1513626841 From ccheung at openjdk.org Tue Mar 5 23:51:45 2024 From: ccheung at openjdk.org (Calvin Cheung) Date: Tue, 5 Mar 2024 23:51:45 GMT Subject: RFR: 8327138: Clean up status management in cdsConfig.hpp and CDS.java In-Reply-To: References: Message-ID: On Tue, 5 Mar 2024 23:38:21 GMT, Ioi Lam wrote: >> src/hotspot/share/cds/cdsConfig.hpp line 94: >> >>> 92: static bool is_dumping_full_module_graph() { return CDS_ONLY(_is_dumping_full_module_graph) NOT_CDS(false); } >>> 93: static void stop_using_full_module_graph(const char* reason = nullptr) NOT_CDS_JAVA_HEAP_RETURN; >>> 94: static bool is_using_full_module_graph() NOT_CDS_JAVA_HEAP_RETURN_(false); >> >> Please align this with line #90. > > Do you mean aligning like this: > > > static bool is_dumping_heap() NOT_CDS_JAVA_HEAP_RETURN_(false); > static void stop_dumping_full_module_graph(const char* reason = nullptr) NOT_CDS_JAVA_HEAP_RETURN; > static bool is_dumping_full_module_graph() { return CDS_ONLY(_is_dumping_full_module_graph) NOT_CDS(false); } > static void stop_using_full_module_graph(const char* reason = nullptr) NOT_CDS_JAVA_HEAP_RETURN; > static bool is_using_full_module_graph() NOT_CDS_JAVA_HEAP_RETURN_(false); > > > I think this is more messy, and you can't easily tell that these four functions are all related to the same feature (full_module_graph). I meant the following. Just the last line #94 needs to be changed - shift one space to the left after `bool`. 
static bool is_dumping_heap() NOT_CDS_JAVA_HEAP_RETURN_(false); static void stop_dumping_full_module_graph(const char* reason = nullptr) NOT_CDS_JAVA_HEAP_RETURN; static bool is_dumping_full_module_graph() { return CDS_ONLY(_is_dumping_full_module_graph) NOT_CDS(false); } static void stop_using_full_module_graph(const char* reason = nullptr) NOT_CDS_JAVA_HEAP_RETURN; static bool is_using_full_module_graph() NOT_CDS_JAVA_HEAP_RETURN_(false); ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18095#discussion_r1513633931 From sspitsyn at openjdk.org Wed Mar 6 00:40:52 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 6 Mar 2024 00:40:52 GMT Subject: RFR: 8256314: JVM TI GetCurrentContendedMonitor is implemented incorrectly [v6] In-Reply-To: References: <-Yf2Qj_kbyZAA-LDK96IIOcHWMh8__QT1XfnD9jR8kU=.507d5503-1544-44fb-9e81-438ba3458dcd@github.com> Message-ID: On Tue, 5 Mar 2024 18:39:43 GMT, Leonid Mesnik wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> review: added new internal function JvmtiEnvBase::get_thread_or_vthread_state > > test/hotspot/jtreg/vmTestbase/nsk/jdi/ObjectReference/waitingThreads/waitingthreads004a.java line 106: > >> 104: display("entered and notifyAll: synchronized (lockingObject) {}"); >> 105: lockingObject.notifyAll(); >> 106: > > Please update test documentation in TestDescription. line: > - An object with threads waiting in Object.wait(long) method. > should be updated/or another one added. Thanks, updated `TestDescription.java` now. 
> test/hotspot/jtreg/vmTestbase/nsk/jdwp/ThreadReference/CurrentContendedMonitor/curcontmonitor001a.java line 88: > >> 86: // ensure that tested thread is waiting for monitor object >> 87: synchronized (TestedClass.thread.monitor) { >> 88: TestedClass.thread.monitor.notifyAll(); > > You need to update test documentation in TestDescription.java to explicitly say that test not "waiting" but exit from wait and waiting for monitor (contended). Thanks, updated `TestDescription.java` now. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17944#discussion_r1513663234 PR Review Comment: https://git.openjdk.org/jdk/pull/17944#discussion_r1513663121 From iklam at openjdk.org Wed Mar 6 00:45:50 2024 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 6 Mar 2024 00:45:50 GMT Subject: RFR: 8327093: Add slice function to BitMap API [v2] In-Reply-To: References: <61cOLItjj-oAi8BMgoWj3cFY-ga0OZnkcrx-MKtLxKM=.54146205-b97e-4aef-abb2-80d732f801dd@github.com> Message-ID: On Tue, 5 Mar 2024 18:32:33 GMT, Matias Saavedra Silva wrote: > I think it should be fine to keep `copy_of_range(n)` to signify `[n:]` since it is acting more inline with functions like `slice`. But no one uses `copy_of_range(n)` today, so I think we shouldn't add such a variant until the need actually arises. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18092#issuecomment-1979875495 From sspitsyn at openjdk.org Wed Mar 6 01:07:59 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 6 Mar 2024 01:07:59 GMT Subject: RFR: 8256314: JVM TI GetCurrentContendedMonitor is implemented incorrectly [v7] In-Reply-To: <-Yf2Qj_kbyZAA-LDK96IIOcHWMh8__QT1XfnD9jR8kU=.507d5503-1544-44fb-9e81-438ba3458dcd@github.com> References: <-Yf2Qj_kbyZAA-LDK96IIOcHWMh8__QT1XfnD9jR8kU=.507d5503-1544-44fb-9e81-438ba3458dcd@github.com> Message-ID: > The implementation of the JVM TI `GetCurrentContendedMonitor()` does not match the spec. 
It can sometimes return an incorrect information about the contended monitor. Such a behavior does not match the function spec. > With this update the `GetCurrentContendedMonitor()` is returning the monitor only when the specified thread is waiting to enter or re-enter the monitor, and the monitor is not returned when the specified thread is waiting in the `java.lang.Object.wait` to be notified. > > The implementations of the JDWP `ThreadReference.CurrentContendedMonitor` command and JDI `ThreadReference.currentContendedMonitor()` method are based and depend on this JVMTI function. The JDWP command and the JDI method were specified incorrectly and had incorrect behavior. The fix slightly corrects both the JDWP and JDI specs. The JDWP and JDI implementations have been also fixed because they use this JVM TI update. Please, see and review the related CSR and Release-Note. > > CSR: [8326024](https://bugs.openjdk.org/browse/JDK-8326024): JVM TI GetCurrentContendedMonitor is implemented incorrectly > RN: [8326038](https://bugs.openjdk.org/browse/JDK-8326038): Release Note: JVM TI GetCurrentContendedMonitor is implemented incorrectly > > Testing: > - tested with the mach5 tiers 1-6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: updated test desciption for two modified tests ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17944/files - new: https://git.openjdk.org/jdk/pull/17944/files/9b69af50..289f6ff7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17944&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17944&range=05-06 Stats: 5 lines in 2 files changed: 2 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/17944.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17944/head:pull/17944 PR: https://git.openjdk.org/jdk/pull/17944 From iklam at openjdk.org Wed Mar 6 02:00:47 2024 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 6 Mar 2024 02:00:47 GMT 
Subject: RFR: 8327138: Clean up status management in cdsConfig.hpp and CDS.java In-Reply-To: References: Message-ID: On Tue, 5 Mar 2024 23:49:02 GMT, Calvin Cheung wrote: >> Do you mean aligning like this: >> >> >> static bool is_dumping_heap() NOT_CDS_JAVA_HEAP_RETURN_(false); >> static void stop_dumping_full_module_graph(const char* reason = nullptr) NOT_CDS_JAVA_HEAP_RETURN; >> static bool is_dumping_full_module_graph() { return CDS_ONLY(_is_dumping_full_module_graph) NOT_CDS(false); } >> static void stop_using_full_module_graph(const char* reason = nullptr) NOT_CDS_JAVA_HEAP_RETURN; >> static bool is_using_full_module_graph() NOT_CDS_JAVA_HEAP_RETURN_(false); >> >> >> I think this is more messy, and you can't easily tell that these four functions are all related to the same feature (full_module_graph). > > I meant the following. Just the last line #94 needs to be changed - shift one space to the left after `bool`. > > static bool is_dumping_heap() NOT_CDS_JAVA_HEAP_RETURN_(false); > static void stop_dumping_full_module_graph(const char* reason = nullptr) NOT_CDS_JAVA_HEAP_RETURN; > static bool is_dumping_full_module_graph() { return CDS_ONLY(_is_dumping_full_module_graph) NOT_CDS(false); } > static void stop_using_full_module_graph(const char* reason = nullptr) NOT_CDS_JAVA_HEAP_RETURN; > static bool is_using_full_module_graph() NOT_CDS_JAVA_HEAP_RETURN_(false); I wanted line 94 to align the "using_full_module_graph" part with the function at line 93. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18095#discussion_r1513713821 From ccheung at openjdk.org Wed Mar 6 02:12:46 2024 From: ccheung at openjdk.org (Calvin Cheung) Date: Wed, 6 Mar 2024 02:12:46 GMT Subject: RFR: 8327138: Clean up status management in cdsConfig.hpp and CDS.java In-Reply-To: References: Message-ID: On Wed, 6 Mar 2024 01:58:26 GMT, Ioi Lam wrote: >> I meant the following. Just the last line #94 needs to be changed - shift one space to the left after `bool`. 
>> >> static bool is_dumping_heap() NOT_CDS_JAVA_HEAP_RETURN_(false); >> static void stop_dumping_full_module_graph(const char* reason = nullptr) NOT_CDS_JAVA_HEAP_RETURN; >> static bool is_dumping_full_module_graph() { return CDS_ONLY(_is_dumping_full_module_graph) NOT_CDS(false); } >> static void stop_using_full_module_graph(const char* reason = nullptr) NOT_CDS_JAVA_HEAP_RETURN; >> static bool is_using_full_module_graph() NOT_CDS_JAVA_HEAP_RETURN_(false); > > I wanted line 94 to align the "using_full_module_graph" part with the function at line 93. In that case, it looks good. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18095#discussion_r1513723384 From iklam at openjdk.org Wed Mar 6 03:32:46 2024 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 6 Mar 2024 03:32:46 GMT Subject: RFR: 8327093: Add slice function to BitMap API [v2] In-Reply-To: References: Message-ID: <5MgXk5nGxEJlP6RMa9JWbbiSxlo7QOD9UScPuIv4_dc=.5b42a435-ecb4-4747-8198-6f95797dc44e@github.com> On Tue, 5 Mar 2024 18:35:58 GMT, Matias Saavedra Silva wrote: >> The BitMap does not currently have a "slice" method which can return a subset of the bitmap defined by a start and end bit. This utility can be used to analyze portions of a bitmap or to overwrite the existing map with a shorter, modified map. >> >> This implementation has two core methods: >> `copy_of_range` allocates and returns a new word array as used internally by BitMap >> `truncate` overwrites the internal array with the one generated by `copy_of_range` so none of the internal workings are exposed to callers outside of BitMap. >> >> This change is verified with an included test. > > Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: > > Axel comments Changes requested by iklam (Reviewer). 
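For readers following this review, the range-copy operation described in the PR summary above can be sketched roughly as follows. This is a standalone illustration, not the HotSpot `BitMap` code: `copy_of_range_sketch`, `word_t`, and `kBitsPerWord` are hypothetical names, and the sketch assumes 64-bit words and a half-open range `[start_bit, end_bit)` that lies within the source array.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical word type and width; HotSpot uses its own bm_word_t/idx_t types.
using word_t = std::uint64_t;
constexpr std::size_t kBitsPerWord = 64;

// Returns a new word array holding bits [start_bit, end_bit) of src,
// re-based so that start_bit becomes bit 0 of the result.
// Precondition (assumed): start_bit <= end_bit <= src.size() * kBitsPerWord.
std::vector<word_t> copy_of_range_sketch(const std::vector<word_t>& src,
                                         std::size_t start_bit,
                                         std::size_t end_bit) {
  const std::size_t nbits  = end_bit - start_bit;
  const std::size_t nwords = (nbits + kBitsPerWord - 1) / kBitsPerWord;
  std::vector<word_t> dst(nwords, 0);

  const std::size_t first_word = start_bit / kBitsPerWord;
  // Every source word is shifted right by the same amount.
  const std::size_t shift = start_bit % kBitsPerWord;

  for (std::size_t i = 0; i < nwords; i++) {
    const word_t lo = src[first_word + i] >> shift;
    word_t hi = 0;
    // Pull the low bits of the next word into the vacated high bits.
    // Guard shift != 0: shifting a 64-bit value by 64 is undefined.
    if (shift != 0 && first_word + i + 1 < src.size()) {
      hi = src[first_word + i + 1] << (kBitsPerWord - shift);
    }
    dst[i] = lo | hi;
  }

  // Clear any bits past the requested length in the last word.
  const std::size_t tail = nbits % kBitsPerWord;
  if (tail != 0) {
    dst[nwords - 1] &= (word_t(1) << tail) - 1;
  }
  return dst;
}
```

The `shift` value is computed once and never modified, which is the kind of variable the review comment below asks to be declared `const`.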
src/hotspot/share/utilities/bitMap.cpp line 118: > 116: > 117: // All words need to be shifted by this amount > 118: idx_t shift = bit_in_word(start_bit); `shift` should be const as well. test/hotspot/gtest/utilities/test_bitMap_truncate.cpp line 83: > 81: fillBitMap(map, BITMAP_SIZE); > 82: testTruncate(0, BITMAP_SIZE, map); > 83: } `testTruncateOneWord()` is pretty easy to understand, but I found it really hard to follow the logic of the other test cases. For example, it took me a while to understand the variable `map` in `testTruncateSame()` is not the same as the `map` in `testTruncate()`. Could all the tests be rewritten to the same style of `testTruncateOneWord()`, where the entire logic is inside the same function? ------------- PR Review: https://git.openjdk.org/jdk/pull/18092#pullrequestreview-1918661249 PR Review Comment: https://git.openjdk.org/jdk/pull/18092#discussion_r1513767301 PR Review Comment: https://git.openjdk.org/jdk/pull/18092#discussion_r1513760844 From iklam at openjdk.org Wed Mar 6 03:32:47 2024 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 6 Mar 2024 03:32:47 GMT Subject: RFR: 8327093: Add slice function to BitMap API [v2] In-Reply-To: <5MgXk5nGxEJlP6RMa9JWbbiSxlo7QOD9UScPuIv4_dc=.5b42a435-ecb4-4747-8198-6f95797dc44e@github.com> References: <5MgXk5nGxEJlP6RMa9JWbbiSxlo7QOD9UScPuIv4_dc=.5b42a435-ecb4-4747-8198-6f95797dc44e@github.com> Message-ID: <8Kj3NhM5gIl-0JF2h7EjgvXyBGWibE8upN3gbnGuXUA=.2851df0c-4b2f-4226-af46-33cbe32bee02@github.com> On Wed, 6 Mar 2024 03:15:22 GMT, Ioi Lam wrote: >> Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: >> >> Axel comments > > test/hotspot/gtest/utilities/test_bitMap_truncate.cpp line 83: > >> 81: fillBitMap(map, BITMAP_SIZE); >> 82: testTruncate(0, BITMAP_SIZE, map); >> 83: } > > `testTruncateOneWord()` is pretty easy to understand, but I found it really hard to follow the logic of the other test cases. 
For example, it took me a while to understand the variable `map` in `testTruncateSame()` is not the same as the `map` in `testTruncate()`. > > Could all the tests be rewritten to the same style of `testTruncateOneWord()`, where the entire logic is inside the same function? Also, I think the contents of this file should be moved inside test_bitMap.cpp, so you don't need to repeat things like TestArenaBitMap. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18092#discussion_r1513763401 From dholmes at openjdk.org Wed Mar 6 07:00:48 2024 From: dholmes at openjdk.org (David Holmes) Date: Wed, 6 Mar 2024 07:00:48 GMT Subject: RFR: 8308745: ObjArrayKlass::allocate_objArray_klass may call into java while holding a lock In-Reply-To: References: Message-ID: On Tue, 5 Mar 2024 23:13:30 GMT, Dean Long wrote: > I'm wondering if your new recursive lock class could use the existing ObjectMonitor. There has been a drive to remove `ObjectLocker` from the C++ code due to the negative impact on Loom. Also not sure what existing `ObjectMonitor` you are referring to. ?? ------------- PR Comment: https://git.openjdk.org/jdk/pull/17739#issuecomment-1980207637 From jkern at openjdk.org Wed Mar 6 09:59:49 2024 From: jkern at openjdk.org (Joachim Kern) Date: Wed, 6 Mar 2024 09:59:49 GMT Subject: RFR: JDK-8324682 Remove redefinition of NULL for XLC compiler [v3] In-Reply-To: References: Message-ID: On Mon, 4 Mar 2024 09:06:03 GMT, Suchismith Roy wrote: >> Remove redefinition of NULL for XLC compiler >> In globalDefinitions_xlc.hpp there is a redefinition of NULL to work around an issue with one of the AIX headers (). >> Once all uses of NULL in HotSpot have been replaced with nullptr, there is no longer any need for this, and it can be removed. >> >> >> JBS ISSUE : [JDK-8324682](https://bugs.openjdk.org/browse/JDK-8324682) > > Suchismith Roy has refreshed the contents of this pull request, and previous commits have been removed. 
The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: > > remove AIX check for NULL Changes requested by jkern (Author). src/hotspot/share/utilities/globalDefinitions.cpp line 160: > 158: static_assert((size_t)HeapWordSize >= sizeof(juint), > 159: "HeapWord should be at least as large as juint"); > 160: static_assert(sizeof(nullptr) == sizeof(char*), "NULL must be same size as pointer"); After this change that line makes no sense anymore. If there is no NULL left in the complete code, you can remove the line. If there are still some NULLs, then you have to keep the previous state. ------------- PR Review: https://git.openjdk.org/jdk/pull/18064#pullrequestreview-1919279460 PR Review Comment: https://git.openjdk.org/jdk/pull/18064#discussion_r1514171126 From fyang at openjdk.org Wed Mar 6 10:12:49 2024 From: fyang at openjdk.org (Fei Yang) Date: Wed, 6 Mar 2024 10:12:49 GMT Subject: RFR: 8320646: RISC-V: C2 VectorCastHF2F [v6] In-Reply-To: <0vM0lVtB3-SUUb31C0jXhWampvXUwYhuqG9GslvGkgE=.ddce2df0-f504-4060-921e-caf8ec507197@github.com> References: <0vM0lVtB3-SUUb31C0jXhWampvXUwYhuqG9GslvGkgE=.ddce2df0-f504-4060-921e-caf8ec507197@github.com> Message-ID: On Tue, 5 Mar 2024 16:34:13 GMT, Hamlin Li wrote: >> Hi, >> Can you review this patch to add intrinsics for VectorCastHF2F/VectorCastF2HF? >> Thanks! >> >> ## Test >> >> test/jdk/java/lang/Float/Binary16ConversionNaN.java >> test/jdk/java/lang/Float/Binary16Conversion.java >> >> hotspot/jtreg/compiler/intrinsics/float16 >> hotspot/jtreg/compiler/vectorization/TestFloatConversionsVector.java >> hotspot/jtreg/compiler/vectorization/TestFloatConversionsVectorNaN.java > > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > filter tests with zvfh Thanks for the quick update. Several minor comments remain after another look. Looks good otherwise. 
src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1923: > 1921: vcpop_m(t0, v0); > 1922: > 1923: // non-NaN or non-Inf cases, just use built-in instructions. Suggestion: s/non-NaN or non-Inf cases/For non-NaN or non-Inf cases/ src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1927: > 1925: > 1926: // jump to stub processing NaN and Inf cases if there is any of them in the vector-wide. > 1927: bgtz(t0, stub->entry()); I prefer to use `bnez` instead of `bgtz` here to be consistent with NaN check in `C2_MacroAssembler::float_to_float16_v` src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1961: > 1959: // j.l.Float.float16ToFloat > 1960: void C2_MacroAssembler::float_to_float16_v(VectorRegister dst, VectorRegister src, VectorRegister tmp, uint length) { > 1961: assert_different_registers(dst, src, tmp); Maybe rename `tmp` to `vtmp` to indicate that this is temp vector register? src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1964: > 1962: > 1963: auto stub = C2CodeStub::make > 1964: (dst, src, tmp, length, 36, float_to_float16_v_slow_path); You might want to reduce the stub size from 36 to 28 after the stub changes you made. src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1977: > 1975: // and also moving vsetvli_helper(..., mf2) here does not impact vcpop_m. > 1976: vsetvli_helper(BasicType::T_SHORT, length, Assembler::mf2); > 1977: vcpop_m(t0, v0); It looks a bit weird to place this `vcpop_m` here even though it works in functionality. Can we reserve and pass another temp integer register to store the result of this `vcpop_m` and move this it immediately after `vmfne_vv`? Then the code will be more readable to me. src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1979: > 1977: vcpop_m(t0, v0); > 1978: > 1979: // non-NaN cases, just use built-in instructions. 
Suggestion: s/non-NaN cases/For non-NaN cases/ src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.hpp line 191: > 189: > 190: void float16_to_float_v(VectorRegister dst, VectorRegister src, uint length); > 191: void float_to_float16_v(VectorRegister dst, VectorRegister src, VectorRegister tmp, uint length); Can you rename `length` to `vector_length` to be consistent with other assember friends in naming? ------------- Changes requested by fyang (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17698#pullrequestreview-1918718588 PR Review Comment: https://git.openjdk.org/jdk/pull/17698#discussion_r1514141674 PR Review Comment: https://git.openjdk.org/jdk/pull/17698#discussion_r1513984137 PR Review Comment: https://git.openjdk.org/jdk/pull/17698#discussion_r1514051181 PR Review Comment: https://git.openjdk.org/jdk/pull/17698#discussion_r1513873863 PR Review Comment: https://git.openjdk.org/jdk/pull/17698#discussion_r1514135659 PR Review Comment: https://git.openjdk.org/jdk/pull/17698#discussion_r1514140774 PR Review Comment: https://git.openjdk.org/jdk/pull/17698#discussion_r1513799254 From tschatzl at openjdk.org Wed Mar 6 11:37:51 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 6 Mar 2024 11:37:51 GMT Subject: RFR: 8327239: Obsolete unused GCLockerEdenExpansionPercent product option In-Reply-To: References: Message-ID: On Tue, 5 Mar 2024 11:40:26 GMT, Albert Mingkun Yang wrote: >> Hi all, >> >> please review this change to obsolete the unused flag GCLockerEdenExpansionPercent. Please also review the CSR. >> >> Testing: local compilation, grepping sources for occurrences >> >> Thanks, >> Thomas8327239-obsolete-gclockeredenexpansionpercent > > `TestNewRatioFlag.java` uses this flag. Should tests be updated in this PR as well? 
Thanks @albertnetymk @lgxbslgx for your reviews ------------- PR Comment: https://git.openjdk.org/jdk/pull/18117#issuecomment-1980674888 From tschatzl at openjdk.org Wed Mar 6 11:37:51 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 6 Mar 2024 11:37:51 GMT Subject: Integrated: 8327239: Obsolete unused GCLockerEdenExpansionPercent product option In-Reply-To: References: Message-ID: <4kcuAXF-2ChfTQQEx8co31y7ZYM9mkQkFlMXVFv_NYA=.3127847b-3405-4a2d-9f93-187f28a1c80d@github.com> On Tue, 5 Mar 2024 09:41:32 GMT, Thomas Schatzl wrote: > Hi all, > > please review this change to obsolete the unused flag GCLockerEdenExpansionPercent. Please also review the CSR. > > Testing: local compilation, grepping sources for occurrences > > Thanks, > Thomas8327239-obsolete-gclockeredenexpansionpercent This pull request has now been integrated. Changeset: 992104d4 Author: Thomas Schatzl URL: https://git.openjdk.org/jdk/commit/992104d47785e6ce7c1f41ac8e2fd452022b8834 Stats: 9 lines in 5 files changed: 1 ins; 8 del; 0 mod 8327239: Obsolete unused GCLockerEdenExpansionPercent product option Reviewed-by: gli, ayang ------------- PR: https://git.openjdk.org/jdk/pull/18117 From ihse at openjdk.org Wed Mar 6 12:04:17 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Wed, 6 Mar 2024 12:04:17 GMT Subject: RFR: 8327460: Compile tests with the same visibility rules as product code Message-ID: Currently, our symbol visibility handling for tests are sloppy; we only handle it properly on Windows. We need to bring it up to the same levels as product code. This is a prerequisite for [JDK-8327045](https://bugs.openjdk.org/browse/JDK-8327045), which in turn is a building block for Hermetic Java. 
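As background on the symbol visibility rules this change is about, an export macro can be sketched as follows. This is a minimal illustration, not the contents of any JDK header: the macro name `TEST_EXPORT` and the functions `test_add` and `internal_twice` are hypothetical. It assumes a GCC- or Clang-style compiler invoked with `-fvisibility=hidden` on non-Windows platforms, and `__declspec(dllexport)` on Windows.

```cpp
// Hypothetical export macro; the name is illustrative only.
#if defined(_WIN32)
  #define TEST_EXPORT __declspec(dllexport)
#elif defined(__GNUC__)
  #define TEST_EXPORT __attribute__((visibility("default")))
#else
  #define TEST_EXPORT
#endif

// Not marked for export: when the library is compiled with
// -fvisibility=hidden, this symbol is not visible outside the library.
int internal_twice(int x) {
  return 2 * x;
}

// Explicitly exported: stays visible to the dynamic linker (and to code
// that looks the symbol up by name) even under -fvisibility=hidden.
extern "C" TEST_EXPORT int test_add(int a, int b) {
  return a + internal_twice(b) - b;
}
```

With hidden visibility as the default, a native test function that is looked up by name has to carry such a marker; a function without it can be missing from the library's dynamic symbol table.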
------------- Commit messages: - 8327460: Compile tests with the same visibility rules as product code Changes: https://git.openjdk.org/jdk/pull/18135/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18135&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8327460 Stats: 303 lines in 38 files changed: 69 ins; 159 del; 75 mod Patch: https://git.openjdk.org/jdk/pull/18135.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18135/head:pull/18135 PR: https://git.openjdk.org/jdk/pull/18135 From ihse at openjdk.org Wed Mar 6 12:16:45 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Wed, 6 Mar 2024 12:16:45 GMT Subject: RFR: 8327460: Compile tests with the same visibility rules as product code In-Reply-To: References: Message-ID: On Wed, 6 Mar 2024 11:33:30 GMT, Magnus Ihse Bursie wrote: > Currently, our symbol visibility handling for tests are sloppy; we only handle it properly on Windows. We need to bring it up to the same levels as product code. This is a prerequisite for [JDK-8327045](https://bugs.openjdk.org/browse/JDK-8327045), which in turn is a building block for Hermetic Java. Most of these changes just put the `EXPORT` define in `export.h` instead (where support for `__attribute__((visibility("default")))` is also added). However, there are a few other changes worth mentioning: * `test/jdk/java/lang/Thread/jni/AttachCurrentThread/libImplicitAttach.c` was missing an export. This had not been discovered before since that file was not compiled on Windows. * On macOS, the `main` function of a Java launcher needs to be exported, since we re-launch the main function when starting a new thread. (This is utterly weird behavior driven by how on macOS the main thread is reserved for Cocoa events, and I'm not sure this is properly documented by the JNI specification). * The `dereference_null` function from `libTestDwarfHelper.h` is looked for in the Hotspot hs_err stack trace in `test/hotspot/jtreg/runtime/ErrorHandling/TestDwarf.java`. 
When compiling with `-fvisibility=hidden`, it became inlined and the test failed. I solved this by exporting the function, thus restoring the situation as it was before. I'm not sure this is the correct or best solution though, since it depends on what you expect `TestDwarf.java` to really achieve. (You can't expect it to perform miracles and show functions that have been inlined in stack traces, though...) ------------- PR Comment: https://git.openjdk.org/jdk/pull/18135#issuecomment-1980739771 From coleenp at openjdk.org Wed Mar 6 13:01:50 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 6 Mar 2024 13:01:50 GMT Subject: RFR: 8308745: ObjArrayKlass::allocate_objArray_klass may call into java while holding a lock In-Reply-To: References: Message-ID: On Tue, 5 Mar 2024 23:13:30 GMT, Dean Long wrote: >> This change creates a new sort of native recursive lock that can be held during JNI and Java calls, which can be used for synchronization while creating objArrayKlasses at runtime. >> >> Passes tier1-7. > > Is the caller always a JavaThread? I'm wondering if your new recursive lock class could use the existing ObjectMonitor. I thought David asked the same question, but I can't find it. @dean-long An ObjectLocker on the mirror oop for InstanceKlass might work for this, but as @dholmes-ora said, we've been trying to purge the ObjectLocker code from the C++ code because it's complicated by the Java Object monitor project. The JOM project may throw exceptions from the ObjectLocker constructor or destructor and the C++ code doesn't currently know what to do. We removed the ObjectLockers around class linking and some JVMTI cases. There are some in class loading but with a small amount of work, they can be removed also. The caller is always a JavaThread; some lockers are at a safepoint for traversal but the multi-arrays are only created by JavaThreads. 
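To illustrate the general idea of a native recursive lock discussed in this thread, here is a minimal sketch. It is not the class added by the PR under review: `RecursiveLockSketch` and its members are hypothetical, and it is built on standard C++ primitives rather than on HotSpot's own mutex and thread types.

```cpp
#include <atomic>
#include <mutex>
#include <thread>

// Hypothetical recursive lock: an ordinary mutex plus an owner id and a
// recursion count, so the owning thread can re-enter without deadlocking.
class RecursiveLockSketch {
  std::mutex _mutex;
  // A default-constructed thread id means "no owner".
  std::atomic<std::thread::id> _owner{std::thread::id()};
  int _recursions = 0;  // only touched by the owning thread

 public:
  void lock() {
    const std::thread::id self = std::this_thread::get_id();
    // Only this thread ever stores its own id, so seeing it here means
    // this thread already holds the lock.
    if (_owner.load(std::memory_order_relaxed) == self) {
      ++_recursions;
      return;
    }
    _mutex.lock();
    _owner.store(self, std::memory_order_relaxed);
    _recursions = 1;
  }

  // Must be called by the owning thread, once per successful lock().
  void unlock() {
    if (--_recursions == 0) {
      _owner.store(std::thread::id(), std::memory_order_relaxed);
      _mutex.unlock();
    }
  }

  // Recursion depth as seen by the owning thread.
  int recursions() const { return _recursions; }
};
```

Because the sketch never ties itself to a Java object monitor, holding it does not interact with Java-level locking; whether the real lock has that property is stated in the PR description, not established by this sketch.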
------------- PR Comment: https://git.openjdk.org/jdk/pull/17739#issuecomment-1980812019 From sroy at openjdk.org Wed Mar 6 13:02:01 2024 From: sroy at openjdk.org (Suchismith Roy) Date: Wed, 6 Mar 2024 13:02:01 GMT Subject: RFR: JDK-8324682 Remove redefinition of NULL for XLC compiler [v4] In-Reply-To: References: Message-ID: > Remove redefinition of NULL for XLC compiler > In globalDefinitions_xlc.hpp there is a redefinition of NULL to work around an issue with one of the AIX headers (). > Once all uses of NULL in HotSpot have been replaced with nullptr, there is no longer any need for this, and it can be removed. > > > JBS ISSUE : [JDK-8324682](https://bugs.openjdk.org/browse/JDK-8324682) Suchismith Roy has updated the pull request incrementally with one additional commit since the last revision: Remove null check ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18064/files - new: https://git.openjdk.org/jdk/pull/18064/files/21fd7a2c..60bdf794 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18064&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18064&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18064.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18064/head:pull/18064 PR: https://git.openjdk.org/jdk/pull/18064 From ihse at openjdk.org Wed Mar 6 13:43:00 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Wed, 6 Mar 2024 13:43:00 GMT Subject: RFR: 8327460: Compile tests with the same visibility rules as product code [v2] In-Reply-To: References: Message-ID: > Currently, our symbol visibility handling for tests are sloppy; we only handle it properly on Windows. We need to bring it up to the same levels as product code. This is a prerequisite for [JDK-8327045](https://bugs.openjdk.org/browse/JDK-8327045), which in turn is a building block for Hermetic Java. 
Magnus Ihse Bursie has updated the pull request incrementally with one additional commit since the last revision: Update line number for dereference_null in TestDwarf ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18135/files - new: https://git.openjdk.org/jdk/pull/18135/files/1dc79758..b1bf01ad Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18135&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18135&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18135.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18135/head:pull/18135 PR: https://git.openjdk.org/jdk/pull/18135 From erikj at openjdk.org Wed Mar 6 13:51:46 2024 From: erikj at openjdk.org (Erik Joelsson) Date: Wed, 6 Mar 2024 13:51:46 GMT Subject: RFR: 8327460: Compile tests with the same visibility rules as product code [v2] In-Reply-To: References: Message-ID: On Wed, 6 Mar 2024 13:43:00 GMT, Magnus Ihse Bursie wrote: >> Currently, our symbol visibility handling for tests are sloppy; we only handle it properly on Windows. We need to bring it up to the same levels as product code. This is a prerequisite for [JDK-8327045](https://bugs.openjdk.org/browse/JDK-8327045), which in turn is a building block for Hermetic Java. > > Magnus Ihse Bursie has updated the pull request incrementally with one additional commit since the last revision: > > Update line number for dereference_null in TestDwarf Build system changes look good to me. The specific test changes probably need other reviewers. I assume you have run all the affected tests on a wide variety of platforms? ------------- Marked as reviewed by erikj (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18135#pullrequestreview-1919814110 From ihse at openjdk.org Wed Mar 6 13:55:46 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Wed, 6 Mar 2024 13:55:46 GMT Subject: RFR: 8327460: Compile tests with the same visibility rules as product code [v2] In-Reply-To: References: Message-ID: On Wed, 6 Mar 2024 13:43:00 GMT, Magnus Ihse Bursie wrote: >> Currently, our symbol visibility handling for tests are sloppy; we only handle it properly on Windows. We need to bring it up to the same levels as product code. This is a prerequisite for [JDK-8327045](https://bugs.openjdk.org/browse/JDK-8327045), which in turn is a building block for Hermetic Java. > > Magnus Ihse Bursie has updated the pull request incrementally with one additional commit since the last revision: > > Update line number for dereference_null in TestDwarf I have run the tests in tier1, tier2 and tier3 on the Oracle internal CI, which includes linux/x64, linux/aarch64, windows/x64, macosx/x64 and macosx/aarch64. I am currently running tier 4-10; this will take a while (and I won't integrate until this has finished running). When running multiple high-tier tests like this, the likelihood that I encounter intermittent problems increases. I will ignore test failures that seem likely to be intermittent and not caused by this patch. I might be overdoing the testing, but since this change affects **all** native tests, I want to be absolutely sure I don't break any tests that only execute as part of a high tier. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18135#issuecomment-1980924971 From ihse at openjdk.org Wed Mar 6 17:19:47 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Wed, 6 Mar 2024 17:19:47 GMT Subject: RFR: 8327389: Remove use of HOTSPOT_BUILD_USER In-Reply-To: References: Message-ID: <0xgoI47fdqsPHnjiCtHKO72SeGA_PGu3CAgBYZs-gBc=.8d390c92-1642-4dfd-8527-4b4b36715915@github.com> On Wed, 6 Mar 2024 11:57:27 GMT, Andrew John Hughes wrote: > The HotSpot code has inserted the value of `HOTSPOT_BUILD_USER` into the internal version string since before the JDK was open-sourced, so the reasoning for this is not in the public history. See https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/abstract_vm_version.cpp#L284 > > ~~~ > $ /usr/lib/jvm/java-21-openjdk/bin/java -Xinternalversion > OpenJDK 64-Bit Server VM (21.0.2+13) for linux-amd64 JRE (21.0.2+13), built on 2024-01-19T16:39:52Z by "mockbuild" with gcc 8.5.0 20210514 (Red Hat 8.5.0-20) > ~~~ > > It is not clear what value this provides. By default, the value is set to the system username, but it can be set to a static value using `--with-build-user`. > > In trying to compare builds, this is a source of difference if the builds are produced under different usernames and `--with-build-user` is not set to the same value. > > It may also be a source of information leakage from the host system. It is not clear what value it provides to the end user of the build. > > This patch proposes removing it altogether, but at least knowing why it needs to be there would be progress from where we are now. We could also consider a middle ground of only setting it for "adhoc" builds, those without set version information, as is the case with the overall version output. 
> > With this patch: > > ~~~ > $ ~/builder/23/images/jdk/bin/java -Xinternalversion > OpenJDK 64-Bit Server VM (fastdebug 23-internal-adhoc.andrew.jdk) for linux-amd64 JRE (23-internal-adhoc.andrew.jdk), built on 2024-03-06T00:23:37Z with gcc 13.2.1 20230826 > > $ ~/builder/23/images/jdk/bin/java -XX:ErrorHandlerTest=1 > > $ grep '^vm_info' /localhome/andrew/projects/openjdk/upstream/jdk/hs_err_pid1119824.log > vm_info: OpenJDK 64-Bit Server VM (fastdebug 23-internal-adhoc.andrew.jdk) for linux-amd64 JRE (23-internal-adhoc.andrew.jdk), built on 2024-03-06T00:23:37Z with gcc 13.2.1 20230826 > ~~~ > > The above are from a fastdebug build but I also built a product build with no issues. I agree that it is good to get rid of this. However, the reason it has stuck around for so long is, eh, that it has been around for so long, so it is a bit unclear what or who could be relying on this. Let's not be too eager to push this, and collect a bit more information about this. ------------- Marked as reviewed by ihse (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18136#pullrequestreview-1920369274 From ihse at openjdk.org Wed Mar 6 17:24:50 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Wed, 6 Mar 2024 17:24:50 GMT Subject: RFR: 8327389: Remove use of HOTSPOT_BUILD_USER In-Reply-To: References: Message-ID: <1WzRK8Trhlpv0c_S2vGScX1xG9LBP9KPWg0QDloBjf8=.e3b34f7a-d2e2-40c0-8dd4-f53dc90507f3@github.com> On Wed, 6 Mar 2024 11:57:27 GMT, Andrew John Hughes wrote: > The HotSpot code has inserted the value of `HOTSPOT_BUILD_USER` into the internal version string since before the JDK was open-sourced, so the reasoning for this is not in the public history. 
See https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/abstract_vm_version.cpp#L284 > > ~~~ > $ /usr/lib/jvm/java-21-openjdk/bin/java -Xinternalversion > OpenJDK 64-Bit Server VM (21.0.2+13) for linux-amd64 JRE (21.0.2+13), built on 2024-01-19T16:39:52Z by "mockbuild" with gcc 8.5.0 20210514 (Red Hat 8.5.0-20) > ~~~ > > It is not clear what value this provides. By default, the value is set to the system username, but it can be set to a static value using `--with-build-user`. > > In trying to compare builds, this is a source of difference if the builds are produced under different usernames and `--with-build-user` is not set to the same value. > > It may also be a source of information leakage from the host system. It is not clear what value it provides to the end user of the build. > > This patch proposes removing it altogether, but at least knowing why it needs to be there would be progress from where we are now. We could also consider a middle ground of only setting it for "adhoc" builds, those without set version information, as is the case with the overall version output. > > With this patch: > > ~~~ > $ ~/builder/23/images/jdk/bin/java -Xinternalversion > OpenJDK 64-Bit Server VM (fastdebug 23-internal-adhoc.andrew.jdk) for linux-amd64 JRE (23-internal-adhoc.andrew.jdk), built on 2024-03-06T00:23:37Z with gcc 13.2.1 20230826 > > $ ~/builder/23/images/jdk/bin/java -XX:ErrorHandlerTest=1 > > $ grep '^vm_info' /localhome/andrew/projects/openjdk/upstream/jdk/hs_err_pid1119824.log > vm_info: OpenJDK 64-Bit Server VM (fastdebug 23-internal-adhoc.andrew.jdk) for linux-amd64 JRE (23-internal-adhoc.andrew.jdk), built on 2024-03-06T00:23:37Z with gcc 13.2.1 20230826 > ~~~ > > The above are from a fastdebug build but I also built a product build with no issues. What testing have you run? I imagine there are tests that check/read the Hotspot version string. 
In fact, the more I think about it, the more I believe a better approach is to replace the variable user name with a fixed dummy replacement as a first step. That would only require verifying that nobody really cares about the user name. Then we can, as a follow-up, change the version string (if we really care) -- this has a much higher chance of breakage, and it is not nearly as important to solve, so it can easily be dropped if it causes problems. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18136#issuecomment-1981394987 From ihse at openjdk.org Wed Mar 6 17:24:49 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Wed, 6 Mar 2024 17:24:49 GMT Subject: RFR: 8327389: Remove use of HOTSPOT_BUILD_USER In-Reply-To: References: Message-ID: <0_NOdLGVPh_j1qEa_VtTcYXqWhFMr4cUD67cL-TQNiI=.87cf982f-e9f4-4929-9a9f-ff3bf82b44dd@github.com> On Wed, 6 Mar 2024 15:24:59 GMT, Andrew John Hughes wrote: >> I'm fine with this change. I wasn't around when this was introduced, but my guess is that it was relevant back when Hotspot and the rest of the JDK were often built separately. We have the username in the default $OPT string for personal builds, so the HOTSPOT_BUILD_USER is just redundant information. There is no reason to have the build user recorded in non personal builds. > >> I'm fine with this change. I wasn't around when this was introduced, but my guess is that it was relevant back when Hotspot and the rest of the JDK were often built separately. We have the username in the default $OPT string for personal builds, so the HOTSPOT_BUILD_USER is just redundant information. There is no reason to have the build user recorded in non personal builds. > > Thanks Erik. I'll leave this open a little longer for others to comment, but I also can't see a reason why this would be needed, especially in non-personal builds. 
@gnu-andrew You might also want to draw specific attention to this change on hotspot-dev, to reach all developers who might know or care about the user name. It is worth making sure to point out that the big change here is that the hotspot version string format changes; that worries me more than the removal of the user name. (In fact, a safer change might be to hardcode the user name to a dummy value like `openjdk`). ------------- PR Comment: https://git.openjdk.org/jdk/pull/18136#issuecomment-1981393909 From mli at openjdk.org Wed Mar 6 17:51:13 2024 From: mli at openjdk.org (Hamlin Li) Date: Wed, 6 Mar 2024 17:51:13 GMT Subject: RFR: 8320646: RISC-V: C2 VectorCastHF2F [v7] In-Reply-To: References: Message-ID: > Hi, > Can you review this patch to add intrinsics for VectorCastHF2F/VectorCastF2HF? > Thanks! > > ## Test > > test/jdk/java/lang/Float/Binary16ConversionNaN.java > test/jdk/java/lang/Float/Binary16Conversion.java > > hotspot/jtreg/compiler/intrinsics/float16 > hotspot/jtreg/compiler/vectorization/TestFloatConversionsVector.java > hotspot/jtreg/compiler/vectorization/TestFloatConversionsVectorNaN.java Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: minor improvements ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17698/files - new: https://git.openjdk.org/jdk/pull/17698/files/bf5c1239..def78b12 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17698&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17698&range=05-06 Stats: 28 lines in 3 files changed: 4 ins; 5 del; 19 mod Patch: https://git.openjdk.org/jdk/pull/17698.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17698/head:pull/17698 PR: https://git.openjdk.org/jdk/pull/17698 From mli at openjdk.org Wed Mar 6 17:51:14 2024 From: mli at openjdk.org (Hamlin Li) Date: Wed, 6 Mar 2024 17:51:14 GMT Subject: RFR: 8320646: RISC-V: C2 VectorCastHF2F [v6] In-Reply-To: References: 
<0vM0lVtB3-SUUb31C0jXhWampvXUwYhuqG9GslvGkgE=.ddce2df0-f504-4060-921e-caf8ec507197@github.com> Message-ID: <-wzqfVm215ejon2-NUixSM75nPAP3H62t8NLjI99DxE=.2a566183-fdd8-450e-92b8-3523e508eb22@github.com> On Wed, 6 Mar 2024 06:12:31 GMT, Fei Yang wrote: >> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: >> >> filter tests with zvfh > > src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1964: > >> 1962: >> 1963: auto stub = C2CodeStub::make >> 1964: (dst, src, tmp, length, 36, float_to_float16_v_slow_path); > > You might want to reduce the stub size from 36 to 28 after the stub changes you made. Thanks for detailed review! All comments resolved. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17698#discussion_r1514917662 From shade at openjdk.org Wed Mar 6 18:03:46 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 6 Mar 2024 18:03:46 GMT Subject: RFR: 8327389: Remove use of HOTSPOT_BUILD_USER In-Reply-To: <0xgoI47fdqsPHnjiCtHKO72SeGA_PGu3CAgBYZs-gBc=.8d390c92-1642-4dfd-8527-4b4b36715915@github.com> References: <0xgoI47fdqsPHnjiCtHKO72SeGA_PGu3CAgBYZs-gBc=.8d390c92-1642-4dfd-8527-4b4b36715915@github.com> Message-ID: On Wed, 6 Mar 2024 17:17:28 GMT, Magnus Ihse Bursie wrote: > I agree that it is good to get rid of this. However, the reason it has stuck around for so long is, eh, that it has been around for so long, so it is a bit unclear what or who could be relying on this. The example I know is to get a signal that test infrastructures actually picked up my ad-hoc binary build, not some other build from somewhere else. Maybe we should revisit the `git describe` hash idea we kicked along for a while now: https://bugs.openjdk.org/browse/JDK-8274980?focusedId=14452017 -- which would cover that case too. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18136#issuecomment-1981482991 From matsaave at openjdk.org Wed Mar 6 18:48:47 2024 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Wed, 6 Mar 2024 18:48:47 GMT Subject: RFR: 8327138: Clean up status management in cdsConfig.hpp and CDS.java In-Reply-To: References: Message-ID: On Sat, 2 Mar 2024 01:18:06 GMT, Ioi Lam wrote: > A few clean ups: > > 1. Rename functions like `is_loading_full_module_graph()` to `is_using_full_module_graph()`. The meaning of "loading" is not clear: it might be interpreted as to cover only the period where the artifact is being loaded, but not the period after the artifact is completely loaded. However, the function is meant to cover both periods, so "using" is a more precise term. > > 2. The cumbersome sounding `disable_loading_full_module_graph()` is changed to `stop_using_full_module_graph()`, etc. > > 3. The status of `is_using_optimized_module_handling()` is moved from metaspaceShared.hpp to cdsConfig.hpp, to be consolidated with other types of CDS status. > > 4. The status of CDS was communicated to the Java class `jdk.internal.misc.CDS` by ad-hoc native methods. This is now changed to a single method, `CDS.getCDSConfigStatus()` that returns a bit field. That way we don't need to add a new native method for each type of status. > > 5. `CDS.isDumpingClassList()` was a misnomer. It's changed to `CDS.isLoggingLambdaFormInvokers()`. Looks good, thanks! Just one comment on format below. src/hotspot/share/cds/cdsConfig.hpp line 72: > 70: static void enable_dumping_dynamic_archive() { CDS_ONLY(_is_dumping_dynamic_archive = true); } > 71: static void disable_dumping_dynamic_archive() { CDS_ONLY(_is_dumping_dynamic_archive = false); } > 72: static bool is_using_archive() NOT_CDS_RETURN_(false); Could you fix the alignment of the method names here? ------------- Marked as reviewed by matsaave (Committer). 
PR Review: https://git.openjdk.org/jdk/pull/18095#pullrequestreview-1920532000 PR Review Comment: https://git.openjdk.org/jdk/pull/18095#discussion_r1514980583 From kbarrett at openjdk.org Wed Mar 6 20:10:48 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 6 Mar 2024 20:10:48 GMT Subject: RFR: JDK-8324682 Remove redefinition of NULL for XLC compiler [v4] In-Reply-To: References: Message-ID: <8f6C2JW402f50nWbhbqiAk5IiAwPeHLwvFYx0MmFd2Y=.5278a8d6-cca6-49d7-bde6-8b452ed9b2d9@github.com> On Wed, 6 Mar 2024 13:02:01 GMT, Suchismith Roy wrote: >> Remove redefinition of NULL for XLC compiler >> In globalDefinitions_xlc.hpp there is a redefinition of NULL to work around an issue with one of the AIX headers (). >> Once all uses of NULL in HotSpot have been replaced with nullptr, there is no longer any need for this, and it can be removed. >> >> >> JBS ISSUE : [JDK-8324682](https://bugs.openjdk.org/browse/JDK-8324682) > > Suchismith Roy has updated the pull request incrementally with one additional commit since the last revision: > > Remove null check Remember to update copyrights in the changed files. src/hotspot/share/utilities/globalDefinitions_xlc.hpp line 96: > 94: // NULL vs NULL_WORD: > 95: // Some platform/tool-chain combinations can't assign NULL to an integer > 96: // type so we define NULL_WORD to use in those contexts. For xlc they are the same. The last sentence of the comment is no longer true. ------------- Changes requested by kbarrett (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18064#pullrequestreview-1920715467 PR Review Comment: https://git.openjdk.org/jdk/pull/18064#discussion_r1515092953 From coleen.phillimore at oracle.com Wed Mar 6 21:10:19 2024 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 6 Mar 2024 16:10:19 -0500 Subject: CFV: New JDK Reviewer: Matias Saveedra Silva Message-ID: <17ab75c7-f8a6-4b93-9388-272e0c33ba79@oracle.com> I hereby nominate Matias Saveedra Silva (matsaave) to JDK Reviewer. Matias is a member of the HotSpot runtime team at Oracle. He has rewritten how we store resolution of fields, methods and indy expressions in the interpreter which will be helpful to the Valhalla project. He is currently working on enhancements to Class Data Sharing (CDS) and optimizations in the interpreter to support the Leyden project. He has contributed over 30 changes [3] to the HotSpot JVM over the past year, and lower-case-r reviewed many others in the runtime area. Votes are due by March 20, 2024. Only current JDK Reviewers [1] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. For Lazy Consensus voting instructions, see [2]. Coleen Phillimore [1] https://openjdk.org/census [2] https://openjdk.org/projects/#reviewer-vote [3] https://github.com/openjdk/jdk/commits?author=matsaave at openjdk.org
From vladimir.x.ivanov at oracle.com Wed Mar 6 21:12:17 2024 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Wed, 6 Mar 2024 13:12:17 -0800 Subject: CFV: New JDK Reviewer: Matias Saveedra Silva In-Reply-To: <17ab75c7-f8a6-4b93-9388-272e0c33ba79@oracle.com> References: <17ab75c7-f8a6-4b93-9388-272e0c33ba79@oracle.com> Message-ID: <31038e86-1728-458d-b0bf-556d00154b0d@oracle.com> Vote: yes Best regards, Vladimir Ivanov On 3/6/24 13:10, coleen.phillimore at oracle.com wrote: > I hereby nominate Matias Saveedra Silva (matsaave) to JDK Reviewer. From matsaave at openjdk.org Wed Mar 6 21:34:06 2024 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Wed, 6 Mar 2024 21:34:06 GMT Subject: RFR: 8327093: Add slice function to BitMap API [v3] In-Reply-To: References: Message-ID: > The BitMap does not currently have a "slice" method which can return a subset of the bitmap defined by a start and end bit. This utility can be used to analyze portions of a bitmap or to overwrite the existing map with a shorter, modified map. > > This implementation has two core methods: > `copy_of_range` allocates and returns a new word array as used internally by BitMap > `truncate` overwrites the internal array with the one generated by `copy_of_range` so none of the internal workings are exposed to callers outside of BitMap. > > This change is verified with an included test. 
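As a rough illustration of the slice idea described above, the following is a minimal sketch and not the code in the pull request. The class `SimpleBitMap`, its 64-bit word size, and the bit-by-bit copy are all assumptions made for illustration; only the method names `copy_of_range` and `truncate` come from the description, and their real signatures in HotSpot's BitMap may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a bitmap; this is NOT HotSpot's BitMap class.
class SimpleBitMap {
 private:
  using word_t = std::uint64_t;
  static constexpr std::size_t kBitsPerWord = 64;

  static std::size_t words_for(std::size_t bits) {
    return (bits + kBitsPerWord - 1) / kBitsPerWord;
  }

 public:
  explicit SimpleBitMap(std::size_t size_in_bits)
      : _size(size_in_bits), _words(words_for(size_in_bits), 0) {}

  std::size_t size() const { return _size; }

  void set_bit(std::size_t i) {
    assert(i < _size);
    _words[i / kBitsPerWord] |= (word_t{1} << (i % kBitsPerWord));
  }

  bool at(std::size_t i) const {
    assert(i < _size);
    return ((_words[i / kBitsPerWord] >> (i % kBitsPerWord)) & 1) != 0;
  }

  // Overwrite the map with the bits in [start, end), renumbered from 0.
  // The word array never leaves the class, mirroring the stated design goal.
  void truncate(std::size_t start, std::size_t end) {
    _words = copy_of_range(start, end);
    _size = end - start;
  }

 private:
  // Allocate a new word array holding bits [start, end) shifted down to 0.
  // Copies one bit at a time for clarity; a tuned version would move words.
  std::vector<word_t> copy_of_range(std::size_t start, std::size_t end) const {
    assert(start <= end && end <= _size);
    const std::size_t len = end - start;
    std::vector<word_t> out(words_for(len), 0);
    for (std::size_t i = 0; i < len; ++i) {
      if (at(start + i)) {
        out[i / kBitsPerWord] |= (word_t{1} << (i % kBitsPerWord));
      }
    }
    return out;
  }

  std::size_t _size;
  std::vector<word_t> _words;
};
```

Keeping `copy_of_range` private and exposing only `truncate` is one way to keep the internal word layout hidden from callers, which appears to be the intent stated in the description.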
Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: Ioi comments and moved tests ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18092/files - new: https://git.openjdk.org/jdk/pull/18092/files/95683209..17ac26f2 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18092&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18092&range=01-02 Stats: 413 lines in 4 files changed: 190 ins; 222 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18092.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18092/head:pull/18092 PR: https://git.openjdk.org/jdk/pull/18092 From iklam at openjdk.org Wed Mar 6 22:23:05 2024 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 6 Mar 2024 22:23:05 GMT Subject: RFR: 8327138: Clean up status management in cdsConfig.hpp and CDS.java [v2] In-Reply-To: References: Message-ID: > A few clean ups: > > 1. Rename functions like `is_loading_full_module_graph()` to `is_using_full_module_graph()`. The meaning of "loading" is not clear: it might be interpreted as to cover only the period where the artifact is being loaded, but not the period after the artifact is completely loaded. However, the function is meant to cover both periods, so "using" is a more precise term. > > 2. The cumbersome sounding `disable_loading_full_module_graph()` is changed to `stop_using_full_module_graph()`, etc. > > 3. The status of `is_using_optimized_module_handling()` is moved from metaspaceShared.hpp to cdsConfig.hpp, to be consolidated with other types of CDS status. > > 4. The status of CDS was communicated to the Java class `jdk.internal.misc.CDS` by ad-hoc native methods. This is now changed to a single method, `CDS.getCDSConfigStatus()` that returns a bit field. That way we don't need to add a new native method for each type of status. > > 5. `CDS.isDumpingClassList()` was a misnomer. It's changed to `CDS.isLoggingLambdaFormInvokers()`. 
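Item 4 above replaces several per-status native methods with a single call returning a bit field. A minimal sketch of that pattern follows; the enum `CDSStatusBits`, its constant values, and the helpers `pack_status` and `has_status` are hypothetical names chosen for illustration, and the actual constants and layout used by the patch are not shown in this thread.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag values; the real constants in the patch may differ.
enum CDSStatusBits : std::uint32_t {
  IS_DUMPING_ARCHIVE              = 1u << 0,
  IS_LOGGING_LAMBDA_FORM_INVOKERS = 1u << 1,
  IS_USING_FULL_MODULE_GRAPH      = 1u << 2,
};

// Pack independent boolean statuses into one integer, so a single call
// can report all of them at once.
std::uint32_t pack_status(bool dumping_archive,
                          bool logging_lambda_form_invokers,
                          bool using_full_module_graph) {
  std::uint32_t status = 0;
  if (dumping_archive)              status |= IS_DUMPING_ARCHIVE;
  if (logging_lambda_form_invokers) status |= IS_LOGGING_LAMBDA_FORM_INVOKERS;
  if (using_full_module_graph)      status |= IS_USING_FULL_MODULE_GRAPH;
  return status;
}

// The receiving side tests individual bits instead of calling one
// dedicated query method per status.
bool has_status(std::uint32_t status, CDSStatusBits bit) {
  return (status & bit) != 0;
}
```

With this shape, adding a new kind of status only needs a new bit constant on both sides, which matches the motivation given in item 4.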
Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: fixed alignments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18095/files - new: https://git.openjdk.org/jdk/pull/18095/files/78e828de..ae0e0acc Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18095&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18095&range=00-01 Stats: 39 lines in 1 file changed: 17 ins; 7 del; 15 mod Patch: https://git.openjdk.org/jdk/pull/18095.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18095/head:pull/18095 PR: https://git.openjdk.org/jdk/pull/18095 From iklam at openjdk.org Wed Mar 6 22:23:05 2024 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 6 Mar 2024 22:23:05 GMT Subject: RFR: 8327138: Clean up status management in cdsConfig.hpp and CDS.java [v2] In-Reply-To: References: Message-ID: On Wed, 6 Mar 2024 18:42:01 GMT, Matias Saavedra Silva wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> fixed alignments > > src/hotspot/share/cds/cdsConfig.hpp line 72: > >> 70: static void enable_dumping_dynamic_archive() { CDS_ONLY(_is_dumping_dynamic_archive = true); } >> 71: static void disable_dumping_dynamic_archive() { CDS_ONLY(_is_dumping_dynamic_archive = false); } >> 72: static bool is_using_archive() NOT_CDS_RETURN_(false); > > Could you fix the alignment of the method names here? Since several people are confused by the alignment style (align same words to the right), I fixed the grouping of the functions so that the text is aligned to the left, as in most header files. Please take a look at [ae0e0ac](https://github.com/openjdk/jdk/pull/18095/commits/ae0e0acc72bcf1bba81dc4218a64766eb2f2549a) -- it's best viewed on GitHub with white spaces hidden. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18095#discussion_r1515229517 From kbarrett at openjdk.org Wed Mar 6 22:45:53 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 6 Mar 2024 22:45:53 GMT Subject: RFR: 8327173: HotSpot Style Guide needs update regarding nullptr vs NULL [v3] In-Reply-To: <0bjOXsg3qgtMkCB8yupUrv_8T4ooW5I-c4J74M1WZpM=.cd47c740-e9b7-4bcc-b047-25e0e2025884@github.com> References: <0bjOXsg3qgtMkCB8yupUrv_8T4ooW5I-c4J74M1WZpM=.cd47c740-e9b7-4bcc-b047-25e0e2025884@github.com> Message-ID: <4gH4gS9Nb8dw8UEI2qeJXOr1M29miIy5o_h9ISeZixg=.df04cb23-1092-426c-850f-651b8dc99a0c@github.com> > Please review this change to update the HotSpot Style Guide's discussion of > nullptr and its use. > > I suggest this is an editorial rather than substantive change to the style > guide. As such, the normal HotSpot PR process can be used for this change. Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains three additional commits since the last revision: - Merge branch 'master' into nullptr-style - respond to shipilev comments - update nullptr usage ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18101/files - new: https://git.openjdk.org/jdk/pull/18101/files/13dc01ac..461bf280 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18101&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18101&range=01-02 Stats: 72989 lines in 714 files changed: 2663 ins; 68821 del; 1505 mod Patch: https://git.openjdk.org/jdk/pull/18101.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18101/head:pull/18101 PR: https://git.openjdk.org/jdk/pull/18101 From kbarrett at openjdk.org Wed Mar 6 22:45:53 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 6 Mar 2024 22:45:53 GMT Subject: RFR: 8327173: HotSpot Style Guide needs update regarding nullptr vs NULL [v3] In-Reply-To: References: <0bjOXsg3qgtMkCB8yupUrv_8T4ooW5I-c4J74M1WZpM=.cd47c740-e9b7-4bcc-b047-25e0e2025884@github.com> Message-ID: On Tue, 5 Mar 2024 18:16:51 GMT, Vladimir Kozlov wrote: >> Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: >> >> - Merge branch 'master' into nullptr-style >> - respond to shipilev comments >> - update nullptr usage > > Approved. Thanks for reviews @vnkozlov , @kimbarrett , and @dholmes-ora . 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18101#issuecomment-1981979030 From kbarrett at openjdk.org Wed Mar 6 22:45:54 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 6 Mar 2024 22:45:54 GMT Subject: Integrated: 8327173: HotSpot Style Guide needs update regarding nullptr vs NULL In-Reply-To: <0bjOXsg3qgtMkCB8yupUrv_8T4ooW5I-c4J74M1WZpM=.cd47c740-e9b7-4bcc-b047-25e0e2025884@github.com> References: <0bjOXsg3qgtMkCB8yupUrv_8T4ooW5I-c4J74M1WZpM=.cd47c740-e9b7-4bcc-b047-25e0e2025884@github.com> Message-ID: <4__tRpAP_1sl9PS-EKm59Rz2ZcRzo4qQW0keC_ASMtU=.e46426c0-628a-45b1-b478-e4ba16f12234@github.com> On Mon, 4 Mar 2024 08:38:09 GMT, Kim Barrett wrote: > Please review this change to update the HotSpot Style Guide's discussion of > nullptr and its use. > > I suggest this is an editorial rather than substantive change to the style > guide. As such, the normal HotSpot PR process can be used for this change. This pull request has now been integrated. Changeset: 3d37b286 Author: Kim Barrett URL: https://git.openjdk.org/jdk/commit/3d37b28642fd2715ab5c365426409c61939a1436 Stats: 14 lines in 2 files changed: 5 ins; 0 del; 9 mod 8327173: HotSpot Style Guide needs update regarding nullptr vs NULL Reviewed-by: shade, dholmes, kvn ------------- PR: https://git.openjdk.org/jdk/pull/18101 From dlong at openjdk.org Wed Mar 6 23:11:55 2024 From: dlong at openjdk.org (Dean Long) Date: Wed, 6 Mar 2024 23:11:55 GMT Subject: RFR: 8308745: ObjArrayKlass::allocate_objArray_klass may call into java while holding a lock In-Reply-To: References: Message-ID: On Tue, 6 Feb 2024 22:59:04 GMT, Coleen Phillimore wrote: > This change creates a new sort of native recursive lock that can be held during JNI and Java calls, which can be used for synchronization while creating objArrayKlasses at runtime. > > Passes tier1-7. OK, that makes sense about loom and JOM. src/hotspot/share/runtime/mutex.cpp line 537: > 535: // can be called by jvmti by VMThread. 
> 536: if (current->is_Java_thread()) { > 537: _sem.wait_with_safepoint_check(JavaThread::cast(current)); Why not use PlatformMutex + OSThreadWaitState instead of a semaphore? ------------- PR Comment: https://git.openjdk.org/jdk/pull/17739#issuecomment-1982008253 PR Review Comment: https://git.openjdk.org/jdk/pull/17739#discussion_r1515269443 From iklam at openjdk.org Wed Mar 6 23:36:09 2024 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 6 Mar 2024 23:36:09 GMT Subject: RFR: 8327138: Clean up status management in cdsConfig.hpp and CDS.java [v3] In-Reply-To: References: Message-ID: > A few clean ups: > > 1. Rename functions like "`s_loading_full_module_graph()` to `is_using_full_module_graph()`. The meaning of "loading" is not clear: it might be interpreted as to cover only the period where the artifact is being loaded, but not the period after the artifact is completely loaded. However, the function is meant to cover both periods, so "using" is a more precise term. > > 2. The cumbersome sounding `disable_loading_full_module_graph()` is changed to `stop_using_full_module_graph()`, etc. > > 3. The status of `is_using_optimized_module_handling()` is moved from metaspaceShared.hpp to cdsConfig.hpp, to be consolidated with other types of CDS status. > > 4. The status of CDS was communicated to the Java class `jdk.internal.misc.CDS` by ad-hoc native methods. This is now changed to a single method, `CDS.getCDSConfigStatus()` that returns a bit field. That way we don't need to add a new native method for each type of status. > > 5. `CDS.isDumpingClassList()` was a misnomer. It's changed to `CDS.isLoggingLambdaFormInvokers()`. 
Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: more alignment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18095/files - new: https://git.openjdk.org/jdk/pull/18095/files/ae0e0acc..5a28f865 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18095&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18095&range=01-02 Stats: 4 lines in 1 file changed: 2 ins; 2 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18095.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18095/head:pull/18095 PR: https://git.openjdk.org/jdk/pull/18095 From iklam at openjdk.org Wed Mar 6 23:46:52 2024 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 6 Mar 2024 23:46:52 GMT Subject: RFR: 8327093: Add slice function to BitMap API [v3] In-Reply-To: References: Message-ID: On Wed, 6 Mar 2024 21:34:06 GMT, Matias Saavedra Silva wrote: >> The BitMap does not currently have a "slice" method which can return a subset of the bitmap defined by a start and end bit. This utility can be used to analyze portions of a bitmap or to overwrite the existing map with a shorter, modified map. >> >> This implementation has two core methods: >> `copy_of_range` allocates and returns a new word array as used internally by BitMap >> `truncate` overwrites the internal array with the one generated by `copy_of_range` so none of the internal workings are exposed to callers outside of BitMap. >> >> This change is verified with an included test. > > Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: > > Ioi comments and moved tests LGTM ------------- Marked as reviewed by iklam (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18092#pullrequestreview-1921045324 From coleenp at openjdk.org Wed Mar 6 23:49:54 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 6 Mar 2024 23:49:54 GMT Subject: RFR: 8308745: ObjArrayKlass::allocate_objArray_klass may call into java while holding a lock In-Reply-To: References: Message-ID: On Wed, 6 Mar 2024 23:09:34 GMT, Dean Long wrote: >> This change creates a new sort of native recursive lock that can be held during JNI and Java calls, which can be used for synchronization while creating objArrayKlasses at runtime. >> >> Passes tier1-7. > > src/hotspot/share/runtime/mutex.cpp line 537: > >> 535: // can be called by jvmti by VMThread. >> 536: if (current->is_Java_thread()) { >> 537: _sem.wait_with_safepoint_check(JavaThread::cast(current)); > > Why not use PlatformMutex + OSThreadWaitState instead of a semaphore? Semaphore seemed simpler (?) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17739#discussion_r1515294583 From coleen.phillimore at oracle.com Wed Mar 6 23:54:53 2024 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Wed, 6 Mar 2024 18:54:53 -0500 Subject: CFV: New JDK Reviewer: Matias Saveedra Silva In-Reply-To: <17ab75c7-f8a6-4b93-9388-272e0c33ba79@oracle.com> References: <17ab75c7-f8a6-4b93-9388-272e0c33ba79@oracle.com> Message-ID: Vote: yes On 3/6/24 4:10 PM, coleen.phillimore at oracle.com wrote: > I hereby nominate Matias Saveedra Silva (matsaave) to JDK Reviewer. > > Matias is a member of the HotSpot runtime team at Oracle. He has > rewritten how we store resolution of fields, methods and indy > expressions in the interpreter which will be helpful to the Valhalla > project. He is currently working on enhancements to Class Data Sharing > (CDS) and optimizations in the interpreter to support the Leyden > project. 
He has contributed over 30 changes [3] to the HotSpot JVM > over the past year, and lower-case-r reviewed many others in the > runtime area. > > Votes are due by March 20, 2024. > > Only current JDK Reviewers [1] are eligible to vote on this > nomination. Votes must be cast in the open by replying to this > mailing list. > > For Lazy Consensus voting instructions, see [2]. > > Coleen Phillimore > > [1] https://openjdk.org/census > [2] https://openjdk.org/projects/#reviewer-vote > [3] https://github.com/openjdk/jdk/commits?author=matsaave at openjdk.org From dlong at openjdk.org Thu Mar 7 00:18:56 2024 From: dlong at openjdk.org (Dean Long) Date: Thu, 7 Mar 2024 00:18:56 GMT Subject: RFR: 8308745: ObjArrayKlass::allocate_objArray_klass may call into java while holding a lock In-Reply-To: References: Message-ID: On Wed, 6 Mar 2024 23:47:01 GMT, Coleen Phillimore wrote: >> src/hotspot/share/runtime/mutex.cpp line 537: >> >>> 535: // can be called by jvmti by VMThread. >>> 536: if (current->is_Java_thread()) { >>> 537: _sem.wait_with_safepoint_check(JavaThread::cast(current)); >> >> Why not use PlatformMutex + OSThreadWaitState instead of a semaphore? > > Semaphore seemed simpler (?) OK. It's a good thing HotSpot doesn't need to worry about priority-inheritance for mutexes. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17739#discussion_r1515314037 From dlong at openjdk.org Thu Mar 7 00:22:56 2024 From: dlong at openjdk.org (Dean Long) Date: Thu, 7 Mar 2024 00:22:56 GMT Subject: RFR: 8308745: ObjArrayKlass::allocate_objArray_klass may call into java while holding a lock In-Reply-To: References: Message-ID: On Tue, 6 Feb 2024 22:59:04 GMT, Coleen Phillimore wrote: > This change creates a new sort of native recursive lock that can be held during JNI and Java calls, which can be used for synchronization while creating objArrayKlasses at runtime. > > Passes tier1-7. 
src/hotspot/share/oops/arrayKlass.cpp line 135: > 133: ResourceMark rm(THREAD); > 134: { > 135: // Ensure atomic creation of higher dimensions Isn't this comment still useful? src/hotspot/share/oops/arrayKlass.cpp line 144: > 142: ObjArrayKlass* ak = > 143: ObjArrayKlass::allocate_objArray_klass(class_loader_data(), dim + 1, this, CHECK_NULL); > 144: ak->set_lower_dimension(this); Would it be useful to assert that the lower dimension has been set? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17739#discussion_r1515322667 PR Review Comment: https://git.openjdk.org/jdk/pull/17739#discussion_r1515323404 From dlong at openjdk.org Thu Mar 7 00:34:57 2024 From: dlong at openjdk.org (Dean Long) Date: Thu, 7 Mar 2024 00:34:57 GMT Subject: RFR: 8308745: ObjArrayKlass::allocate_objArray_klass may call into java while holding a lock In-Reply-To: References: Message-ID: On Tue, 6 Feb 2024 22:59:04 GMT, Coleen Phillimore wrote: > This change creates a new sort of native recursive lock that can be held during JNI and Java calls, which can be used for synchronization while creating objArrayKlasses at runtime. > > Passes tier1-7. src/hotspot/share/oops/instanceKlass.cpp line 2765: > 2763: // To get a consistent list of classes we need MultiArray_lock to ensure > 2764: // array classes aren't observed while they are being restored. > 2765: MutexLocker ml(MultiArray_lock); Doesn't this revert JDK-8274338? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17739#discussion_r1515330292 From dlong at openjdk.org Thu Mar 7 00:39:56 2024 From: dlong at openjdk.org (Dean Long) Date: Thu, 7 Mar 2024 00:39:56 GMT Subject: RFR: 8308745: ObjArrayKlass::allocate_objArray_klass may call into java while holding a lock In-Reply-To: References: Message-ID: <7p3I5oGNwzuUK3dMUapit_wKX3uMQo3H9FxGzEAiaOQ=.c9b2f9c5-d473-42ee-bb63-9de3f37515d3@github.com> On Tue, 6 Feb 2024 22:59:04 GMT, Coleen Phillimore wrote: > This change creates a new sort of native recursive lock that can be held during JNI and Java calls, which can be used for synchronization while creating objArrayKlasses at runtime. > > Passes tier1-7. Marked as reviewed by dlong (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/17739#pullrequestreview-1921100707 From dholmes at openjdk.org Thu Mar 7 00:56:55 2024 From: dholmes at openjdk.org (David Holmes) Date: Thu, 7 Mar 2024 00:56:55 GMT Subject: RFR: 8308745: ObjArrayKlass::allocate_objArray_klass may call into java while holding a lock In-Reply-To: References: Message-ID: On Thu, 7 Mar 2024 00:16:39 GMT, Dean Long wrote: >> Semaphore seemed simpler (?) > > OK. It's a good thing HotSpot doesn't need to worry about priority-inheritance for mutexes. Semaphore seems simpler/cleaner and ready to use. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17739#discussion_r1515343108 From coleenp at openjdk.org Thu Mar 7 01:38:56 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 7 Mar 2024 01:38:56 GMT Subject: RFR: 8308745: ObjArrayKlass::allocate_objArray_klass may call into java while holding a lock [v2] In-Reply-To: References: Message-ID: > This change creates a new sort of native recursive lock that can be held during JNI and Java calls, which can be used for synchronization while creating objArrayKlasses at runtime. > > Passes tier1-7. 
Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Dean's comments. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17739/files - new: https://git.openjdk.org/jdk/pull/17739/files/3625e6cc..d64aa560 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17739&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17739&range=00-01 Stats: 5 lines in 2 files changed: 4 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17739.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17739/head:pull/17739 PR: https://git.openjdk.org/jdk/pull/17739 From coleenp at openjdk.org Thu Mar 7 01:38:57 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 7 Mar 2024 01:38:57 GMT Subject: RFR: 8308745: ObjArrayKlass::allocate_objArray_klass may call into java while holding a lock [v2] In-Reply-To: References: Message-ID: On Thu, 7 Mar 2024 01:35:45 GMT, Coleen Phillimore wrote: >> This change creates a new sort of native recursive lock that can be held during JNI and Java calls, which can be used for synchronization while creating objArrayKlasses at runtime. >> >> Passes tier1-7. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Dean's comments. Thanks for your review Dean. I've retested tier1 with the changes and will send it for more. Thank you for finding the CDS bug in my change. 
------------- PR Review: https://git.openjdk.org/jdk/pull/17739#pullrequestreview-1921122133 From coleenp at openjdk.org Thu Mar 7 01:38:58 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 7 Mar 2024 01:38:58 GMT Subject: RFR: 8308745: ObjArrayKlass::allocate_objArray_klass may call into java while holding a lock [v2] In-Reply-To: References: Message-ID: On Thu, 7 Mar 2024 00:18:53 GMT, Dean Long wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Dean's comments. > > src/hotspot/share/oops/arrayKlass.cpp line 135: > >> 133: ResourceMark rm(THREAD); >> 134: { >> 135: // Ensure atomic creation of higher dimensions > > Isn't this comment still useful? Yes, it's a useful comment, readded. > src/hotspot/share/oops/arrayKlass.cpp line 144: > >> 142: ObjArrayKlass* ak = >> 143: ObjArrayKlass::allocate_objArray_klass(class_loader_data(), dim + 1, this, CHECK_NULL); >> 144: ak->set_lower_dimension(this); > > Would it be useful to assert that the lower dimension has been set? Yes, added. > src/hotspot/share/oops/instanceKlass.cpp line 2765: > >> 2763: // To get a consistent list of classes we need MultiArray_lock to ensure >> 2764: // array classes aren't observed while they are being restored. >> 2765: MutexLocker ml(MultiArray_lock); > > Doesn't this revert JDK-8274338? You're right. I didn't believe my own comment when I removed this. Thank you for pointing out the bug. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17739#discussion_r1515346198 PR Review Comment: https://git.openjdk.org/jdk/pull/17739#discussion_r1515346716 PR Review Comment: https://git.openjdk.org/jdk/pull/17739#discussion_r1515349878 From coleenp at openjdk.org Thu Mar 7 01:38:58 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 7 Mar 2024 01:38:58 GMT Subject: RFR: 8308745: ObjArrayKlass::allocate_objArray_klass may call into java while holding a lock [v2] In-Reply-To: References: Message-ID: On Thu, 7 Mar 2024 00:54:30 GMT, David Holmes wrote: >> OK. It's a good thing HotSpot doesn't need to worry about priority-inheritance for mutexes. > > Semaphore seems simpler/cleaner and ready to use. It's such a rare race and unusual condition that we could execute more Java code, that I started out with just a shared bit. Then David suggested a semaphore that obeys the safepoint protocol, which seems a lot better. I've literally skimmed over OSThreadWaitState. It looks like Semaphore::wait_with_a_safepoint_check() uses it. I still don't know why it exists: // Note: the ThreadState is legacy code and is not correctly implemented. // Uses of ThreadState need to be replaced by the state in the JavaThread. enum ThreadState { Does a PlatformMutex handle priority-inheritance? It still feels like it would be overkill for this usage. I hope this RecursiveLocker does not become more widely used, where we would have to replace the simple implementation with something more difficult. We should discourage further use when possible. 
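To make the trade-off in this thread more concrete, here is a minimal sketch of a recursive lock layered on a single-permit semaphore. It is not the HotSpot `RecursiveLocker`: the names `BinarySemaphoreSketch` and `RecursiveLockSketch` are hypothetical, the semaphore is built from `std::mutex` and `std::condition_variable` only to keep the block self-contained, and there is no safepoint protocol and no priority inheritance.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a platform semaphore with a single permit.
class BinarySemaphoreSketch {
  std::mutex _m;
  std::condition_variable _cv;
  bool _available = true;

 public:
  void wait() {
    std::unique_lock<std::mutex> guard(_m);
    _cv.wait(guard, [this] { return _available; });
    _available = false;
  }
  void signal() {
    {
      std::lock_guard<std::mutex> guard(_m);
      _available = true;
    }
    _cv.notify_one();
  }
};

// Recursive lock: the owner re-enters by bumping a counter, and only the
// outermost unlock() releases the semaphore.
class RecursiveLockSketch {
  BinarySemaphoreSketch _sem;
  std::atomic<std::thread::id> _owner{std::thread::id()};
  int _recursions = 0;  // only read or written by the owning thread

 public:
  void lock() {
    const std::thread::id self = std::this_thread::get_id();
    if (_owner.load(std::memory_order_relaxed) == self) {
      ++_recursions;  // re-entry by the owner: no semaphore traffic
      return;
    }
    _sem.wait();  // blocks until the current owner fully unlocks
    _owner.store(self, std::memory_order_relaxed);
    _recursions = 1;
  }
  void unlock() {
    assert(held_by_current_thread());
    if (--_recursions == 0) {
      _owner.store(std::thread::id(), std::memory_order_relaxed);
      _sem.signal();
    }
  }
  bool held_by_current_thread() const {
    return _owner.load(std::memory_order_relaxed) == std::this_thread::get_id();
  }
  int recursions() const { return _recursions; }
};
```

The sketch keeps the owner check outside the semaphore, so a thread that already holds the lock never blocks on its own permit. That property is what allows such a lock to stay held across calls that may re-enter the same code path.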
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17739#discussion_r1515364466 From dholmes at openjdk.org Thu Mar 7 01:47:03 2024 From: dholmes at openjdk.org (David Holmes) Date: Thu, 7 Mar 2024 01:47:03 GMT Subject: RFR: 8308745: ObjArrayKlass::allocate_objArray_klass may call into java while holding a lock [v2] In-Reply-To: References: Message-ID: On Thu, 7 Mar 2024 01:38:56 GMT, Coleen Phillimore wrote: >> This change creates a new sort of native recursive lock that can be held during JNI and Java calls, which can be used for synchronization while creating objArrayKlasses at runtime. >> >> Passes tier1-7. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Dean's comments. src/hotspot/share/oops/instanceKlass.cpp line 1552: > 1550: RecursiveLocker rl(MultiArray_lock, THREAD); > 1551: > 1552: // This thread is the creator. This thread may not be the creator. The original comment was more apt - we typically say something like "Check if another thread beat us to it." ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17739#discussion_r1515371571 From dlong at openjdk.org Thu Mar 7 02:11:01 2024 From: dlong at openjdk.org (Dean Long) Date: Thu, 7 Mar 2024 02:11:01 GMT Subject: RFR: 8308745: ObjArrayKlass::allocate_objArray_klass may call into java while holding a lock [v2] In-Reply-To: References: Message-ID: On Thu, 7 Mar 2024 01:38:56 GMT, Coleen Phillimore wrote: >> This change creates a new sort of native recursive lock that can be held during JNI and Java calls, which can be used for synchronization while creating objArrayKlasses at runtime. >> >> Passes tier1-7. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Dean's comments. > Does a PlatformMutex handle priority-inheritance? It would depend on the OS and the mutex implementation. You can ignore my comment.
It was from long ago trying to put Java on top of a real-time OS. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17739#issuecomment-1982207873 From dholmes at openjdk.org Thu Mar 7 02:50:58 2024 From: dholmes at openjdk.org (David Holmes) Date: Thu, 7 Mar 2024 02:50:58 GMT Subject: RFR: 8308745: ObjArrayKlass::allocate_objArray_klass may call into java while holding a lock [v2] In-Reply-To: References: Message-ID: On Thu, 7 Mar 2024 01:38:56 GMT, Coleen Phillimore wrote: >> This change creates a new sort of native recursive lock that can be held during JNI and Java calls, which can be used for synchronization while creating objArrayKlasses at runtime. >> >> Passes tier1-7. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Dean's comments. This looks good to me. Thanks for working through the details with me. src/hotspot/share/prims/jvmtiGetLoadedClasses.cpp line 121: > 119: { > 120: // To get a consistent list of classes we need MultiArray_lock to ensure > 121: // array classes aren't created during this walk. This walks through the Just curious but how can walking the set of classes trigger creation of array classes? src/hotspot/share/runtime/mutexLocker.hpp line 335: > 333: > 334: // Instance of a RecursiveLock that may be held through Java heap allocation, which may include calls to Java, > 335: // and JNI event notification for resource exhausted for metaspace or heap. s/exhausted/exhaustion/ ------------- Marked as reviewed by dholmes (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/17739#pullrequestreview-1921323899 PR Review Comment: https://git.openjdk.org/jdk/pull/17739#discussion_r1515402581 PR Review Comment: https://git.openjdk.org/jdk/pull/17739#discussion_r1515406082 From fyang at openjdk.org Thu Mar 7 03:00:55 2024 From: fyang at openjdk.org (Fei Yang) Date: Thu, 7 Mar 2024 03:00:55 GMT Subject: RFR: 8320646: RISC-V: C2 VectorCastHF2F [v7] In-Reply-To: References: Message-ID: On Wed, 6 Mar 2024 17:51:13 GMT, Hamlin Li wrote: >> Hi, >> Can you review this patch to add intrinsics for VectorCastHF2F/VectorCastF2HF? >> Thanks! >> >> ## Test >> >> test/jdk/java/lang/Float/Binary16ConversionNaN.java >> test/jdk/java/lang/Float/Binary16Conversion.java >> >> hotspot/jtreg/compiler/intrinsics/float16 >> hotspot/jtreg/compiler/vectorization/TestFloatConversionsVector.java >> hotspot/jtreg/compiler/vectorization/TestFloatConversionsVectorNaN.java > > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > minor improvements Updated change looks good assuming it still passes the related tests. test/hotspot/jtreg/compiler/vectorization/TestFloatConversionsVector.java line 31: > 29: * @requires (os.simpleArch == "x64" & (vm.cpu.features ~= ".*avx512f.*" | vm.cpu.features ~= ".*f16c.*")) | > 30: * os.arch == "aarch64" | > 31: * (os.arch == "riscv64" & vm.cpu.features ~= ".*zvfh.*") A small question: I see you used the following condition when doing the scalar version: `(os.arch == "riscv64" & vm.cpu.features ~= ".*zfh,.*")` There is a comma there for matching cpu features. Do we really need this comma? It is not there for this vector version. ------------- Marked as reviewed by fyang (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/17698#pullrequestreview-1921347292 PR Review Comment: https://git.openjdk.org/jdk/pull/17698#discussion_r1515411542 From iklam at openjdk.org Thu Mar 7 03:57:52 2024 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 7 Mar 2024 03:57:52 GMT Subject: RFR: 8327093: Add slice function to BitMap API [v3] In-Reply-To: References: Message-ID: On Wed, 6 Mar 2024 23:44:01 GMT, Ioi Lam wrote: > LGTM BTW, the title of the PR should be changed to drop the word "slice", as it's not used in the code at all. Also, we are not returning a copy of a bitmap, but rather truncating the contents of the current bitmap. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18092#issuecomment-1982303121 From dholmes at openjdk.org Thu Mar 7 06:57:56 2024 From: dholmes at openjdk.org (David Holmes) Date: Thu, 7 Mar 2024 06:57:56 GMT Subject: RFR: 8327138: Clean up status management in cdsConfig.hpp and CDS.java [v3] In-Reply-To: References: Message-ID: On Wed, 6 Mar 2024 23:36:09 GMT, Ioi Lam wrote: >> A few clean ups: >> >> 1. Rename functions like `is_loading_full_module_graph()` to `is_using_full_module_graph()`. The meaning of "loading" is not clear: it might be interpreted as to cover only the period where the artifact is being loaded, but not the period after the artifact is completely loaded. However, the function is meant to cover both periods, so "using" is a more precise term. >> >> 2. The cumbersome sounding `disable_loading_full_module_graph()` is changed to `stop_using_full_module_graph()`, etc. >> >> 3. The status of `is_using_optimized_module_handling()` is moved from metaspaceShared.hpp to cdsConfig.hpp, to be consolidated with other types of CDS status. >> >> 4. The status of CDS was communicated to the Java class `jdk.internal.misc.CDS` by ad-hoc native methods. This is now changed to a single method, `CDS.getCDSConfigStatus()` that returns a bit field. That way we don't need to add a new native method for each type of status.
>> >> 5. `CDS.isDumpingClassList()` was a misnomer. It's changed to `CDS.isLoggingLambdaFormInvokers()`. > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > more alignment Some drive-by nits. (But copyright needs fixing.) src/hotspot/share/cds/cdsConfig.cpp line 2: > 1: /* > 2: * Copyright (c) 2024, Oracle and/or its affiliates. All rights reserved. Incorrect copyright update - should be "2023, 2024," src/hotspot/share/cds/cdsConfig.cpp line 54: > 52: (is_dumping_static_archive() ? IS_DUMPING_STATIC_ARCHIVE : 0) | > 53: (is_logging_lambda_form_invokers() ? IS_LOGGING_LAMBDA_FORM_INVOKERS : 0) | > 54: (is_using_archive() ? IS_USING_ARCHIVE : 0); You can remove one space before the ? in each line src/hotspot/share/cds/cdsConfig.hpp line 56: > 54: static const int IS_DUMPING_STATIC_ARCHIVE = 1 << 1; > 55: static const int IS_LOGGING_LAMBDA_FORM_INVOKERS = 1 << 2; > 56: static const int IS_USING_ARCHIVE = 1 << 3; why is the = sign so far away? ------------- PR Review: https://git.openjdk.org/jdk/pull/18095#pullrequestreview-1921571088 PR Review Comment: https://git.openjdk.org/jdk/pull/18095#discussion_r1515626568 PR Review Comment: https://git.openjdk.org/jdk/pull/18095#discussion_r1515628561 PR Review Comment: https://git.openjdk.org/jdk/pull/18095#discussion_r1515629728 From dholmes at openjdk.org Thu Mar 7 07:14:54 2024 From: dholmes at openjdk.org (David Holmes) Date: Thu, 7 Mar 2024 07:14:54 GMT Subject: RFR: 8327389: Remove use of HOTSPOT_BUILD_USER In-Reply-To: References: Message-ID: On Wed, 6 Mar 2024 11:57:27 GMT, Andrew John Hughes wrote: > The HotSpot code has inserted the value of `HOTSPOT_BUILD_USER` into the internal version string since before the JDK was open-sourced, so the reasoning for this is not in the public history. 
See https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/abstract_vm_version.cpp#L284 > > ~~~ > $ /usr/lib/jvm/java-21-openjdk/bin/java -Xinternalversion > OpenJDK 64-Bit Server VM (21.0.2+13) for linux-amd64 JRE (21.0.2+13), built on 2024-01-19T16:39:52Z by "mockbuild" with gcc 8.5.0 20210514 (Red Hat 8.5.0-20) > ~~~ > > It is not clear what value this provides. By default, the value is set to the system username, but it can be set to a static value using `--with-build-user`. > > In trying to compare builds, this is a source of difference if the builds are produced under different usernames and `--with-build-user` is not set to the same value. > > It may also be a source of information leakage from the host system. It is not clear what value it provides to the end user of the build. > > This patch proposes removing it altogether, but at least knowing why it needs to be there would be progress from where we are now. We could also consider a middle ground of only setting it for "adhoc" builds, those without set version information, as is the case with the overall version output. > > With this patch: > > ~~~ > $ ~/builder/23/images/jdk/bin/java -Xinternalversion > OpenJDK 64-Bit Server VM (fastdebug 23-internal-adhoc.andrew.jdk) for linux-amd64 JRE (23-internal-adhoc.andrew.jdk), built on 2024-03-06T00:23:37Z with gcc 13.2.1 20230826 > > $ ~/builder/23/images/jdk/bin/java -XX:ErrorHandlerTest=1 > > $ grep '^vm_info' /localhome/andrew/projects/openjdk/upstream/jdk/hs_err_pid1119824.log > vm_info: OpenJDK 64-Bit Server VM (fastdebug 23-internal-adhoc.andrew.jdk) for linux-amd64 JRE (23-internal-adhoc.andrew.jdk), built on 2024-03-06T00:23:37Z with gcc 13.2.1 20230826 > ~~~ > > The above are from a fastdebug build but I also built a product build with no issues. The internal version is reported in crash reports (hs_err file) as well as via `-Xinternalversion`. 
My recollection is that the username was needed/used to identify whether a hs_err report came from an official build, back when official builds were created via a very specific build process and a specific user name. These days it seems redundant given the way the rest of the version string is constructed both for "official" builds and personal builds. I can't see any tests that look at the output from `-Xinternalversion`. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18136#issuecomment-1982679336 From aboldtch at openjdk.org Thu Mar 7 07:46:55 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 7 Mar 2024 07:46:55 GMT Subject: RFR: 8327093: Add slice function to BitMap API [v3] In-Reply-To: References: Message-ID: On Wed, 6 Mar 2024 21:34:06 GMT, Matias Saavedra Silva wrote: >> The BitMap does not currently have a "slice" method which can return a subset of the bitmap defined by a start and end bit. This utility can be used to analyze portions of a bitmap or to overwrite the existing map with a shorter, modified map. >> >> This implementation has two core methods: >> `copy_of_range` allocates and returns a new word array as used internally by BitMap >> `truncate` overwrites the internal array with the one generated by `copy_of_range` so none of the internal workings are exposed to callers outside of BitMap. >> >> This change is verified with an included test. > > Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: > > Ioi comments and moved tests The implementation looks correct now. But there is a mix of styles with regards to const that I think should be consistent at least inside the PR. It is not really possible to unify completely with the rest of the file as there is a mix of `const` usage. The actual style is not that important to me, but mixing styles with `const` makes me question the intent of code. 
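The `copy_of_range` step described in the quoted PR text amounts to a shift-and-carry copy across word boundaries. A minimal standalone sketch of that idea follows. It assumes 64-bit words and uses hypothetical names (`copy_bit_range`, `kBitsPerWord`, `word_t`); it is not the code under review and it walks the words in a different way than the patch appears to.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using word_t = std::uint64_t;  // hypothetical stand-in for bm_word_t
constexpr std::size_t kBitsPerWord = 64;

// Returns a new word array holding bits [start_bit, end_bit) of `map`,
// re-based so that start_bit becomes bit 0 of the result.
std::vector<word_t> copy_bit_range(const std::vector<word_t>& map,
                                   std::size_t start_bit, std::size_t end_bit) {
  assert(start_bit <= end_bit && end_bit <= map.size() * kBitsPerWord);
  const std::size_t nbits = end_bit - start_bit;
  const std::size_t nwords = (nbits + kBitsPerWord - 1) / kBitsPerWord;
  const std::size_t first = start_bit / kBitsPerWord;
  const std::size_t shift = start_bit % kBitsPerWord;

  std::vector<word_t> out(nwords, 0);
  for (std::size_t i = 0; i < nwords; i++) {
    word_t w = map[first + i] >> shift;
    // The low bits of the next source word are carried into the high bits
    // of this output word. The shift == 0 case is skipped because shifting
    // a 64-bit value by 64 is undefined behaviour.
    if (shift != 0 && first + i + 1 < map.size()) {
      w |= map[first + i + 1] << (kBitsPerWord - shift);
    }
    out[i] = w;
  }
  // Clear any bits at or beyond end_bit in the last output word.
  const std::size_t tail = nbits % kBitsPerWord;
  if (tail != 0) {
    out[nwords - 1] &= (word_t(1) << tail) - 1;
  }
  return out;
}
```

Note that the carried bits are held in a word-typed value, not an index-typed one, which is the distinction the review comment about `carry` draws.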
I would also have much more confidence in the implementation (and for capturing regressions) if this test could be added.

diff --git a/test/hotspot/gtest/utilities/test_bitMap.cpp b/test/hotspot/gtest/utilities/test_bitMap.cpp
index e1fc744f07a..2dcb5cfef26 100644
--- a/test/hotspot/gtest/utilities/test_bitMap.cpp
+++ b/test/hotspot/gtest/utilities/test_bitMap.cpp
@@ -148,6 +148,38 @@ class BitMapTruncateTest {
     EXPECT_TRUE(map.is_same(result));
   }

+  template <class ResizableBitMapClass>
+  static void testRandom() {
+    for (int i = 0; i < 100; i++) {
+      ResourceMark rm;
+
+      const size_t max_size = 1024;
+      const size_t size = os::random() % max_size + 1;
+      const size_t truncate_size = os::random() % size + 1;
+      const size_t truncate_start = size == truncate_size ? 0 : os::random() % (size - truncate_size);
+
+      ResizableBitMapClass map(size);
+      ResizableBitMapClass result(truncate_size);
+
+      for (BitMap::idx_t idx = 0; idx < truncate_start; idx++) {
+        if (os::random() % 2 == 0) {
+          map.set_bit(idx);
+        }
+      }
+
+      for (BitMap::idx_t idx = 0; idx < truncate_size; idx++) {
+        if (os::random() % 2 == 0) {
+          map.set_bit(truncate_start + idx);
+          result.set_bit(idx);
+        }
+      }
+
+      map.truncate(truncate_start, truncate_start + truncate_size);
+
+      EXPECT_TRUE(map.is_same(result));
+    }
+  }
+
   template <class ResizableBitMapClass>
   static void testTruncateSame() {
     // Resulting map should be the same as the original
@@ -394,3 +426,12 @@ TEST_VM(BitMap, truncate_one_word) {
   BitMapTruncateTest::testTruncateOneWord<ArenaBitMap>();
   EXPECT_FALSE(HasFailure()) << "Failed on type ArenaBitMap";
 }
+
+TEST_VM(BitMap, truncate_random) {
+  BitMapTruncateTest::testRandom<ResourceBitMap>();
+  EXPECT_FALSE(HasFailure()) << "Failed on type ResourceBitMap";
+  BitMapTruncateTest::testRandom<CHeapBitMap>();
+  EXPECT_FALSE(HasFailure()) << "Failed on type CHeapBitMap";
+  BitMapTruncateTest::testRandom<ArenaBitMap>();
+  EXPECT_FALSE(HasFailure()) << "Failed on type ArenaBitMap";
+}

Also agree with @iklam that the PR title should not mention `slice` but `truncate`

src/hotspot/share/utilities/bitMap.cpp line 110:

> 108:   idx_t const cutoff = bit_in_word(end_bit);
> 109:   idx_t const start_word = to_words_align_down(start_bit);
> 110:   idx_t const end_word = to_words_align_up(end_bit);

Using right CV-qualifiers is very rare in HotSpot. I think there are around 600 uses and 1/3 is in the G1 code. I do not mind either style, but this PR mixes them, and this file already uses left CV-qualifiers. It would probably be best to keep this consistent.

src/hotspot/share/utilities/bitMap.cpp line 111:

> 109:   idx_t const start_word = to_words_align_down(start_bit);
> 110:   idx_t const end_word = to_words_align_up(end_bit);
> 111:   bm_word_t* const old_map = map();

The pointed to type should probably also be `const`. This code should not touch the old memory.

src/hotspot/share/utilities/bitMap.cpp line 113:

> 111:   bm_word_t* const old_map = map();
> 112:
> 113:   BitMapWithAllocator* derived = static_cast<BitMapWithAllocator*>(this);

Variable could be `const` (just like `old_map` above)

src/hotspot/share/utilities/bitMap.cpp line 115:

> 113:   BitMapWithAllocator* derived = static_cast<BitMapWithAllocator*>(this);
> 114:
> 115:   bm_word_t* new_map = derived->allocate(end_word - start_word);

Variable could be `const`

src/hotspot/share/utilities/bitMap.cpp line 120:

> 118:   idx_t const shift = bit_in_word(start_bit);
> 119:   // Bits shifted out by a word need to be passed into the next
> 120:   idx_t carry = 0;

`carry` should be of type `bm_word_t` it contains bits of the payload, not an index.

src/hotspot/share/utilities/bitMap.cpp line 141:

> 139:   bm_word_t* const old_map = map();
> 140:
> 141:   bm_word_t* new_map = copy_of_range(start_bit, end_bit);

Variable could be `const`

src/hotspot/share/utilities/bitMap.cpp line 143:

> 141:   bm_word_t* new_map = copy_of_range(start_bit, end_bit);
> 142:
> 143:   BitMapWithAllocator* derived = static_cast<BitMapWithAllocator*>(this);

Variable could be `const`

-------------

Changes requested by aboldtch (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/18092#pullrequestreview-1921642108 PR Review Comment: https://git.openjdk.org/jdk/pull/18092#discussion_r1515669936 PR Review Comment: https://git.openjdk.org/jdk/pull/18092#discussion_r1515670841 PR Review Comment: https://git.openjdk.org/jdk/pull/18092#discussion_r1515672496 PR Review Comment: https://git.openjdk.org/jdk/pull/18092#discussion_r1515672703 PR Review Comment: https://git.openjdk.org/jdk/pull/18092#discussion_r1515673268 PR Review Comment: https://git.openjdk.org/jdk/pull/18092#discussion_r1515674184 PR Review Comment: https://git.openjdk.org/jdk/pull/18092#discussion_r1515674256 From duke at openjdk.org Thu Mar 7 08:06:28 2024 From: duke at openjdk.org (Yude Lin) Date: Thu, 7 Mar 2024 08:06:28 GMT Subject: RFR: 8327522: Shenandoah: Remove unused references to satb_mark_queue_active_offset Message-ID: Removed an unused variable (trivial) Also, there is another place that uses satb_mark_queue_active_offset which is ShenandoahBarrierSetC2::verify_gc_barriers The current barrier pattern is different from what this code is expecting: If->Bool->CmpI->AndI->LoadUB->AddP->ConL(gc_state_offset) rather than If->Bool->CmpI->LoadB->AddP->ConL(marking_offset) However, this code isn't doing as much checking as its counterpart in G1 anyway (so I'm thinking removing the incorrect matching code altogether?) Looking forward to your suggestions. 
------------- Commit messages: - 8327522: Shenandoah: Remove unused references to satb_mark_queue_active_offset Changes: https://git.openjdk.org/jdk/pull/18148/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18148&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8327522 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18148.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18148/head:pull/18148 PR: https://git.openjdk.org/jdk/pull/18148 From sroy at openjdk.org Thu Mar 7 08:13:52 2024 From: sroy at openjdk.org (Suchismith Roy) Date: Thu, 7 Mar 2024 08:13:52 GMT Subject: RFR: JDK-8324682 Remove redefinition of NULL for XLC compiler [v5] In-Reply-To: References: Message-ID: > Remove redefinition of NULL for XLC compiler > In globalDefinitions_xlc.hpp there is a redefinition of NULL to work around an issue with one of the AIX headers (). > Once all uses of NULL in HotSpot have been replaced with nullptr, there is no longer any need for this, and it can be removed. 
> > > JBS ISSUE : [JDK-8324682](https://bugs.openjdk.org/browse/JDK-8324682) Suchismith Roy has updated the pull request incrementally with one additional commit since the last revision: update comment and copyright ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18064/files - new: https://git.openjdk.org/jdk/pull/18064/files/60bdf794..ec40c344 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18064&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18064&range=03-04 Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18064.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18064/head:pull/18064 PR: https://git.openjdk.org/jdk/pull/18064 From richard.reingruber at sap.com Thu Mar 7 08:25:32 2024 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Thu, 7 Mar 2024 08:25:32 +0000 Subject: CFV: New JDK Reviewer: Matias Saveedra Silva In-Reply-To: <17ab75c7-f8a6-4b93-9388-272e0c33ba79@oracle.com> References: <17ab75c7-f8a6-4b93-9388-272e0c33ba79@oracle.com> Message-ID: Vote: yes Richard. From: hotspot-dev on behalf of coleen.phillimore at oracle.com Date: Wednesday, 6. March 2024 at 22:10 To: hotspot-dev at openjdk.org Subject: CFV: New JDK Reviewer: Matias Saveedra Silva I hereby nominate Matias Saveedra Silva (matsaave) to JDK Reviewer. Matias is a member of the HotSpot runtime team at Oracle. He has rewritten how we store resolution of fields, methods and indy expressions in the interpreter which will be helpful to the Valhalla project. He is currently working on enhancements to Class Data Sharing (CDS) and optimizations in the interpreter to support the Leyden project. He has contributed over 30 changes [3] to the HotSpot JVM over the past year, and lower-case-r reviewed many others in the runtime area. Votes are due by March 20, 2024. Only current JDK Reviewers [1] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. 
For Lazy Consensus voting instructions, see [2]. Coleen Phillimore [1] https://openjdk.org/census [2] https://openjdk.org/projects/#reviewer-vote [3] https://github.com/openjdk/jdk/commits?author=matsaave at openjdk.org From bulasevich at openjdk.org Thu Mar 7 10:58:19 2024 From: bulasevich at openjdk.org (Boris Ulasevich) Date: Thu, 7 Mar 2024 10:58:19 GMT Subject: RFR: 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments [v9] In-Reply-To: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> References: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> Message-ID: > These changes clean up the logic and the code of allocating codecache segments and add more testing of it, to open a door for further optimization of code cache segmentation. The goal was to keep the behavior as close to the existing behavior as possible, even if it's not quite logical. > > Also, these changes better account for alignment - PrintFlagsFinal shows the final aligned segment sizes, and the segments fill the ReservedCodeCacheSize without gaps caused by alignment. Boris Ulasevich has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains ten additional commits since the last revision: - minor updates according to review comments - minor update: align_up for ReservedCodeCacheSize - one another cleanup round - minor update.
removed helper function as it caused many comments in the review - set_size_of_unset_code_heap - minor update - apply suggestions - cleanup & test udpdate - 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17244/files - new: https://git.openjdk.org/jdk/pull/17244/files/d70aa718..b39dffdf Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17244&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17244&range=07-08 Stats: 206276 lines in 4329 files changed: 71595 ins; 106446 del; 28235 mod Patch: https://git.openjdk.org/jdk/pull/17244.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17244/head:pull/17244 PR: https://git.openjdk.org/jdk/pull/17244 From clanger at openjdk.org Thu Mar 7 10:59:55 2024 From: clanger at openjdk.org (Christoph Langer) Date: Thu, 7 Mar 2024 10:59:55 GMT Subject: RFR: JDK-8324682 Remove redefinition of NULL for XLC compiler [v5] In-Reply-To: References: Message-ID: <75NwTbUgqmzlTev8WYb24a6b8-_Rol6rj18WJEhv2bc=.872bca0d-3597-4511-87c1-d3278856e455@github.com> On Thu, 7 Mar 2024 08:13:52 GMT, Suchismith Roy wrote: >> Remove redefinition of NULL for XLC compiler >> In globalDefinitions_xlc.hpp there is a redefinition of NULL to work around an issue with one of the AIX headers (). >> Once all uses of NULL in HotSpot have been replaced with nullptr, there is no longer any need for this, and it can be removed. >> >> >> JBS ISSUE : [JDK-8324682](https://bugs.openjdk.org/browse/JDK-8324682) > > Suchismith Roy has updated the pull request incrementally with one additional commit since the last revision: > > update comment and copyright Changes requested by clanger (Reviewer). src/hotspot/share/utilities/globalDefinitions_xlc.hpp line 2: > 1: /* > 2: * Copyright (c) 1998, 2023, Oracle and/or its affiliates. All rights reserved. You need to update the year for the Oracle copyright. 
Adding additional copyright lines, such as IBM, is usually only done when you substantially change a file, i.e. add larger non-trivial portions of code. ------------- PR Review: https://git.openjdk.org/jdk/pull/18064#pullrequestreview-1922127498 PR Review Comment: https://git.openjdk.org/jdk/pull/18064#discussion_r1515961111 From tschatzl at openjdk.org Thu Mar 7 11:21:12 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 7 Mar 2024 11:21:12 GMT Subject: RFR: 8289822: Make concurrent mark code owner of TAMSes Message-ID: Hi all, please review this change that moves TAMS ownership and handling out of `HeapRegion` into `G1ConcurrentMark`. The rationale is that it's intrinsically a data structure only used by the marking, and only interesting during marking, so attaching it to the always present `HeapRegion` isn't fitting. This also moves some code that is about marking decisions out of `HeapRegion`. In this change the TAMSes are still maintained and kept up to date as long as the `HeapRegion` exists (basically always) as the snapshotting of the marking does not really snapshot ("copy over the `HeapRegion`s that existed at the time of the snapshot, only ever touching them), but assumes that they exist even for uncommitted (or free) regions due to the way it advances the global `finger` pointer. I did not want to change this here. This is also why `HeapRegion` still tells `G1ConcurrentMark` to update the TAMS in two places (when clearing a region and when resetting during full gc). Another option I considered when implementing this is clearing all marking statistics (`G1ConcurrentMark::clear_statistics`) at these places in `HeapRegion` because it decreases the amount of places where this needs to be done (maybe one or the other place isn't really necessary already). I rejected it because this would add some work that is dependent on the number of worker threads (particularly) into `HeapRegion::hr_clear()`. 
This change can be looked at by viewing at all but the last commit. Testing: tier1-4, gha Thanks, Thomas ------------- Commit messages: - Do the mark information reset more fine grained (like before this change) to not potentially make HeapRegion::hr_clear() too slow - Fixes - some attempts to not have TAMSes always be updated for all regions - Remove _top_at_mark_start HeapRegion member - 8326781 Changes: https://git.openjdk.org/jdk/pull/18150/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18150&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8289822 Stats: 197 lines in 16 files changed: 75 ins; 68 del; 54 mod Patch: https://git.openjdk.org/jdk/pull/18150.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18150/head:pull/18150 PR: https://git.openjdk.org/jdk/pull/18150 From bulasevich at openjdk.org Thu Mar 7 11:25:55 2024 From: bulasevich at openjdk.org (Boris Ulasevich) Date: Thu, 7 Mar 2024 11:25:55 GMT Subject: RFR: 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments [v8] In-Reply-To: References: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> Message-ID: <4-qdyQh_Ea011a-GAx7vaOIns98kB-G-LLfgROqaf0A=.b4b5a5f8-d08b-47fa-832d-f49841adce92@github.com> On Tue, 20 Feb 2024 23:12:01 GMT, Evgeny Astigeevich wrote: >> Boris Ulasevich has updated the pull request incrementally with one additional commit since the last revision: >> >> one another cleanup round > > src/hotspot/share/code/codeCache.cpp line 197: > >> 195: }; >> 196: >> 197: static void set_size_of_unset_code_heap(CodeHeapInfo* heap, size_t available_size, size_t known_segments_size, size_t min_size) { > > Suggest renaming: `known_segments_size` -> `used_size`. ok. 
thanks > src/hotspot/share/code/codeCache.cpp line 227: > >> 225: if (!heap_available(CodeBlobType::MethodNonProfiled)) { >> 226: assert(false, "MethodNonProfiled heap is always available for segmented code heap"); >> 227: } > > It's just: > > assert(heap_available(CodeBlobType::MethodNonProfiled), "MethodNonProfiled heap must be always available for segmented code heap"); yes ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17244#discussion_r1515992672 PR Review Comment: https://git.openjdk.org/jdk/pull/17244#discussion_r1515993528 From bulasevich at openjdk.org Thu Mar 7 11:25:57 2024 From: bulasevich at openjdk.org (Boris Ulasevich) Date: Thu, 7 Mar 2024 11:25:57 GMT Subject: RFR: 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments [v7] In-Reply-To: References: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> <_HrFkS2z3Mk1RKtgVB010rsIx7aTolc_0iMDJox6eDc=.de2d7093-186f-48c0-859f-0769bf372b29@github.com> Message-ID: On Mon, 19 Feb 2024 20:38:22 GMT, Evgeny Astigeevich wrote: >> Boris Ulasevich has updated the pull request incrementally with one additional commit since the last revision: >> >> minor update. removed helper function as it caused many comments in the review > > src/hotspot/share/code/codeCache.cpp line 316: > >> 314: // Note: if large page support is enabled, min_size is at least the large >> 315: // page size. This ensures that the code cache is covered by large pages. >> 316: non_nmethod.size = align_up(non_nmethod.size, min_size); > > A potential problem here: code heap sizes are `min_size` aligned but `cache_size` might be not. I added an alignment. thanks > src/hotspot/share/code/codeCache.cpp line 336: > >> 334: // Tier 2 and tier 3 (profiled) methods >> 335: add_heap(profiled_space, "CodeHeap 'profiled nmethods'", CodeBlobType::MethodProfiled); >> 336: } > > A redundant check: the non-profiled code heap is always enabled. 
A redundant check does no harm. More questions would be raised without the check. Let me leave the check (this and one below) and provide a comment.

    if (profiled.enabled) {
      check_min_size("profiled code heap", profiled.size, min_size);
    }
    if (non_profiled.enabled) { // non_profiled.enabled is always ON for segmented code heap, leave it checked for clarity
      check_min_size("non-profiled code heap", non_profiled.size, min_size);
    }
    if (cache_size_set) {
      check_min_size("reserved code cache", cache_size, min_cache_size);
    }

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/17244#discussion_r1515993234
PR Review Comment: https://git.openjdk.org/jdk/pull/17244#discussion_r1515996268

From bulasevich at openjdk.org Thu Mar 7 11:28:56 2024
From: bulasevich at openjdk.org (Boris Ulasevich)
Date: Thu, 7 Mar 2024 11:28:56 GMT
Subject: RFR: 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments [v7]
In-Reply-To:
References: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> <_HrFkS2z3Mk1RKtgVB010rsIx7aTolc_0iMDJox6eDc=.de2d7093-186f-48c0-859f-0769bf372b29@github.com>
Message-ID:

On Tue, 20 Feb 2024 23:06:16 GMT, Evgeny Astigeevich wrote:

>> Let us assume that with the previous calculations we have this layout:
>> - {cache_size=18004KB, non_profiled.size = 6002KB, profiled.size = 6002KB, non_nmethod.size=6000KB}
>>
>> With next align_down for profiled and non_profiled segments, we get:
>> - {cache_size=18004KB, non_profiled.size = 6000KB, profiled.size = 6000KB, non_nmethod.size=6000KB}
>>
>> My intention here is to put 4KB into the non_nmethod segment to make the sum of the segment sizes equal to cache_size:
>> - {cache_size=18004KB, non_profiled.size = 6000KB, profiled.size = 6000KB, non_nmethod.size=6004KB}
>
> Why has `non_nmethod` been chosen? I have not seen it running out of space. It usually has a lot of free space. Also in case of large pages, e.g.
2M, it can get even more space than needed.
> I'd rather give additional space to `profiled` or `non_profiled`. As `profiled` is optional, to simplify the code we can have:
>
> non_nmethod.size = align_up(non_nmethod.size, min_size);
> profiled.size = align_down(profiled.size, min_size);
> non_profiled.size = align_down(cache_size - profiled.size - non_nmethod.size, min_size);
>
>
> For your example, we will have:
>
> {cache_size=18004KB, non_profiled.size = 6004KB, profiled.size = 6000KB, non_nmethod.size=6000KB}

My reasons are: this logic was there before the change, and we know the non_nmethod segment always exists. Although, OK, let the non_profiled segment take the extra space.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/17244#discussion_r1516000632

From adinn at redhat.com Thu Mar 7 11:29:41 2024
From: adinn at redhat.com (Andrew Dinn)
Date: Thu, 7 Mar 2024 11:29:41 +0000
Subject: CFV: New JDK Reviewer: Matias Saveedra Silva
In-Reply-To: <17ab75c7-f8a6-4b93-9388-272e0c33ba79@oracle.com>
References: <17ab75c7-f8a6-4b93-9388-272e0c33ba79@oracle.com>
Message-ID: <2493d63f-b2bb-5da2-bb9d-e7bcb7121dd6@redhat.com>

Vote: yes

On 06/03/2024 21:10, coleen.phillimore at oracle.com wrote:
> I hereby nominate Matias Saveedra Silva (matsaave) to JDK Reviewer.
>
> Matias is a member of the HotSpot runtime team at Oracle. He has
> rewritten how we store resolution of fields, methods and indy
> expressions in the interpreter which will be helpful to the Valhalla
> project. He is currently working on enhancements to Class Data Sharing
> (CDS) and optimizations in the interpreter to support the Leyden
> project. He has contributed over 30 changes [3] to the HotSpot JVM over
> the past year, and lower-case-r reviewed many others in the runtime area.
>
> Votes are due by March 20, 2024.
>
> Only current JDK Reviewers [1] are eligible to vote on this nomination.
> Votes must be cast in the open by replying to this mailing list.
> > For Lazy Consensus voting instructions, see [2]. > > Coleen Phillimore > > [1] https://openjdk.org/census > [2] https://openjdk.org/projects/#reviewer-vote > [3] https://github.com/openjdk/jdk/commits?author=matsaave at openjdk.org -- regards, Andrew Dinn ----------- Red Hat Distinguished Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill From bulasevich at openjdk.org Thu Mar 7 11:34:57 2024 From: bulasevich at openjdk.org (Boris Ulasevich) Date: Thu, 7 Mar 2024 11:34:57 GMT Subject: RFR: 8311248: Refactor CodeCache::initialize_heaps to simplify adding new CodeCache segments [v7] In-Reply-To: References: <-V_dgQKhDrt1n0Zbk3qNa276jZBO822NlSvUv9AhpEA=.7a23c712-4b01-4aa7-acbe-8f5b5cca9002@github.com> <_HrFkS2z3Mk1RKtgVB010rsIx7aTolc_0iMDJox6eDc=.de2d7093-186f-48c0-859f-0769bf372b29@github.com> Message-ID: On Mon, 19 Feb 2024 20:20:55 GMT, Evgeny Astigeevich wrote: >> Boris Ulasevich has updated the pull request incrementally with one additional commit since the last revision: >> >> minor update. removed helper function as it caused many comments in the review > > src/hotspot/share/code/codeCache.cpp line 243: > >> 241: if (!profiled.set && !non_profiled.set) { >> 242: non_profiled.size = profiled.size = (cache_size > non_nmethod.size + 2 * min_size) ? >> 243: (cache_size - non_nmethod.size) / 2 : min_size; > > I prefer to show explicitly with `else if` that it is an alternative case. This prevents from inserting any code between these two IFs. I see. I want to have minimal dependencies in between subsequent lines. Let me keep my variant. 
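
The sizing arithmetic discussed in this review thread can be sketched in standalone C++ as follows. This is only an illustration, not the code in codeCache.cpp: `SegmentSizes`, `default_method_heap_size` and `align_segments` are hypothetical names, the local `align_up`/`align_down` helpers stand in for HotSpot's own, and the sketch assumes the alignment remainder goes to the non-profiled segment, as the worked numbers in the thread imply.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the three code heap segment sizes (in bytes).
struct SegmentSizes {
  std::size_t non_nmethod;
  std::size_t profiled;
  std::size_t non_profiled;
};

// Round down to a multiple of a power-of-two alignment.
inline std::size_t align_down(std::size_t value, std::size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return value & ~(alignment - 1);
}

// Round up to a multiple of a power-of-two alignment.
inline std::size_t align_up(std::size_t value, std::size_t alignment) {
  return align_down(value + alignment - 1, alignment);
}

// Default size used for both method heaps when neither was set explicitly:
// split what is left after non_nmethod in half, or fall back to min_size.
inline std::size_t default_method_heap_size(std::size_t cache_size,
                                            std::size_t non_nmethod_size,
                                            std::size_t min_size) {
  return (cache_size > non_nmethod_size + 2 * min_size)
             ? (cache_size - non_nmethod_size) / 2
             : min_size;
}

// Align the segments so that their sum still equals cache_size.
// Assumption: cache_size is itself min_size aligned and large enough.
inline SegmentSizes align_segments(std::size_t cache_size, SegmentSizes s,
                                   std::size_t min_size) {
  s.non_nmethod = align_up(s.non_nmethod, min_size);
  s.profiled = align_down(s.profiled, min_size);
  assert(cache_size >= s.non_nmethod + s.profiled);
  // The non-profiled segment absorbs whatever the other two gave up.
  s.non_profiled = align_down(cache_size - s.profiled - s.non_nmethod, min_size);
  return s;
}
```

With the numbers from the thread (an 18004KB cache, a 6000KB non_nmethod segment, 4KB alignment), the default split gives 6002KB per method heap, and alignment then yields 6000KB profiled and 6004KB non-profiled, so the three segments still add up to 18004KB.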
Thanks, ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17244#discussion_r1516007432 From mli at openjdk.org Thu Mar 7 12:24:02 2024 From: mli at openjdk.org (Hamlin Li) Date: Thu, 7 Mar 2024 12:24:02 GMT Subject: RFR: 8320646: RISC-V: C2 VectorCastHF2F [v2] In-Reply-To: References: Message-ID: On Mon, 26 Feb 2024 07:30:12 GMT, Fei Yang wrote: >> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: >> >> add some comments > > Seems that the spec is not in ratified status yet? Thanks @RealFYang @luhenry for your reviewing. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17698#issuecomment-1983395798 From mli at openjdk.org Thu Mar 7 12:24:03 2024 From: mli at openjdk.org (Hamlin Li) Date: Thu, 7 Mar 2024 12:24:03 GMT Subject: RFR: 8320646: RISC-V: C2 VectorCastHF2F [v7] In-Reply-To: References: Message-ID: On Thu, 7 Mar 2024 02:54:43 GMT, Fei Yang wrote: >> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: >> >> minor improvements > > test/hotspot/jtreg/compiler/vectorization/TestFloatConversionsVector.java line 31: > >> 29: * @requires (os.simpleArch == "x64" & (vm.cpu.features ~= ".*avx512f.*" | vm.cpu.features ~= ".*f16c.*")) | >> 30: * os.arch == "aarch64" | >> 31: * (os.arch == "riscv64" & vm.cpu.features ~= ".*zvfh.*") > > A small question: I see you used following condition when doing the scalar version: > `(os.arch == "riscv64" & vm.cpu.features ~= ".*zfh,.*")` > There is comma there for matching cpu features. Do we really need this comma? It is not there for this vector version. Seems not necessary currently, unless in the future there is other extension starting with `zfh`, e.g. `zfhX`, but seems that chance is low IMHO. 
And I think the pattern with a comma might not work for the last extension in the CPU feature string (the feature string is converted from a list, so the last entry will not end with a `,`?), so for new and long-named extensions the pattern should be written without a comma.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/17698#discussion_r1516063293

From mli at openjdk.org Thu Mar 7 12:24:03 2024
From: mli at openjdk.org (Hamlin Li)
Date: Thu, 7 Mar 2024 12:24:03 GMT
Subject: RFR: 8320646: RISC-V: C2 VectorCastHF2F [v7]
In-Reply-To:
References:
Message-ID:

On Thu, 7 Mar 2024 12:19:48 GMT, Hamlin Li wrote:

>> test/hotspot/jtreg/compiler/vectorization/TestFloatConversionsVector.java line 31:
>>
>>> 29: * @requires (os.simpleArch == "x64" & (vm.cpu.features ~= ".*avx512f.*" | vm.cpu.features ~= ".*f16c.*")) |
>>> 30: * os.arch == "aarch64" |
>>> 31: * (os.arch == "riscv64" & vm.cpu.features ~= ".*zvfh.*")
>>
>> A small question: I see you used following condition when doing the scalar version:
>> `(os.arch == "riscv64" & vm.cpu.features ~= ".*zfh,.*")`
>> There is comma there for matching cpu features. Do we really need this comma? It is not there for this vector version.
>
> Seems not necessary currently, unless in the future there is other extension starting with `zfh`, e.g. `zfhX`, but seems that chance is low IMHO.
> And I think the pattern with a comma might not work for the last extension in the CPU feature string (the feature string is converted from a list, so the last entry will not end with a `,`?), so for new and long-named extensions the pattern should be written without a comma.

Thanks for testing!
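
The trade-off between the two patterns can be illustrated with a small standalone C++ sketch. It assumes whole-string regex matching (which the leading and trailing `.*` suggest) and uses made-up comma-separated feature strings, not real `vm.cpu.features` output; `features_match` is a hypothetical helper, and `zfhx` is the hypothetical extension name from the discussion above.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns true when the whole feature string matches the pattern.
// Assumption: full-match semantics; the feature strings used with this
// helper are illustrative and do not reproduce the real VM format.
inline bool features_match(const std::string& features,
                           const std::string& pattern) {
  assert(!pattern.empty());
  return std::regex_match(features, std::regex(pattern));
}
```

Under these assumptions, `".*zfh,.*"` matches `"zba,zfh,zvfh"` but misses `"zba,zbb,zfh"`, where the extension comes last and has no trailing comma. The comma-free `".*zfh.*"` finds it in both positions, at the price of also matching a longer name such as `"zba,zfhx"`.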
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/17698#discussion_r1516064255

From mli at openjdk.org Thu Mar 7 12:24:04 2024
From: mli at openjdk.org (Hamlin Li)
Date: Thu, 7 Mar 2024 12:24:04 GMT
Subject: Integrated: 8320646: RISC-V: C2 VectorCastHF2F
In-Reply-To:
References:
Message-ID:

On Sun, 4 Feb 2024 09:38:48 GMT, Hamlin Li wrote:

> Hi,
> Can you review this patch to add intrinsics for VectorCastHF2F/VectorCastF2HF?
> Thanks!
>
> ## Test
>
> test/jdk/java/lang/Float/Binary16ConversionNaN.java
> test/jdk/java/lang/Float/Binary16Conversion.java
>
> hotspot/jtreg/compiler/intrinsics/float16
> hotspot/jtreg/compiler/vectorization/TestFloatConversionsVector.java
> hotspot/jtreg/compiler/vectorization/TestFloatConversionsVectorNaN.java

This pull request has now been integrated.

Changeset: d7273ac8
Author: Hamlin Li
URL: https://git.openjdk.org/jdk/commit/d7273ac8b1ad8bc5d0a17fff5dc941c735fdae24
Stats: 378 lines in 9 files changed: 371 ins; 0 del; 7 mod

8320646: RISC-V: C2 VectorCastHF2F
8320647: RISC-V: C2 VectorCastF2HF

Reviewed-by: luhenry, fyang

-------------

PR: https://git.openjdk.org/jdk/pull/17698

From coleenp at openjdk.org Thu Mar 7 13:21:09 2024
From: coleenp at openjdk.org (Coleen Phillimore)
Date: Thu, 7 Mar 2024 13:21:09 GMT
Subject: RFR: 8308745: ObjArrayKlass::allocate_objArray_klass may call into java while holding a lock [v3]
In-Reply-To:
References:
Message-ID:

> This change creates a new sort of native recursive lock that can be held during JNI and Java calls, which can be used for synchronization while creating objArrayKlasses at runtime.
>
> Passes tier1-7.

Coleen Phillimore has updated the pull request incrementally with two additional commits since the last revision:

 - Missed a word.
 - David's comment fixes.
------------- Changes: - all: https://git.openjdk.org/jdk/pull/17739/files - new: https://git.openjdk.org/jdk/pull/17739/files/d64aa560..88fd6f13 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17739&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17739&range=01-02 Stats: 3 lines in 3 files changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/17739.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17739/head:pull/17739 PR: https://git.openjdk.org/jdk/pull/17739 From coleenp at openjdk.org Thu Mar 7 13:21:10 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 7 Mar 2024 13:21:10 GMT Subject: RFR: 8308745: ObjArrayKlass::allocate_objArray_klass may call into java while holding a lock [v2] In-Reply-To: References: Message-ID: On Thu, 7 Mar 2024 01:38:56 GMT, Coleen Phillimore wrote: >> This change creates a new sort of native recursive lock that can be held during JNI and Java calls, which can be used for synchronization while creating objArrayKlasses at runtime. >> >> Passes tier1-7. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Dean's comments. David, thank you for your discussion while fixing this problem. ------------- PR Review: https://git.openjdk.org/jdk/pull/17739#pullrequestreview-1922394063 From coleenp at openjdk.org Thu Mar 7 13:21:10 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 7 Mar 2024 13:21:10 GMT Subject: RFR: 8308745: ObjArrayKlass::allocate_objArray_klass may call into java while holding a lock [v2] In-Reply-To: References: Message-ID: <1nz_mknyZvGdeVoetWZvFXsagFjJ0p5odDt-Rf_C4pc=.1b9598ef-0b61-4e00-8522-8ad797f1e685@github.com> On Thu, 7 Mar 2024 01:44:22 GMT, David Holmes wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Dean's comments. 
> > src/hotspot/share/oops/instanceKlass.cpp line 1552: > >> 1550: RecursiveLocker rl(MultiArray_lock, THREAD); >> 1551: >> 1552: // This thread is the creator. > > This thread may not be the creator. The original comment was more apt - we typically say something like "Check if another thread beat us to it." Fixed. // Check if another thread created the array klass while we were waiting for the lock. > src/hotspot/share/prims/jvmtiGetLoadedClasses.cpp line 121: > >> 119: { >> 120: // To get a consistent list of classes we need MultiArray_lock to ensure >> 121: // array classes aren't created during this walk. This walks through the > > Just curious but how can walking the set of classes trigger creation of array classes? It can't. I'll reword that comment so it's clearer, with just a couple of words. // To get a consistent list of classes we need MultiArray_lock to ensure // array classes aren't created by another thread this walk. This walks through the // InstanceKlass::_array_klasses links. edit: added "during" to "by another thread during this walk" > src/hotspot/share/runtime/mutexLocker.hpp line 335: > >> 333: >> 334: // Instance of a RecursiveLock that may be held through Java heap allocation, which may include calls to Java, >> 335: // and JNI event notification for resource exhausted for metaspace or heap. > > s/exhausted/exhaustion/ Fixed. 
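
The "another thread beat us to it" check from the instanceKlass.cpp comment can be sketched outside HotSpot as follows. This is a minimal illustration under stated assumptions, not the actual HotSpot code: `ArrayKlassSketch` and `ElementKlassSketch` are hypothetical types, and `std::recursive_mutex` merely stands in for `MultiArray_lock` and its `RecursiveLocker`.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical stand-in for an array class; not a HotSpot type.
struct ArrayKlassSketch {
  int dimension;
};

// Hypothetical stand-in for a class that lazily creates its array class.
class ElementKlassSketch {
 public:
  ~ElementKlassSketch() { delete _array_klass.load(); }

  ArrayKlassSketch* array_klass() {
    // Fast path: the array class was already published, no lock needed.
    ArrayKlassSketch* k = _array_klass.load(std::memory_order_acquire);
    if (k != nullptr) {
      return k;
    }
    std::lock_guard<std::recursive_mutex> guard(_lock);
    // Check if another thread created the array klass while we were
    // waiting for the lock.
    k = _array_klass.load(std::memory_order_acquire);
    if (k == nullptr) {
      k = new ArrayKlassSketch{1};
      _creations.fetch_add(1);
      // Publish only after the object is fully constructed.
      _array_klass.store(k, std::memory_order_release);
    }
    return k;
  }

  // Number of times an array class was actually created (for illustration).
  int creations() const { return _creations.load(); }

 private:
  std::atomic<ArrayKlassSketch*> _array_klass{nullptr};
  std::recursive_mutex _lock;
  std::atomic<int> _creations{0};
};
```

The second load under the lock is what makes the reworded comment accurate: the thread holding the lock is the creator only if nobody else published an array class first.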
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17739#discussion_r1516126218 PR Review Comment: https://git.openjdk.org/jdk/pull/17739#discussion_r1516128248 PR Review Comment: https://git.openjdk.org/jdk/pull/17739#discussion_r1516130552 From coleenp at openjdk.org Thu Mar 7 13:21:10 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 7 Mar 2024 13:21:10 GMT Subject: RFR: 8308745: ObjArrayKlass::allocate_objArray_klass may call into java while holding a lock [v3] In-Reply-To: References: Message-ID: On Thu, 7 Mar 2024 01:31:09 GMT, Coleen Phillimore wrote: >> Semaphore seems simpler/cleaner and ready to use. > > It's such a rare race and unusual condition that we could execute more Java code, that I started out with just a shared bit. Then David suggested a semaphore that obeys the safepoint protocol, which seems a lot better. I've literally skimmed over OSThreadWaitState. It looks like Semaphore::wait_with_a_safepoint_check() uses it. I still don't know why it exists: > > > // Note: the ThreadState is legacy code and is not correctly implemented. > // Uses of ThreadState need to be replaced by the state in the JavaThread. > > enum ThreadState { > > > Does a PlatformMutex handle priority-inheritance? It still feels like it would be overkill for this usage. I hope this RecursiveLocker does not become more widely used, where we would have to replace the simple implementation with something more difficult. We should discourage further use when possible. Thanks for your last comment because I was worried there's a lot of other code I should know about. 
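
For readers unfamiliar with the idea, a recursive lock that tracks an owner and a recursion count can be sketched in portable C++ as below. This is not the RecursiveLocker from the PR: the thread says that one is built on a semaphore that obeys the safepoint protocol, whereas this sketch uses `std::mutex` and `std::condition_variable` and does not model safepoints at all. `RecursiveLockSketch` and `RecursiveLockerSketch` are hypothetical names.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Recursive lock sketch: the internal mutex protects only the bookkeeping
// and is never held while the owner runs its own code, so the owner may
// block or call out while logically holding the lock.
class RecursiveLockSketch {
 public:
  void lock() {
    const std::thread::id self = std::this_thread::get_id();
    std::unique_lock<std::mutex> guard(_state_lock);
    if (_owner == self) {
      ++_recursions;  // re-entry by the current owner
      return;
    }
    // Wait until the current owner has fully released the lock.
    _released.wait(guard, [this] { return _recursions == 0; });
    _owner = self;
    _recursions = 1;
  }

  void unlock() {
    std::unique_lock<std::mutex> guard(_state_lock);
    assert(_owner == std::this_thread::get_id() && _recursions > 0);
    if (--_recursions == 0) {
      _owner = std::thread::id();  // no owner
      guard.unlock();
      _released.notify_one();
    }
  }

  int recursions() {
    std::lock_guard<std::mutex> guard(_state_lock);
    return _recursions;
  }

 private:
  std::mutex _state_lock;
  std::condition_variable _released;
  std::thread::id _owner;
  int _recursions = 0;
};

// RAII helper in the spirit of a scoped locker.
class RecursiveLockerSketch {
 public:
  explicit RecursiveLockerSketch(RecursiveLockSketch& lock) : _lock(lock) {
    _lock.lock();
  }
  ~RecursiveLockerSketch() { _lock.unlock(); }
  RecursiveLockerSketch(const RecursiveLockerSketch&) = delete;
  RecursiveLockerSketch& operator=(const RecursiveLockerSketch&) = delete;

 private:
  RecursiveLockSketch& _lock;
};
```

Nesting two `RecursiveLockerSketch` scopes on the same thread raises the count to 2 and back down to 0. The simplicity is deliberate, in line with the wish expressed above that such a lock stay rare.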
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/17739#discussion_r1516129405

From volker.simonis at gmail.com Thu Mar 7 15:39:34 2024
From: volker.simonis at gmail.com (Volker Simonis)
Date: Thu, 7 Mar 2024 16:39:34 +0100
Subject: CFV: New JDK Reviewer: Matias Saveedra Silva
In-Reply-To: <17ab75c7-f8a6-4b93-9388-272e0c33ba79@oracle.com>
References: <17ab75c7-f8a6-4b93-9388-272e0c33ba79@oracle.com>
Message-ID:

Vote: yes

On Wed, Mar 6, 2024 at 10:10 PM wrote:
>
> I hereby nominate Matias Saveedra Silva (matsaave) to JDK Reviewer.
>
> Matias is a member of the HotSpot runtime team at Oracle. He has rewritten how we store resolution of fields, methods and indy expressions in the interpreter which will be helpful to the Valhalla project. He is currently working on enhancements to Class Data Sharing (CDS) and optimizations in the interpreter to support the Leyden project. He has contributed over 30 changes [3] to the HotSpot JVM over the past year, and lower-case-r reviewed many others in the runtime area.
>
> Votes are due by March 20, 2024.
>
> Only current JDK Reviewers [1] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list.
>
> For Lazy Consensus voting instructions, see [2].
>
> Coleen Phillimore
>
> [1] https://openjdk.org/census
> [2] https://openjdk.org/projects/#reviewer-vote
> [3] https://github.com/openjdk/jdk/commits?author=matsaave at openjdk.org

From calvin.cheung at oracle.com Thu Mar 7 16:34:52 2024
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Thu, 7 Mar 2024 08:34:52 -0800
Subject: CFV: New JDK Reviewer: Matias Saveedra Silva
In-Reply-To: <17ab75c7-f8a6-4b93-9388-272e0c33ba79@oracle.com>
References: <17ab75c7-f8a6-4b93-9388-272e0c33ba79@oracle.com>
Message-ID: <01506e03-b3ed-46f8-9fd8-71bf71e258a5@oracle.com>

Vote: yes

On 3/6/24 1:10 PM, coleen.phillimore at oracle.com wrote:
> I hereby nominate Matias Saveedra Silva (matsaave) to JDK Reviewer.
>
> Matias is a member of the HotSpot runtime team at Oracle. He has
> rewritten how we store resolution of fields, methods and indy
> expressions in the interpreter which will be helpful to the Valhalla
> project. He is currently working on enhancements to Class Data Sharing
> (CDS) and optimizations in the interpreter to support the Leyden
> project. He has contributed over 30 changes [3] to the HotSpot JVM
> over the past year, and lower-case-r reviewed many others in the
> runtime area.
>
> Votes are due by March 20, 2024.
>
> Only current JDK Reviewers [1] are eligible to vote on this
> nomination. Votes must be cast in the open by replying to this
> mailing list.
>
> For Lazy Consensus voting instructions, see [2].
>
> Coleen Phillimore
>
> [1] https://openjdk.org/census
> [2] https://openjdk.org/projects/#reviewer-vote
> [3] https://github.com/openjdk/jdk/commits?author=matsaave at openjdk.org

From ihse at openjdk.org Thu Mar 7 17:09:54 2024
From: ihse at openjdk.org (Magnus Ihse Bursie)
Date: Thu, 7 Mar 2024 17:09:54 GMT
Subject: RFR: 8327389: Remove use of HOTSPOT_BUILD_USER
In-Reply-To:
References: <0xgoI47fdqsPHnjiCtHKO72SeGA_PGu3CAgBYZs-gBc=.8d390c92-1642-4dfd-8527-4b4b36715915@github.com>
Message-ID: <_VYHW1CbGYLclRcXgqUcbViq_jZ2TygGmv0Woq49zXI=.30df5674-40f9-4264-849c-61ddacb0b7b2@github.com>

On Wed, 6 Mar 2024 18:01:00 GMT, Aleksey Shipilev wrote:

> > I agree that it is good to get rid of this. However, the reason it has stuck around for so long is, eh, that it has been around for so long, so it is a bit unclear what or who could be relying on this.
>
> The example I know is to get a signal that test infrastructures actually picked up my ad-hoc binary build, not some other build from somewhere else.
Maybe we should revisit the `git describe` hash idea we kicked along for a while now: https://bugs.openjdk.org/browse/JDK-8274980?focusedId=14452017 -- which would cover that case too.

That bug was resolved with a different patch; presumably it conflated several issues.

There is an inherent conflict between creating a version string that is very much up-to-date and includes ephemeral build data, and creating a robustly reproducible build. If this patch were to remove HOTSPOT_BUILD_USER, and we would later on store `git describe` in the version string, I'm not sure we actually gained anything in terms of reproducibility. :-(

As I see it, given this PR, we have a few different options:

1) We can agree that stable and robust reproducibility trumps any automatic inclusion of build situation metadata, and accept this PR and drop the idea of using `git describe`

2) We can agree that reproducibility has its limits, and we need to include (technically irrelevant) metadata in builds to facilitate development. This means dropping this PR since it is used to verify that the correct build was tested, and possibly also adding `git describe` later on.

3) We can agree to disagree about reproducibility limits, but accept that HOTSPOT_BUILD_USER is silly, and accept this PR, but open a separate JBS issue to discuss if adding `git describe` is desirable, and if it can be done in an opt-in manner, with the understanding that it might not be added if we can't agree on it.

4) We can agree to disagree about reproducibility limits, but accept that HOTSPOT_BUILD_USER is silly, and accept this PR, but require that we first implement a substitute functionality using `git describe`, before we can get rid of HOTSPOT_BUILD_USER.

My personal preference would be 3) above, but I am willing to accept any of the paths.

@gnu-andrew and @shipilev, what do you think? I reckon you are the ones most opinionated about this.
-------------

PR Comment: https://git.openjdk.org/jdk/pull/18136#issuecomment-1984021748

From jesper.wilhelmsson at oracle.com Thu Mar 7 17:12:44 2024
From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson)
Date: Thu, 7 Mar 2024 17:12:44 +0000
Subject: Re: CFV: New HotSpot Group Member: Johan Sjölén
In-Reply-To: <041D6781-CFE6-41BD-A8F5-7E25C2EFFD52@oracle.com>
References: <041D6781-CFE6-41BD-A8F5-7E25C2EFFD52@oracle.com>
Message-ID: <58BC17F3-DBCF-4BE9-9F10-ED55F2DE4045@oracle.com>

Vote: Yes
/Jesper

> On Feb 22, 2024, at 13:14, Jesper Wilhelmsson wrote:
>
> I hereby nominate Johan Sjölén (jsjolen) to Membership in the HotSpot Group.
>
> Johan is a Reviewer in the JDK project, and a member of the Oracle JVM Runtime team. Johan has done major work in the Unified Logging framework and is today the de facto owner of that area. He has done major cleanups and bug fixes throughout the JVM, and does a lot of work in the Native Memory Tracking area.
>
> Votes are due by Thursday March 7.
>
> Only current Members of the HotSpot Group [1] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list.
>
> For Lazy Consensus voting instructions, see [2].
>
> Thanks,
> /Jesper
>
> [1] https://openjdk.org/census
> [2] https://openjdk.org/groups/#member-vote

From shade at openjdk.org Thu Mar 7 17:13:56 2024
From: shade at openjdk.org (Aleksey Shipilev)
Date: Thu, 7 Mar 2024 17:13:56 GMT
Subject: RFR: 8327389: Remove use of HOTSPOT_BUILD_USER
In-Reply-To:
References:
Message-ID:

On Wed, 6 Mar 2024 11:57:27 GMT, Andrew John Hughes wrote:

> The HotSpot code has inserted the value of `HOTSPOT_BUILD_USER` into the internal version string since before the JDK was open-sourced, so the reasoning for this is not in the public history. See https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/abstract_vm_version.cpp#L284
>
> ~~~
> $ /usr/lib/jvm/java-21-openjdk/bin/java -Xinternalversion
> OpenJDK 64-Bit Server VM (21.0.2+13) for linux-amd64 JRE (21.0.2+13), built on 2024-01-19T16:39:52Z by "mockbuild" with gcc 8.5.0 20210514 (Red Hat 8.5.0-20)
> ~~~
>
> It is not clear what value this provides. By default, the value is set to the system username, but it can be set to a static value using `--with-build-user`.
>
> In trying to compare builds, this is a source of difference if the builds are produced under different usernames and `--with-build-user` is not set to the same value.
>
> It may also be a source of information leakage from the host system. It is not clear what value it provides to the end user of the build.
>
> This patch proposes removing it altogether, but at least knowing why it needs to be there would be progress from where we are now. We could also consider a middle ground of only setting it for "adhoc" builds, those without set version information, as is the case with the overall version output.
> > With this patch: > > ~~~ > $ ~/builder/23/images/jdk/bin/java -Xinternalversion > OpenJDK 64-Bit Server VM (fastdebug 23-internal-adhoc.andrew.jdk) for linux-amd64 JRE (23-internal-adhoc.andrew.jdk), built on 2024-03-06T00:23:37Z with gcc 13.2.1 20230826 > > $ ~/builder/23/images/jdk/bin/java -XX:ErrorHandlerTest=1 > > $ grep '^vm_info' /localhome/andrew/projects/openjdk/upstream/jdk/hs_err_pid1119824.log > vm_info: OpenJDK 64-Bit Server VM (fastdebug 23-internal-adhoc.andrew.jdk) for linux-amd64 JRE (23-internal-adhoc.andrew.jdk), built on 2024-03-06T00:23:37Z with gcc 13.2.1 20230826 > ~~~ > > The above are from a fastdebug build but I also built a product build with no issues. I am okay with either (3) or (4). That is, I agree `HOTSPOT_BUILD_USER` is silly. I think we can survive without `git describe` thing for a while too. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18136#issuecomment-1984028130 From ihse at openjdk.org Thu Mar 7 17:13:56 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Thu, 7 Mar 2024 17:13:56 GMT Subject: RFR: 8327389: Remove use of HOTSPOT_BUILD_USER In-Reply-To: References: Message-ID: On Wed, 6 Mar 2024 11:57:27 GMT, Andrew John Hughes wrote: > The HotSpot code has inserted the value of `HOTSPOT_BUILD_USER` into the internal version string since before the JDK was open-sourced, so the reasoning for this is not in the public history. See https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/abstract_vm_version.cpp#L284 > > ~~~ > $ /usr/lib/jvm/java-21-openjdk/bin/java -Xinternalversion > OpenJDK 64-Bit Server VM (21.0.2+13) for linux-amd64 JRE (21.0.2+13), built on 2024-01-19T16:39:52Z by "mockbuild" with gcc 8.5.0 20210514 (Red Hat 8.5.0-20) > ~~~ > > It is not clear what value this provides. By default, the value is set to the system username, but it can be set to a static value using `--with-build-user`. 
> > In trying to compare builds, this is a source of difference if the builds are produced under different usernames and `--with-build-user` is not set to the same value. > > It may also be a source of information leakage from the host system. It is not clear what value it provides to the end user of the build. > > This patch proposes removing it altogether, but at least knowing why it needs to be there would be progress from where we are now. We could also consider a middle ground of only setting it for "adhoc" builds, those without set version information, as is the case with the overall version output. > > With this patch: > > ~~~ > $ ~/builder/23/images/jdk/bin/java -Xinternalversion > OpenJDK 64-Bit Server VM (fastdebug 23-internal-adhoc.andrew.jdk) for linux-amd64 JRE (23-internal-adhoc.andrew.jdk), built on 2024-03-06T00:23:37Z with gcc 13.2.1 20230826 > > $ ~/builder/23/images/jdk/bin/java -XX:ErrorHandlerTest=1 > > $ grep '^vm_info' /localhome/andrew/projects/openjdk/upstream/jdk/hs_err_pid1119824.log > vm_info: OpenJDK 64-Bit Server VM (fastdebug 23-internal-adhoc.andrew.jdk) for linux-amd64 JRE (23-internal-adhoc.andrew.jdk), built on 2024-03-06T00:23:37Z with gcc 13.2.1 20230826 > ~~~ > > The above are from a fastdebug build but I also built a product build with no issues. Also, it might be worth repeating one of my long-standing wishes: that the version string should not be hard-coded into the build, but e.g. stored as a string in the `release` file, and read from there. If we did that, the cost of changing the version string would be negligible, and we wouldn't need to worry as much about it. It would also be simple to compare different builds which end up with the same bits since they are built from the same sources, but by different version flags (e.g. -ea vs GA). (In fact, we'd turn a -ea build into a GA just by updating the version string, so we'd know for sure we are publishing what we tested.) 
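As an illustration of the "read the version from the `release` file" idea above, here is a minimal sketch in Java. It assumes only that the `release` file in a JDK image consists of `KEY="value"` lines (for example `JAVA_VERSION="21.0.2"`). The class and method names are hypothetical, and nothing here reflects how HotSpot obtains its version string today.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: parses the KEY="value" lines of a JDK `release` file.
final class ReleaseFileSketch {

    static Map<String, String> parse(String text) {
        Map<String, String> props = new LinkedHashMap<>();
        for (String line : text.split("\\R")) {
            int eq = line.indexOf('=');
            if (eq <= 0) {
                continue; // skip blank or malformed lines
            }
            String key = line.substring(0, eq).trim();
            String value = line.substring(eq + 1).trim();
            // Strip one pair of surrounding double quotes, if present.
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1);
            }
            props.put(key, value);
        }
        return props;
    }

    public static void main(String[] args) {
        // Sample content; a real caller would read <image>/release instead.
        String sample = "JAVA_VERSION=\"21.0.2\"\nJAVA_VERSION_DATE=\"2024-01-16\"\n";
        Map<String, String> props = parse(sample);
        System.out.println(props.getOrDefault("JAVA_VERSION", "unknown"));
    }
}
```

With the version kept in a text file like this, two builds from the same sources could differ only in that file, which is the comparability benefit described above.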
------------- PR Comment: https://git.openjdk.org/jdk/pull/18136#issuecomment-1984028842 From fthevenet at openjdk.org Thu Mar 7 17:34:55 2024 From: fthevenet at openjdk.org (Frederic Thevenet) Date: Thu, 7 Mar 2024 17:34:55 GMT Subject: RFR: 8327389: Remove use of HOTSPOT_BUILD_USER In-Reply-To: References: Message-ID: On Thu, 7 Mar 2024 17:11:18 GMT, Magnus Ihse Bursie wrote: > Also, it might be worth repeating one of my long-standing wishes: that the version string should not be hard-coded into the build, but e.g. stored as a string in the `release` file, and read from there. If we did that, the cost of changing the version string would be negligible, and we wouldn't need to worry as much about it. It would also be simple to compare different builds which end up with the same bits since they are built from the same sources, but by different version flags (e.g. -ea vs GA). (In fact, we'd turn a -ea build into a GA just by updating the version string, so we'd know for sure we are publishing what we tested.) +1 This would be a great step in making comparability (beyond reproducibility) of builds of OpenJDK a lot simpler. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18136#issuecomment-1984065312 From lois.foltan at oracle.com Thu Mar 7 17:56:41 2024 From: lois.foltan at oracle.com (Lois Foltan) Date: Thu, 7 Mar 2024 17:56:41 +0000 Subject: CFV: New JDK Reviewer: Matias Saveedra Silva In-Reply-To: <17ab75c7-f8a6-4b93-9388-272e0c33ba79@oracle.com> References: <17ab75c7-f8a6-4b93-9388-272e0c33ba79@oracle.com> Message-ID: Vote: yes Lois From: hotspot-dev on behalf of coleen.phillimore at oracle.com Date: Wednesday, March 6, 2024 at 4:10 PM To: hotspot-dev at openjdk.org Subject: CFV: New JDK Reviewer: Matias Saveedra Silva I hereby nominate Matias Saveedra Silva (matsaave) to JDK Reviewer. Matias is a member of the HotSpot runtime team at Oracle.
He has rewritten how we store resolution of fields, methods and indy expressions in the interpreter which will be helpful to the Valhalla project. He is currently working on enhancements to Class Data Sharing (CDS) and optimizations in the interpreter to support the Leyden project. He has contributed over 30 changes [3] to the HotSpot JVM over the past year, and lower-case-r reviewed many others in the runtime area. Votes are due by March 20, 2024. Only current JDK Reviewers [1] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. For Lazy Consensus voting instructions, see [2]. Coleen Phillimore [1] https://openjdk.org/census [2] https://openjdk.org/projects/#reviewer-vote [3] https://github.com/openjdk/jdk/commits?author=matsaave at openjdk.org From roberto.castaneda.lozano at oracle.com Thu Mar 7 18:28:47 2024 From: roberto.castaneda.lozano at oracle.com (Roberto Castaneda Lozano) Date: Thu, 7 Mar 2024 18:28:47 +0000 Subject: Re: CFV: New HotSpot Group Member: Johan Sjölén In-Reply-To: <041D6781-CFE6-41BD-A8F5-7E25C2EFFD52@oracle.com> References: <041D6781-CFE6-41BD-A8F5-7E25C2EFFD52@oracle.com> Message-ID: Vote: yes ________________________________________ From: hotspot-dev on behalf of Jesper Wilhelmsson Sent: Thursday, February 22, 2024 1:14 PM To: hotspot-dev at openjdk.org Subject: CFV: New HotSpot Group Member: Johan Sjölén I hereby nominate Johan Sjölén (jsjolen) to Membership in the HotSpot Group. Johan is a Reviewer in the JDK project, and a member of the Oracle JVM Runtime team. Johan has done major work in the Unified Logging framework and is today the de facto owner of that area. He has done major cleanups and bug fixes throughout the JVM, and does a lot of work in the Native Memory Tracking area. Votes are due by Thursday March 7.
Only current Members of the HotSpot Group [1] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. For Lazy Consensus voting instructions, see [2]. Thanks, /Jesper [1] https://openjdk.org/census [2] https://openjdk.org/groups/#member-vote From johan.sjolen at oracle.com Thu Mar 7 19:09:21 2024 From: johan.sjolen at oracle.com (Johan Sjolen) Date: Thu, 7 Mar 2024 19:09:21 +0000 Subject: CFV: New JDK Reviewer: Matias Saveedra Silva In-Reply-To: <17ab75c7-f8a6-4b93-9388-272e0c33ba79@oracle.com> References: <17ab75c7-f8a6-4b93-9388-272e0c33ba79@oracle.com> Message-ID: Vote: yes ________________________________ From: hotspot-dev on behalf of coleen.phillimore at oracle.com Sent: Wednesday, March 6, 2024 22:10 To: hotspot-dev at openjdk.org Subject: CFV: New JDK Reviewer: Matias Saveedra Silva I hereby nominate Matias Saveedra Silva (matsaave) to JDK Reviewer. Matias is a member of the HotSpot runtime team at Oracle. He has rewritten how we store resolution of fields, methods and indy expressions in the interpreter which will be helpful to the Valhalla project. He is currently working on enhancements to Class Data Sharing (CDS) and optimizations in the interpreter to support the Leyden project. He has contributed over 30 changes [3] to the HotSpot JVM over the past year, and lower-case-r reviewed many others in the runtime area. Votes are due by March 20, 2024. Only current JDK Reviewers [1] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. For Lazy Consensus voting instructions, see [2]. Coleen Phillimore [1] https://openjdk.org/census [2] https://openjdk.org/projects/#reviewer-vote [3] https://github.com/openjdk/jdk/commits?author=matsaave at openjdk.org
From andrew at openjdk.org Thu Mar 7 19:37:55 2024 From: andrew at openjdk.org (Andrew John Hughes) Date: Thu, 7 Mar 2024 19:37:55 GMT Subject: RFR: 8327389: Remove use of HOTSPOT_BUILD_USER In-Reply-To: <_VYHW1CbGYLclRcXgqUcbViq_jZ2TygGmv0Woq49zXI=.30df5674-40f9-4264-849c-61ddacb0b7b2@github.com> References: <0xgoI47fdqsPHnjiCtHKO72SeGA_PGu3CAgBYZs-gBc=.8d390c92-1642-4dfd-8527-4b4b36715915@github.com> <_VYHW1CbGYLclRcXgqUcbViq_jZ2TygGmv0Woq49zXI=.30df5674-40f9-4264-849c-61ddacb0b7b2@github.com> Message-ID: <-NDzF1_xs06l1V1ZhUbWvJtNNolSByasBTwrAiszrtQ=.1550d475-a8a2-4211-b5d3-c18b93f54e5a@github.com> On Thu, 7 Mar 2024 17:07:05 GMT, Magnus Ihse Bursie wrote: > There is an inherent conflict with creating a version string that is very much up-to-date and includes ephemeral build data, and creating a robustly reproducible build. If this patch were to remove HOTSPOT_BUILD_USER, and we would later on store `git describe` in the version string, I'm not sure we actually gained anything in terms of reproducibility. :-( It's worth remembering that there are essentially two potential forms of version string at play; those for "adhoc" builds and those for release builds, which depends on whether `NO_DEFAULT_VERSION_PARTS` ends up being set to `true` or not. This change does nothing to remove the username used in the "adhoc" version of `--with-version-opt` i.e.
`adhoc.$USERNAME.$basedirname`: From the patched build: ~~~ $ /home/andrew/builder/23/images/jdk/bin/java -version openjdk version "23-internal" 2024-09-17 OpenJDK Runtime Environment (fastdebug build 23-internal-adhoc.andrew.jdk) OpenJDK 64-Bit Server VM (fastdebug build 23-internal-adhoc.andrew.jdk, mixed mode, sharing) $ /home/andrew/builder/23/images/jdk/bin/java -Xinternalversion OpenJDK 64-Bit Server VM (fastdebug 23-internal-adhoc.andrew.jdk) for linux-amd64 JRE (23-internal-adhoc.andrew.jdk), built on 2024-03-06T00:23:37Z with gcc 13.2.1 20230826 ~~~ What we aim to get rid of is the duplicate username copy which trickles into the `-Xinternalversion` via `HOTSPOT_BUILD_USER`. From an unpatched build of 22: ~~~ $ /home/andrew/build/openjdk22/bin/java -Xinternalversion OpenJDK 64-Bit Server VM (22.0.1-internal-adhoc.andrew.jdk) for linux-amd64 JRE (22.0.1-internal-adhoc.andrew.jdk), built on 2024-03-05T19:52:35Z by "andrew" with gcc 13.2.1 20230826 ~~~ The only difference between the two is my username is there twice, instead of thrice. The problem with the `HOTSPOT_BUILD_USER` one is that it also turns up when you do a build for release: ~~~ $ /usr/lib/jvm/java-21-openjdk/bin/java -Xinternalversion OpenJDK 64-Bit Server VM (21.0.2+13) for linux-amd64 JRE (21.0.2+13), built on 2024-01-19T16:39:52Z by "mockbuild" with gcc 8.5.0 20210514 (Red Hat 8.5.0-20) ~~~ Because `NO_DEFAULT_VERSION_PARTS` has been set, the two instances of the username from the adhoc build string are gone. However, the one from `HOTSPOT_BUILD_USER` being set to `mockbuild` still remains. Regarding the conflict between very accurate version information and reproducibility, it is possible to support both scenarios by having "adhoc" builds provide one and release builds the other. The problem with `HOTSPOT_BUILD_USER` as it stands is it ends up in both. I don't see an issue with including `git describe` output in adhoc builds.
It may well not even be a problem for reproducibility in release builds as they are likely built from a set tag without local changes or even from a bare source tree without repository information at all. It does seem like a separate issue from this PR though. > 3. We can agree to disagree about reproducibility limits, but accept that HOTSPOT_BUILD_USER is silly, and accept this PR, but open a separate JBS issue to discuss if adding `git describe` is desirable, and if it can be done in an opt-in manner, with the understanding that it might not be added if we can't agree on it. > > > My personal preference would be 3) above, but I am willing to accept any of the paths. > > @gnu-andrew and @shipilev, what do you think? I reckon you are the one most opinionated about this. I would go for 3. As I say, I don't see why this change would depend on a `git describe` change. Would you still prefer to hardcode a value for `HOTSPOT_BUILD_USER` or remove it altogether? I have a slight preference for the cleanliness of removing it altogether, especially as this is early in the 23 development cycle, so there is some time to see how it pans out. Alternatively, we can hardcode it to `unknown` (the current default if `HOTSPOT_BUILD_USER` is undefined). Not only is this accurate (we don't know who the user was as we haven't recorded the information) but it is already a possible value for compatibility. Changing to this would simply be a matter of removing only the `#ifndef` in `abstract_vm_version.cpp` and leaving the actual version string alone. Looking further ahead, it would be preferable for us to backport this to 21, as that is where we are trying to achieve comparability between Red Hat & Temurin builds, but there is no rush for that and we could even hardcode it only in the backport. As regards testing, I have built this in release and fastdebug builds and checked the output where it is used in `-Xinternalversion` and the `vm_info` line of a crash dump.
Like David, I can't see any tests that pertain to this. The only ones that mention `-Xinternalversion` are `test/hotspot/jtreg/runtime/CommandLine/TestNullTerminatedFlags.java` and `test/hotspot/jtreg/sanity/BasicVMTest.java` and both of those only check the option works, not its output. This seems correct for something that is *internal* version data. To me, relying on a certain structure for this is something I would regard as a bug. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18136#issuecomment-1984286305 From patricio.chilano.mateo at oracle.com Thu Mar 7 19:42:40 2024 From: patricio.chilano.mateo at oracle.com (Patricio Chilano Mateo) Date: Thu, 7 Mar 2024 14:42:40 -0500 Subject: CFV: New JDK Reviewer: Matias Saveedra Silva In-Reply-To: References: <17ab75c7-f8a6-4b93-9388-272e0c33ba79@oracle.com> Message-ID: Vote: yes On Wed, Mar 6, 2024 at 10:10 PM wrote: > I hereby nominate Matias Saveedra Silva (matsaave) to JDK Reviewer. > > Matias is a member of the HotSpot runtime team at Oracle. He has rewritten how we store resolution of fields, methods and indy expressions in the interpreter which will be helpful to the Valhalla project. He is currently working on enhancements to Class Data Sharing (CDS) and optimizations in the interpreter to support the Leyden project. He has contributed over 30 changes [3] to the HotSpot JVM over the past year, and lower-case-r reviewed many others in the runtime area. > > Votes are due by March 20, 2024. > > Only current JDK Reviewers [1] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [2].
> > Coleen Phillimore > > [1] https://openjdk.org/census > [2] https://openjdk.org/projects/#reviewer-vote > [3] https://github.com/openjdk/jdk/commits?author=matsaave at openjdk.org From andrew at openjdk.org Thu Mar 7 19:55:55 2024 From: andrew at openjdk.org (Andrew John Hughes) Date: Thu, 7 Mar 2024 19:55:55 GMT Subject: RFR: 8327389: Remove use of HOTSPOT_BUILD_USER In-Reply-To: References: Message-ID: On Thu, 7 Mar 2024 17:32:02 GMT, Frederic Thevenet wrote: > Also, it might be worth repeating one of my long-standing wishes: that the version string should not be hard-coded into the build, but e.g. stored as a string in the `release` file, and read from there. If we did that, the cost of changing the version string would be negligible, and we wouldn't need to worry as much about it. It would also be simple to compare different builds which end up with the same bits since they are built from the same sources, but by different version flags (e.g. -ea vs GA). (In fact, we'd turn a -ea build into a GA just by updating the version string, so we'd know for sure we are publishing what we tested.) This certainly sounds like it has the potential to solve a lot of these kind of problems. I would point out that, if you can flip the EA status in the text file, someone could also easily masquerade a build as something completely different from what it is. However, it is already possible to create a build like this via the `configure` options so it's really only a slight change in accessibility. 
For example, if I specify `--with-version-feature=11`, I can produce: ~~~ $ /home/andrew/builder/fake11/images/jdk/bin/java -version openjdk version "11-internal" 2024-09-17 OpenJDK Runtime Environment (fastdebug build 11-internal-adhoc.andrew.jdk) OpenJDK 64-Bit Server VM (fastdebug build 11-internal-adhoc.andrew.jdk, mixed mode, sharing) $ /home/andrew/builder/fake11/images/jdk/bin/java -Xinternalversion OpenJDK 64-Bit Server VM (fastdebug 11-internal-adhoc.andrew.jdk) for linux-amd64 JRE (11-internal-adhoc.andrew.jdk), built on 2024-03-07T19:41:08Z with gcc 13.2.1 20230826 ~~~ despite the fact that the source code I've built is actually an in-development JDK 23. All we'd be doing is moving that into a text file. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18136#issuecomment-1984317603 From erikj at openjdk.org Thu Mar 7 20:50:54 2024 From: erikj at openjdk.org (Erik Joelsson) Date: Thu, 7 Mar 2024 20:50:54 GMT Subject: RFR: 8327389: Remove use of HOTSPOT_BUILD_USER In-Reply-To: References: Message-ID: On Wed, 6 Mar 2024 11:57:27 GMT, Andrew John Hughes wrote: > The HotSpot code has inserted the value of `HOTSPOT_BUILD_USER` into the internal version string since before the JDK was open-sourced, so the reasoning for this is not in the public history. See https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/abstract_vm_version.cpp#L284 > > ~~~ > $ /usr/lib/jvm/java-21-openjdk/bin/java -Xinternalversion > OpenJDK 64-Bit Server VM (21.0.2+13) for linux-amd64 JRE (21.0.2+13), built on 2024-01-19T16:39:52Z by "mockbuild" with gcc 8.5.0 20210514 (Red Hat 8.5.0-20) > ~~~ > > It is not clear what value this provides. By default, the value is set to the system username, but it can be set to a static value using `--with-build-user`. > > In trying to compare builds, this is a source of difference if the builds are produced under different usernames and `--with-build-user` is not set to the same value. 
> > It may also be a source of information leakage from the host system. It is not clear what value it provides to the end user of the build. > > This patch proposes removing it altogether, but at least knowing why it needs to be there would be progress from where we are now. We could also consider a middle ground of only setting it for "adhoc" builds, those without set version information, as is the case with the overall version output. > > With this patch: > > ~~~ > $ ~/builder/23/images/jdk/bin/java -Xinternalversion > OpenJDK 64-Bit Server VM (fastdebug 23-internal-adhoc.andrew.jdk) for linux-amd64 JRE (23-internal-adhoc.andrew.jdk), built on 2024-03-06T00:23:37Z with gcc 13.2.1 20230826 > > $ ~/builder/23/images/jdk/bin/java -XX:ErrorHandlerTest=1 > > $ grep '^vm_info' /localhome/andrew/projects/openjdk/upstream/jdk/hs_err_pid1119824.log > vm_info: OpenJDK 64-Bit Server VM (fastdebug 23-internal-adhoc.andrew.jdk) for linux-amd64 JRE (23-internal-adhoc.andrew.jdk), built on 2024-03-06T00:23:37Z with gcc 13.2.1 20230826 > ~~~ > > The above are from a fastdebug build but I also built a product build with no issues. I agree with Andrew. We aren't removing the username from adhoc builds, just the copy in the hotspot internal version string, so the concern about that part seems baseless to me. I don't see the connection to the `git describe` discussion, I think that should move to a separate thread. I definitely don't think we need to add a fake HOTSPOT_BUILD_USER to the _internal_ hotspot version string. Let's just get rid of it by accepting this PR.
------------- PR Comment: https://git.openjdk.org/jdk/pull/18136#issuecomment-1984415570 From iklam at openjdk.org Thu Mar 7 21:00:03 2024 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 7 Mar 2024 21:00:03 GMT Subject: RFR: 8327138: Clean up status management in cdsConfig.hpp and CDS.java [v3] In-Reply-To: References: Message-ID: On Thu, 7 Mar 2024 06:50:04 GMT, David Holmes wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> more alignment > > src/hotspot/share/cds/cdsConfig.cpp line 2: > >> 1: /* >> 2: * Copyright (c) 2024, Oracle and/or its affiliates. All rights reserved. > > Incorrect copyright update - should be "2023, 2024," Fixed. > src/hotspot/share/cds/cdsConfig.cpp line 54: > >> 52: (is_dumping_static_archive() ? IS_DUMPING_STATIC_ARCHIVE : 0) | >> 53: (is_logging_lambda_form_invokers() ? IS_LOGGING_LAMBDA_FORM_INVOKERS : 0) | >> 54: (is_using_archive() ? IS_USING_ARCHIVE : 0); > > You can remove one space before the ? in each line Fixed. > src/hotspot/share/cds/cdsConfig.hpp line 56: > >> 54: static const int IS_DUMPING_STATIC_ARCHIVE = 1 << 1; >> 55: static const int IS_LOGGING_LAMBDA_FORM_INVOKERS = 1 << 2; >> 56: static const int IS_USING_ARCHIVE = 1 << 3; > > why is the = sign so far away? Fixed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18095#discussion_r1516811557 PR Review Comment: https://git.openjdk.org/jdk/pull/18095#discussion_r1516811622 PR Review Comment: https://git.openjdk.org/jdk/pull/18095#discussion_r1516811711 From iklam at openjdk.org Thu Mar 7 21:00:01 2024 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 7 Mar 2024 21:00:01 GMT Subject: RFR: 8327138: Clean up status management in cdsConfig.hpp and CDS.java [v4] In-Reply-To: References: Message-ID: <17levG82EcbT5zXdxDurs1tNJ07sWRXxtZqBphDPGJE=.acf2e6c0-24fd-418d-ac03-96b2047c4837@github.com> > A few clean ups: > > 1. 
Rename functions like `is_loading_full_module_graph()` to `is_using_full_module_graph()`. The meaning of "loading" is not clear: it might be interpreted as covering only the period where the artifact is being loaded, but not the period after the artifact is completely loaded. However, the function is meant to cover both periods, so "using" is a more precise term. > > 2. The cumbersome sounding `disable_loading_full_module_graph()` is changed to `stop_using_full_module_graph()`, etc. > > 3. The status of `is_using_optimized_module_handling()` is moved from metaspaceShared.hpp to cdsConfig.hpp, to be consolidated with other types of CDS status. > > 4. The status of CDS was communicated to the Java class `jdk.internal.misc.CDS` by ad-hoc native methods. This is now changed to a single method, `CDS.getCDSConfigStatus()` that returns a bit field. That way we don't need to add a new native method for each type of status. > > 5. `CDS.isDumpingClassList()` was a misnomer. It's changed to `CDS.isLoggingLambdaFormInvokers()`.
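To make item 4 concrete, here is a minimal sketch in Java of packing several boolean statuses into one bit field and decoding them on the other side. The values of bits 1 to 3 match the `cdsConfig.hpp` lines quoted in the review earlier in this thread. The name and value of bit 0, the class name, and the `pack`/`isSet` helpers are assumptions for illustration only and are not the actual `jdk.internal.misc.CDS` code.

```java
// Illustrative only: mirrors the idea of a single status word shared by VM and Java code.
final class CdsStatusSketch {
    static final int IS_DUMPING_ARCHIVE              = 1 << 0; // assumed; not shown in the quoted diff
    static final int IS_DUMPING_STATIC_ARCHIVE       = 1 << 1;
    static final int IS_LOGGING_LAMBDA_FORM_INVOKERS = 1 << 2;
    static final int IS_USING_ARCHIVE                = 1 << 3;

    // Producer side: OR together one flag per enabled status.
    static int pack(boolean dumping, boolean dumpingStatic, boolean logging, boolean using) {
        return (dumping       ? IS_DUMPING_ARCHIVE              : 0) |
               (dumpingStatic ? IS_DUMPING_STATIC_ARCHIVE       : 0) |
               (logging       ? IS_LOGGING_LAMBDA_FORM_INVOKERS : 0) |
               (using         ? IS_USING_ARCHIVE                : 0);
    }

    // Consumer side: test a single flag.
    static boolean isSet(int status, int flag) {
        return (status & flag) != 0;
    }

    public static void main(String[] args) {
        int status = pack(false, true, false, true);
        System.out.println("using archive: " + isSet(status, IS_USING_ARCHIVE));
        System.out.println("logging invokers: " + isSet(status, IS_LOGGING_LAMBDA_FORM_INVOKERS));
    }
}
```

Adding a further status under this scheme means adding one constant on each side rather than a new native method.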
Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: @dholmes-ora comments -- white spaces and copyright ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18095/files - new: https://git.openjdk.org/jdk/pull/18095/files/5a28f865..c8f08f7f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18095&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18095&range=02-03 Stats: 13 lines in 3 files changed: 0 ins; 0 del; 13 mod Patch: https://git.openjdk.org/jdk/pull/18095.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18095/head:pull/18095 PR: https://git.openjdk.org/jdk/pull/18095 From coleenp at openjdk.org Thu Mar 7 21:05:55 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 7 Mar 2024 21:05:55 GMT Subject: RFR: 8314508: Improve how relativized pointers are printed by frame::describe [v2] In-Reply-To: References: Message-ID: On Tue, 5 Mar 2024 13:07:02 GMT, Fredrik Bredberg wrote: >> The output from frame::describe has been improved so that it now prints the actual derelativized pointer value, regardless if it's on the stack or on the heap. It also clearly shows which members of the fixed stackframe are relativized. 
See sample output of a riscv64 stackframe below: >> >> >> [6.693s][trace][continuations] 0x000000409e14cc88: 0x00000000eeceaf58 locals for #10 >> [6.693s][trace][continuations] local 0 >> [6.693s][trace][continuations] oop for #10 >> [6.693s][trace][continuations] 0x000000409e14cc80: 0x00000000eec34548 local 1 >> [6.693s][trace][continuations] oop for #10 >> [6.693s][trace][continuations] 0x000000409e14cc78: 0x00000000eeceac58 local 2 >> [6.693s][trace][continuations] oop for #10 >> [6.693s][trace][continuations] 0x000000409e14cc70: 0x00000000eee03eb0 #10 method java.lang.invoke.LambdaForm$DMH/0x0000000050001000.invokeVirtual(Ljava/lang/Object;Ljava/lang/Object;)V @ 10 >> [6.693s][trace][continuations] - 3 locals 3 max stack >> [6.693s][trace][continuations] - codelet: return entry points >> [6.693s][trace][continuations] saved fp >> [6.693s][trace][continuations] 0x000000409e14cc68: 0x00000040137dc790 return address >> [6.693s][trace][continuations] 0x000000409e14cc60: 0x000000409e14ccf0 >> [6.693s][trace][continuations] 0x000000409e14cc58: 0x000000409e14cc60 interpreter_frame_sender_sp >> [6.693s][trace][continuations] 0x000000409e14cc50: 0x000000409e14cc00 interpreter_frame_last_sp (relativized: fp-14) >> [6.693s][trace][continuations] 0x000000409e14cc48: 0x0000004050401a88 interpreter_frame_method >> [6.693s][trace][continuations] 0x000000409e14cc40: 0x0000000000000000 interpreter_frame_mdp >> [6.693s][trace][continuations] 0x000000409e14cc38: 0x000000409e14cbe0 interpreter_frame_extended_sp (relativized: fp-18) >> [6.693s][trace][continuations] 0x000000409e14cc30: 0x00000000ef093110 interpreter_frame_mirror >> [6.693s][trace][continuations] oop for #10 >> [6.693s][trace][continuations] 0x000000409e14cc28: 0x0000004050401c68 interpreter_frame_cache >> [6.693s][trace][continuations] 0x000000409e14cc20: 0x000000409e14cc88 interpreter_frame_locals (relativized: fp+3) >> [6.693s][trace][continuatio... 
> > Fredrik Bredberg has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Merge branch 'master' into 8314508_relativized_ptr_frame_describe > - Small white space fix > - 8314508: Improve how relativized pointers are printed by frame::describe I don't know this code that well but I had a couple of questions. src/hotspot/share/runtime/frame.cpp line 1634: > 1632: st->print_cr(" %s %s %s", spacer, spacer, fv.description); > 1633: } else { > 1634: if (*fv.description == '#' && isdigit(fv.description[1])) { Is it possible for fv.description to be null? It's sort of unclear from the context. There are several pointer dereferences here. Can you add a comment about what the format of this is? What is fv.description supposed to be? # And fv.location is the relativized offset, if fv.description is NOT #? This really does need some comment. And maybe a local variable to say what *fv.location is. And after testing the # output, the code still has to see if fv.location is between -100 and 100? Should it not just check fp != nullptr ?
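The `(relativized: fp-14)` style annotations in the sample riscv64 output quoted above can be checked by hand: a relativized slot stores a word offset from the frame pointer, and the printed value is `fp + offset * wordSize`. The sketch below redoes that arithmetic in Java. The frame pointer value `0x000000409e14cc70` is inferred from the three sample values rather than stated in the output, the 8-byte word size is the usual one for a 64-bit target, and the class and method names are hypothetical.

```java
// Worked arithmetic for the quoted frame::describe sample; not HotSpot code.
final class RelativizedPointerSketch {
    static final int WORD_SIZE = 8; // bytes per stack slot on a 64-bit platform

    // Turns a word offset relative to fp back into an absolute address.
    static long derelativize(long fp, long offsetInWords) {
        return fp + offsetInWords * WORD_SIZE;
    }

    public static void main(String[] args) {
        long fp = 0x000000409e14cc70L; // inferred from the sample, see lead-in
        System.out.printf("last_sp     (fp-14): 0x%016x%n", derelativize(fp, -14));
        System.out.printf("extended_sp (fp-18): 0x%016x%n", derelativize(fp, -18));
        System.out.printf("locals      (fp+3):  0x%016x%n", derelativize(fp, 3));
    }
}
```

The three results agree with the values shown for `interpreter_frame_last_sp`, `interpreter_frame_extended_sp` and `interpreter_frame_locals` in the trace.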
------------- PR Review: https://git.openjdk.org/jdk/pull/18102#pullrequestreview-1923498105 PR Review Comment: https://git.openjdk.org/jdk/pull/18102#discussion_r1516797746 From fparain at openjdk.org Thu Mar 7 21:53:00 2024 From: fparain at openjdk.org (Frederic Parain) Date: Thu, 7 Mar 2024 21:53:00 GMT Subject: RFR: 8308745: ObjArrayKlass::allocate_objArray_klass may call into java while holding a lock [v3] In-Reply-To: References: Message-ID: <5QOZjKj7BrbeWDGgXDHbS3q75P5VEoC7wEcJVHRlzxM=.b24266e1-9490-4351-a876-afb3a54840e7@github.com> On Thu, 7 Mar 2024 13:21:09 GMT, Coleen Phillimore wrote: >> This change creates a new sort of native recursive lock that can be held during JNI and Java calls, which can be used for synchronization while creating objArrayKlasses at runtime. >> >> Passes tier1-7. > > Coleen Phillimore has updated the pull request incrementally with two additional commits since the last revision: > > - Missed a word. > - David's comment fixes. LGTM ------------- Marked as reviewed by fparain (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17739#pullrequestreview-1923623357 From ioi.lam at oracle.com Thu Mar 7 22:41:02 2024 From: ioi.lam at oracle.com (ioi.lam at oracle.com) Date: Thu, 7 Mar 2024 14:41:02 -0800 Subject: CFV: New JDK Reviewer: Matias Saveedra Silva In-Reply-To: <17ab75c7-f8a6-4b93-9388-272e0c33ba79@oracle.com> References: <17ab75c7-f8a6-4b93-9388-272e0c33ba79@oracle.com> Message-ID: Vote: Yes On 3/6/24 1:10 PM, coleen.phillimore at oracle.com wrote: > I hereby nominate Matias Saveedra Silva (matsaave) to JDK Reviewer. > > Matias is a member of the HotSpot runtime team at Oracle. He has > rewritten how we store resolution of fields, methods and indy > expressions in the interpreter which will be helpful to the Valhalla > project. He is currently working on enhancements to Class Data Sharing > (CDS) and optimizations in the interpreter to support the Leyden > project. 
He has contributed over 30 changes [3] to the HotSpot JVM > over the past year, and lower-case-r reviewed many others in the > runtime area. > > Votes are due by March 20, 2024. > > Only current JDK Reviewers [1] are eligible to vote on this > nomination. Votes must be cast in the open by replying to this > mailing list. > > For Lazy Consensus voting instructions, see [2]. > > Coleen Phillimore > > [1] https://openjdk.org/census > [2] https://openjdk.org/projects/#reviewer-vote > [3] https://github.com/openjdk/jdk/commits?author=matsaave at openjdk.org From ccheung at openjdk.org Thu Mar 7 23:01:54 2024 From: ccheung at openjdk.org (Calvin Cheung) Date: Thu, 7 Mar 2024 23:01:54 GMT Subject: RFR: 8327138: Clean up status management in cdsConfig.hpp and CDS.java [v4] In-Reply-To: <17levG82EcbT5zXdxDurs1tNJ07sWRXxtZqBphDPGJE=.acf2e6c0-24fd-418d-ac03-96b2047c4837@github.com> References: <17levG82EcbT5zXdxDurs1tNJ07sWRXxtZqBphDPGJE=.acf2e6c0-24fd-418d-ac03-96b2047c4837@github.com> Message-ID: <6FOFK8n6A5Kqpsg6BbEKTaaqiQgzQKfKflbIopBkD8A=.b58cff05-5f6b-46f7-ab55-1075dad877e7@github.com> On Thu, 7 Mar 2024 21:00:01 GMT, Ioi Lam wrote: >> A few clean ups: >> >> 1. Rename functions like `is_loading_full_module_graph()` to `is_using_full_module_graph()`. The meaning of "loading" is not clear: it might be interpreted as covering only the period where the artifact is being loaded, but not the period after the artifact is completely loaded. However, the function is meant to cover both periods, so "using" is a more precise term. >> >> 2. The cumbersome sounding `disable_loading_full_module_graph()` is changed to `stop_using_full_module_graph()`, etc. >> >> 3. The status of `is_using_optimized_module_handling()` is moved from metaspaceShared.hpp to cdsConfig.hpp, to be consolidated with other types of CDS status. >> >> 4.
The status of CDS was communicated to the Java class `jdk.internal.misc.CDS` by ad-hoc native methods. This is now changed to a single method, `CDS.getCDSConfigStatus()` that returns a bit field. That way we don't need to add a new native method for each type of status. >> >> 5. `CDS.isDumpingClassList()` was a misnomer. It's changed to `CDS.isLoggingLambdaFormInvokers()`. > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > @dholmes-ora comments -- white spaces and copyright Marked as reviewed by ccheung (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18095#pullrequestreview-1923734079 From jesper.wilhelmsson at oracle.com Fri Mar 8 01:04:37 2024 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Fri, 8 Mar 2024 01:04:37 +0000 Subject: Result: New HotSpot Group Member: Johan Sjölén Message-ID: <0D109513-BF55-4730-B6EE-7743D301D809@oracle.com> The vote for Johan Sjölén [1] is now closed. Yes: 19 Veto: 0 Abstain: 0 According to the Bylaws definition of Lazy Consensus, this is sufficient to approve the nomination. /Jesper [1] https://mail.openjdk.org/pipermail/hotspot-dev/2024-February/085102.html From sjayagond at openjdk.org Fri Mar 8 05:21:01 2024 From: sjayagond at openjdk.org (Sidraya Jayagond) Date: Fri, 8 Mar 2024 05:21:01 GMT Subject: RFR: 8327652: S390x: Implements SLP support Message-ID: This PR adds SIMD support on s390x.
------------- Commit messages: - Remove extra spcaes - Implements SIMD support on s390x Changes: https://git.openjdk.org/jdk/pull/18162/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18162&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8327652 Stats: 1155 lines in 16 files changed: 1091 ins; 12 del; 52 mod Patch: https://git.openjdk.org/jdk/pull/18162.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18162/head:pull/18162 PR: https://git.openjdk.org/jdk/pull/18162 From jwaters at openjdk.org Fri Mar 8 06:03:55 2024 From: jwaters at openjdk.org (Julian Waters) Date: Fri, 8 Mar 2024 06:03:55 GMT Subject: RFR: 8327389: Remove use of HOTSPOT_BUILD_USER In-Reply-To: References: Message-ID: On Thu, 7 Mar 2024 19:53:35 GMT, Andrew John Hughes wrote: > Also, it might be worth repeating one of my long-standing wishes: that the version string should not be hard-coded into the build, but e.g. stored as a string in the `release` file, and read from there. If we did that, the cost of changing the version string would be negligible, and we wouldn't need to worry as much about it. It would also be simple to compare different builds which end up with the same bits since they are built from the same sources, but by different version flags (e.g. -ea vs GA). (In fact, we'd turn a -ea build into a GA just by updating the version string, so we'd know for sure we are publishing what we tested.) Why not store the version string inside HotSpot, and have it as the one source of truth for the version string so it doesn't need to be hardcoded in other places? 
A text file seems too easy to modify to set the version string to a rubbish value.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18136#issuecomment-1985087876

From sspitsyn at openjdk.org Fri Mar 8 06:05:09 2024
From: sspitsyn at openjdk.org (Serguei Spitsyn)
Date: Fri, 8 Mar 2024 06:05:09 GMT
Subject: RFR: 8256314: JVM TI GetCurrentContendedMonitor is implemented incorrectly [v8]
In-Reply-To: <-Yf2Qj_kbyZAA-LDK96IIOcHWMh8__QT1XfnD9jR8kU=.507d5503-1544-44fb-9e81-438ba3458dcd@github.com>
References: <-Yf2Qj_kbyZAA-LDK96IIOcHWMh8__QT1XfnD9jR8kU=.507d5503-1544-44fb-9e81-438ba3458dcd@github.com>
Message-ID:

> The implementation of the JVM TI `GetCurrentContendedMonitor()` does not match the spec. It can sometimes return incorrect information about the contended monitor. Such behavior does not match the function spec.
> With this update the `GetCurrentContendedMonitor()` is returning the monitor only when the specified thread is waiting to enter or re-enter the monitor, and the monitor is not returned when the specified thread is waiting in the `java.lang.Object.wait` to be notified.
>
> The implementations of the JDWP `ThreadReference.CurrentContendedMonitor` command and JDI `ThreadReference.currentContendedMonitor()` method are based on and depend on this JVMTI function. The JDWP command and the JDI method were specified incorrectly and had incorrect behavior. The fix slightly corrects both the JDWP and JDI specs. The JDWP and JDI implementations have also been fixed because they use this JVM TI update. Please see and review the related CSR and Release-Note.
> > CSR: [8326024](https://bugs.openjdk.org/browse/JDK-8326024): JVM TI GetCurrentContendedMonitor is implemented incorrectly > RN: [8326038](https://bugs.openjdk.org/browse/JDK-8326038): Release Note: JVM TI GetCurrentContendedMonitor is implemented incorrectly > > Testing: > - tested with the mach5 tiers 1-6 Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision: - Merge - review: updated test desciption for two modified tests - review: added new internal function JvmtiEnvBase::get_thread_or_vthread_state - review: replaced timed with timed-out in JDWP and JDI spec - Merge - Merge - fix specs: JDWP ThreadReference.CurrentContendedMonitor command, JDI ThreadReference.currentContendedMonitor method - 8256314: JVM TI GetCurrentContendedMonitor is implemented incorrectly ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17944/files - new: https://git.openjdk.org/jdk/pull/17944/files/289f6ff7..ce8ae8a8 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17944&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17944&range=06-07 Stats: 79955 lines in 822 files changed: 6246 ins; 71952 del; 1757 mod Patch: https://git.openjdk.org/jdk/pull/17944.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17944/head:pull/17944 PR: https://git.openjdk.org/jdk/pull/17944 From sspitsyn at openjdk.org Fri Mar 8 06:11:23 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 8 Mar 2024 06:11:23 GMT Subject: RFR: 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage [v23] In-Reply-To: References: Message-ID: > The implementation of the JVM TI `GetObjectMonitorUsage` does not match the spec. 
> The function returns the following structure:
>
>
> typedef struct {
>     jthread owner;
>     jint entry_count;
>     jint waiter_count;
>     jthread* waiters;
>     jint notify_waiter_count;
>     jthread* notify_waiters;
> } jvmtiMonitorUsage;
>
>
> The following four fields are defined this way:
>
> waiter_count [jint] The number of threads waiting to own this monitor
> waiters [jthread*] The waiter_count waiting threads
> notify_waiter_count [jint] The number of threads waiting to be notified by this monitor
> notify_waiters [jthread*] The notify_waiter_count threads waiting to be notified
>
> The `waiters` has to include all threads waiting to enter the monitor or to re-enter it in `Object.wait()`.
> The implementation also includes the threads waiting to be notified in `Object.wait()` which is wrong.
> The `notify_waiters` has to include all threads waiting to be notified in `Object.wait()`.
> The implementation also includes the threads waiting to re-enter the monitor in `Object.wait()` which is wrong.
> This update makes it right.
>
> The implementation of the JDWP command `ObjectReference.MonitorInfo (5)` is based on the JVM TI `GetObjectMonitorInfo()`. This update has a tweak to keep the existing behavior of this command.
>
> The following JVMTI vmTestbase tests are fixed to adapt to the `GetObjectMonitorUsage()` correct behavior:
>
> jvmti/GetObjectMonitorUsage/objmonusage001
> jvmti/GetObjectMonitorUsage/objmonusage003
>
>
> The following JVMTI JCK tests have to be fixed to adapt to correct behavior:
>
> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00101/gomu00101.html
> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00101/gomu00101a.html
> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00102/gomu00102.html
> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00102/gomu00102a.html
>
>
>
> A JCK bug will be filed and the tests have to be added into the JCK problem list located in the closed repository.
> > Also, please see and review the related CSR: > [8324677](https://bugs.openjdk.org/browse/JDK-8324677): incorrect implementation of JVM TI GetObjectMonitorUsage > > The Release-Note is: > [8325314](https://bugs.openjdk.org/browse/JDK-8325314): Release Note: incorrect implementation of JVM TI GetObjectMonitorUsage > > Testing: > - tested with mach5 tiers 1-6 Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 26 commits: - Merge - review: minor tweak in test description of ObjectMonitorUsage.java - review: addressed more comments on the fix and new test - rename after merge: jvmti_common.h to jvmti_common.hpp - Merge - review: update comment in threads.hpp - fix deadlock with carrier threads starvation in ObjectMonitorUsage test - resolve merge conflict for deleted file objmonusage003.cpp - fix a typo in libObjectMonitorUsage.cpp - fix potential sync gap in the test ObjectMonitorUsage - ... and 16 more: https://git.openjdk.org/jdk/compare/de428daf...b97b8205 ------------- Changes: https://git.openjdk.org/jdk/pull/17680/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17680&range=22 Stats: 1161 lines in 15 files changed: 722 ins; 390 del; 49 mod Patch: https://git.openjdk.org/jdk/pull/17680.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17680/head:pull/17680 PR: https://git.openjdk.org/jdk/pull/17680 From dholmes at openjdk.org Fri Mar 8 06:19:58 2024 From: dholmes at openjdk.org (David Holmes) Date: Fri, 8 Mar 2024 06:19:58 GMT Subject: RFR: 8308745: ObjArrayKlass::allocate_objArray_klass may call into java while holding a lock [v3] In-Reply-To: References: Message-ID: <1rt397QolNEaI0Eixy7hupwNKI2-7Ioz9KvmNCkRs5Y=.fb1b903e-fc48-4ea1-ac1e-4f55fed7f54e@github.com> On Thu, 7 Mar 2024 13:21:09 GMT, Coleen Phillimore wrote: >> This change creates a new sort of native recursive lock that can be held during JNI and Java calls, which can be used for synchronization while creating 
objArrayKlasses at runtime. >> >> Passes tier1-7. > > Coleen Phillimore has updated the pull request incrementally with two additional commits since the last revision: > > - Missed a word. > - David's comment fixes. Marked as reviewed by dholmes (Reviewer). src/hotspot/share/prims/jvmtiGetLoadedClasses.cpp line 121: > 119: { > 120: // To get a consistent list of classes we need MultiArray_lock to ensure > 121: // array classes aren't created by another thread during this walk. This walks through the Thanks for clarifying, though I'm not sure why would care given the set of classes could have changed by the time we return them anyway. ------------- PR Review: https://git.openjdk.org/jdk/pull/17739#pullrequestreview-1924097971 PR Review Comment: https://git.openjdk.org/jdk/pull/17739#discussion_r1517209655 From amitkumar at openjdk.org Fri Mar 8 06:30:00 2024 From: amitkumar at openjdk.org (Amit Kumar) Date: Fri, 8 Mar 2024 06:30:00 GMT Subject: RFR: 8327652: S390x: Implements SLP support In-Reply-To: References: Message-ID: <89gCu4oPhDKz2-GfNWeWtJ0vXyTkmHgXhv6mVX6C72Q=.f706431a-83c4-4cce-8e1a-214779d49985@github.com> On Fri, 8 Mar 2024 05:05:31 GMT, Sidraya Jayagond wrote: > This PR Adds SIMD support on s390x. Changes requested by amitkumar (Committer). src/hotspot/cpu/s390/assembler_s390.inline.hpp line 952: > 950: inline void Assembler::z_vaccq( VectorRegister v1, VectorRegister v2, VectorRegister v3) {z_vacc(v1, v2, v3, VRET_QW); } // vector element type 'Q' > 951: // > 952: Suggestion: src/hotspot/cpu/s390/s390.ad line 357: > 355: > 356: > 357: Suggestion: src/hotspot/cpu/s390/s390.ad line 760: > 758: > 759: > 760: Suggestion: src/hotspot/cpu/s390/s390.ad line 1209: > 1207: > 1208: // Between float regs & stack are the flags regs. 
> 1209: //assert(OptoReg::is_stack(reg) || reg < 64+64+128, "blow up if spilling flags"); Suggestion: src/hotspot/cpu/s390/s390.ad line 1296: > 1294: C2_MacroAssembler _masm(cbuf); > 1295: __ z_vst(Rsrc, > 1296: Address(Z_SP, 0, dst_offset)); Suggestion: __ z_vst(Rsrc, Address(Z_SP, 0, dst_offset)); src/hotspot/cpu/s390/s390.ad line 1304: > 1302: C2_MacroAssembler _masm(cbuf); > 1303: __ z_vl(Rdst, > 1304: Address(Z_SP, 0, src_offset)); Suggestion: __ z_vl(Rdst, Address(Z_SP, 0, src_offset)); src/hotspot/cpu/s390/s390.ad line 11185: > 11183: ins_pipe(pipe_class_dummy); > 11184: %} > 11185: instruct vsub4I_reg(vecX dst, vecX src1, vecX src2) %{ Suggestion: %} instruct vsub4I_reg(vecX dst, vecX src1, vecX src2) %{ src/hotspot/cpu/s390/sharedRuntime_s390.cpp line 485: > 483: offset += reg_size; > 484: } > 485: assert(offset == frame_size_in_bytes, "consistency check"); Suggestion: #ifdef ASSERT assert(offset == frame_size_in_bytes, "consistency check"); #endif // ASSERT src/hotspot/cpu/s390/vm_version_s390.cpp line 108: > 106: FLAG_SET_ERGO(UseSFPV, true); > 107: } else { > 108: if (model_ix == 7 && UseSFPV) { use else-if ? src/hotspot/cpu/s390/vm_version_s390.cpp line 114: > 112: } > 113: } else { > 114: if (SuperwordUseVX) { use else if ? src/hotspot/share/adlc/output_c.cpp line 2366: > 2364: if (strcmp(rep_var,"$VectorSRegister") == 0) return "as_VectorSRegister"; > 2365: #endif > 2366: #if defined(S390) can't we use `#elif` here ? src/hotspot/share/opto/machnode.hpp line 146: > 144: } > 145: #endif > 146: #if defined(AARCH64) can't we use #elif here ? 
------------- PR Review: https://git.openjdk.org/jdk/pull/18162#pullrequestreview-1924049036 PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1517180299 PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1517182113 PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1517182237 PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1517182442 PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1517178059 PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1517178206 PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1517183239 PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1517185503 PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1517214404 PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1517214506 PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1517188859 PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1517189185 From fyang at openjdk.org Fri Mar 8 07:53:02 2024 From: fyang at openjdk.org (Fei Yang) Date: Fri, 8 Mar 2024 07:53:02 GMT Subject: RFR: 8320646: RISC-V: C2 VectorCastHF2F [v7] In-Reply-To: References: Message-ID: On Thu, 7 Mar 2024 12:20:40 GMT, Hamlin Li wrote: >> Seems not necessary currently, unless in the future there is other extension starting with `zfh`, e.g. `zfhX`, but seems that chance is low IMHO. >> And, I think it might not work for the last extension in the CPU feature string if it's with a comma (as the cpu feature string is converted from a list to string, so the last one will not ending with a`,` ? ), so for new and long named extensions, it should be without a comma. > > Thanks for testing! > Seems not necessary currently, unless in the future there is other extension starting with `zfh`, e.g. `zfhX`, but seems that chance is low IMHO. 
And, I think it might not work for the last extension in the CPU feature string if it's with a comma (as the cpu feature string is converted from a list to string, so the last one will not ending with a`,` ? ), so for new and long named extensions, it should be without a comma.

Yeah, that makes sense to me. I think we should remove this comma from conditions `(os.arch == "riscv64" & vm.cpu.features ~= ".*zfh,.*")` on jdk master. Can you fix that? Then we will have a simpler rule when writing those conditions. That is, for single-character extensions like `c` or `v`, we need to add this comma. But for multi-character extensions like `zfh` or `zvfh`, we should not have it to avoid the issue you mentioned.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/17698#discussion_r1517349407

From ihse at openjdk.org Fri Mar 8 08:10:53 2024
From: ihse at openjdk.org (Magnus Ihse Bursie)
Date: Fri, 8 Mar 2024 08:10:53 GMT
Subject: RFR: 8327460: Compile tests with the same visibility rules as product code [v2]
In-Reply-To:
References:
Message-ID:

On Wed, 6 Mar 2024 13:53:06 GMT, Magnus Ihse Bursie wrote:

> I am currently running tier 4-10

This is now done. A handful of tests failed, but all are due to transient environmental problems, known errors, or seemingly intermittent product errors -- none are due to symbol visibility. So I'd argue that this is one of the most well-tested PRs posted ever. :-)

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18135#issuecomment-1985232111

From jzhu at openjdk.org Fri Mar 8 08:26:52 2024
From: jzhu at openjdk.org (Joshua Zhu)
Date: Fri, 8 Mar 2024 08:26:52 GMT
Subject: RFR: 8326541: [AArch64] ZGC C2 load barrier stub considers the length of live registers when spilling registers
In-Reply-To:
References:
Message-ID:

On Tue, 5 Mar 2024 16:52:02 GMT, Stuart Monteith wrote:

> Thanks, that helps - I can see you're saving/restoring the correct register lengths.
Would it be possible to generate a testcase to test that registers are being saved/restored correctly? @stooart-mon I had previously thought about how to write a good test case, but I did not think of a good way at that time. Let me rethink how to handle this gracefully. Thanks :-) ------------- PR Comment: https://git.openjdk.org/jdk/pull/17977#issuecomment-1985252710 From sroy at openjdk.org Fri Mar 8 09:07:23 2024 From: sroy at openjdk.org (Suchismith Roy) Date: Fri, 8 Mar 2024 09:07:23 GMT Subject: RFR: JDK-8324682 Remove redefinition of NULL for XLC compiler [v6] In-Reply-To: References: Message-ID: > Remove redefinition of NULL for XLC compiler > In globalDefinitions_xlc.hpp there is a redefinition of NULL to work around an issue with one of the AIX headers (). > Once all uses of NULL in HotSpot have been replaced with nullptr, there is no longer any need for this, and it can be removed. > > > JBS ISSUE : [JDK-8324682](https://bugs.openjdk.org/browse/JDK-8324682) Suchismith Roy has updated the pull request incrementally with one additional commit since the last revision: update copyright ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18064/files - new: https://git.openjdk.org/jdk/pull/18064/files/ec40c344..cc5f5bca Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18064&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18064&range=04-05 Stats: 2 lines in 1 file changed: 0 ins; 1 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18064.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18064/head:pull/18064 PR: https://git.openjdk.org/jdk/pull/18064 From sroy at openjdk.org Fri Mar 8 09:07:23 2024 From: sroy at openjdk.org (Suchismith Roy) Date: Fri, 8 Mar 2024 09:07:23 GMT Subject: RFR: JDK-8324682 Remove redefinition of NULL for XLC compiler [v5] In-Reply-To: <75NwTbUgqmzlTev8WYb24a6b8-_Rol6rj18WJEhv2bc=.872bca0d-3597-4511-87c1-d3278856e455@github.com> References: 
<75NwTbUgqmzlTev8WYb24a6b8-_Rol6rj18WJEhv2bc=.872bca0d-3597-4511-87c1-d3278856e455@github.com> Message-ID: On Thu, 7 Mar 2024 10:57:25 GMT, Christoph Langer wrote: >> Suchismith Roy has updated the pull request incrementally with one additional commit since the last revision: >> >> update comment and copyright > > src/hotspot/share/utilities/globalDefinitions_xlc.hpp line 2: > >> 1: /* >> 2: * Copyright (c) 1998, 2023, Oracle and/or its affiliates. All rights reserved. > > You need to update the year for the Oracle copyright. Adding additional copyright lines, such as IBM, is usually only done when you substantially change a file, i.e. add larger non-trivial portions of code. Ok. Thanks will keep that in mind. Updated the copyrights year. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18064#discussion_r1517436048 From fthevenet at openjdk.org Fri Mar 8 10:43:54 2024 From: fthevenet at openjdk.org (Frederic Thevenet) Date: Fri, 8 Mar 2024 10:43:54 GMT Subject: RFR: 8327389: Remove use of HOTSPOT_BUILD_USER In-Reply-To: References: Message-ID: On Thu, 7 Mar 2024 19:53:35 GMT, Andrew John Hughes wrote: >>> Also, it might be worth repeating one of my long-standing wishes: that the version string should not be hard-coded into the build, but e.g. stored as a string in the `release` file, and read from there. If we did that, the cost of changing the version string would be negligible, and we wouldn't need to worry as much about it. It would also be simple to compare different builds which end up with the same bits since they are built from the same sources, but by different version flags (e.g. -ea vs GA). (In fact, we'd turn a -ea build into a GA just by updating the version string, so we'd know for sure we are publishing what we tested.) >> >> +1 >> This would be a great step in making comparability (beyond reproducibility) of builds of OpenJDK at lot simpler. 
> >> Also, it might be worth repeating one of my long-standing wishes: that the version string should not be hard-coded into the build, but e.g. stored as a string in the `release` file, and read from there. If we did that, the cost of changing the version string would be negligible, and we wouldn't need to worry as much about it. It would also be simple to compare different builds which end up with the same bits since they are built from the same sources, but by different version flags (e.g. -ea vs GA). (In fact, we'd turn a -ea build into a GA just by updating the version string, so we'd know for sure we are publishing what we tested.) > > This certainly sounds like it has the potential to solve a lot of these kind of problems. I would point out that, if you can flip the EA status in the text file, someone could also easily masquerade a build as something completely different from what it is. However, it is already possible to create a build like this via the `configure` options so it's really only a slight change in accessibility. > > For example, if I specify `--with-version-feature=11`, I can produce: > > ~~~ > $ /home/andrew/builder/fake11/images/jdk/bin/java -version > openjdk version "11-internal" 2024-09-17 > OpenJDK Runtime Environment (fastdebug build 11-internal-adhoc.andrew.jdk) > OpenJDK 64-Bit Server VM (fastdebug build 11-internal-adhoc.andrew.jdk, mixed mode, sharing) > > $ /home/andrew/builder/fake11/images/jdk/bin/java -Xinternalversion > OpenJDK 64-Bit Server VM (fastdebug 11-internal-adhoc.andrew.jdk) for linux-amd64 JRE (11-internal-adhoc.andrew.jdk), built on 2024-03-07T19:41:08Z with gcc 13.2.1 20230826 > ~~~ > > despite the fact that the source code I've built is actually an in-development JDK 23. All we'd be doing is moving that into a text file. I agree with @gnu-andrew : the version string reported by the JVM, as it stands today, is only really useful if taken in good faith. 
If we wanted - or needed - to make it tamper-proof from an adversarial threat model standpoint, then we couldn't avoid the need for a cryptographically strong solution (which by design would render any direct comparison impossible). Right now, it feels like we are kind of stuck somewhere in-between, incurring all of the drawbacks and none of the benefits.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18136#issuecomment-1985459372

From mli at openjdk.org Fri Mar 8 12:00:59 2024
From: mli at openjdk.org (Hamlin Li)
Date: Fri, 8 Mar 2024 12:00:59 GMT
Subject: RFR: 8320646: RISC-V: C2 VectorCastHF2F [v7]
In-Reply-To:
References:
Message-ID: <6rbPSDFA5XRjT77h0g9tf6XsrM_1v3yi4s5472fXdAU=.3048fc76-b93c-4f68-9a62-17ddba2882e6@github.com>

On Fri, 8 Mar 2024 07:50:29 GMT, Fei Yang wrote:

>> Thanks for testing!
>
>> Seems not necessary currently, unless in the future there is other extension starting with `zfh`, e.g. `zfhX`, but seems that chance is low IMHO. And, I think it might not work for the last extension in the CPU feature string if it's with a comma (as the cpu feature string is converted from a list to string, so the last one will not ending with a`,` ? ), so for new and long named extensions, it should be without a comma.
>
> Yeah, that makes sense to me. I think we should remove this comma from conditions `(os.arch == "riscv64" & vm.cpu.features ~= ".*zfh,.*")` on jdk master. Can you fix that? Then we will have a simpler rule when writing those conditions. That is, for single-character extensions like `c` or `v`, we need to add this comma. But for multi-character extensions like `zfh` or `zvfh`, we should not have it to avoid the issue you mentioned.

Sure, I created https://bugs.openjdk.org/browse/JDK-8327689 to track it.
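
[Editorial sketch, not part of the original message.] The comma rule discussed above can be illustrated with plain `java.util.regex`. This is only a minimal sketch under stated assumptions: the feature strings are hypothetical examples rendered with `List.toString()` (the thread says the string is converted from a list), and `~=` is assumed to behave like a full-string match. The class name `FeatureMatch` and its helper are made up for illustration.

```java
import java.util.List;
import java.util.regex.Pattern;

public class FeatureMatch {
    // Assumption: `~=` in an @requires expression acts like a full-string
    // regex match, which is why the patterns carry ".*" on both sides.
    static boolean matches(String regex, String features) {
        return Pattern.matches(regex, features);
    }

    public static void main(String[] args) {
        // Hypothetical feature lists; real values depend on the hardware.
        String zfhLast = List.of("rv64", "i", "m", "a", "f", "d", "c", "v", "zfh").toString();
        String zfhMiddle = List.of("rv64", "i", "m", "a", "f", "d", "c", "zfh", "zvfh").toString();
        String noVector = List.of("rv64", "i", "m", "a", "f", "d", "c").toString();

        // With the trailing comma, zfh is found only when another entry follows it.
        System.out.println(matches(".*zfh,.*", zfhMiddle)); // true
        System.out.println(matches(".*zfh,.*", zfhLast));   // false
        // Without the comma, the last-entry case is found as well.
        System.out.println(matches(".*zfh.*", zfhLast));    // true

        // Single-character names need the comma: a bare "v" also hits "rv64".
        System.out.println(matches(".*v.*", noVector));     // true (false positive)
        System.out.println(matches(".*v,.*", noVector));    // false
    }
}
```

Under these assumptions the sketch shows both halves of the rule: the comma guards single-character names against accidental substring hits, while for multi-character names it mainly risks missing an extension that happens to be last in the list.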
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17698#discussion_r1517626629 From coleenp at openjdk.org Fri Mar 8 13:31:02 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 8 Mar 2024 13:31:02 GMT Subject: RFR: 8308745: ObjArrayKlass::allocate_objArray_klass may call into java while holding a lock [v3] In-Reply-To: References: Message-ID: On Thu, 7 Mar 2024 13:21:09 GMT, Coleen Phillimore wrote: >> This change creates a new sort of native recursive lock that can be held during JNI and Java calls, which can be used for synchronization while creating objArrayKlasses at runtime. >> >> Passes tier1-7. > > Coleen Phillimore has updated the pull request incrementally with two additional commits since the last revision: > > - Missed a word. > - David's comment fixes. Thank you for reviewing, Dean, David and Fred. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17739#issuecomment-1985694821 From coleenp at openjdk.org Fri Mar 8 13:31:03 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 8 Mar 2024 13:31:03 GMT Subject: RFR: 8308745: ObjArrayKlass::allocate_objArray_klass may call into java while holding a lock [v3] In-Reply-To: <1rt397QolNEaI0Eixy7hupwNKI2-7Ioz9KvmNCkRs5Y=.fb1b903e-fc48-4ea1-ac1e-4f55fed7f54e@github.com> References: <1rt397QolNEaI0Eixy7hupwNKI2-7Ioz9KvmNCkRs5Y=.fb1b903e-fc48-4ea1-ac1e-4f55fed7f54e@github.com> Message-ID: <2z1SHVNOF932nnHMAHkho9I5kQBT8ACK34-VxpbrmoQ=.2edd34b1-3249-4843-80c6-9b0052fcb28c@github.com> On Fri, 8 Mar 2024 06:17:06 GMT, David Holmes wrote: >> Coleen Phillimore has updated the pull request incrementally with two additional commits since the last revision: >> >> - Missed a word. >> - David's comment fixes. > > src/hotspot/share/prims/jvmtiGetLoadedClasses.cpp line 121: > >> 119: { >> 120: // To get a consistent list of classes we need MultiArray_lock to ensure >> 121: // array classes aren't created by another thread during this walk. 
This walks through the > > Thanks for clarifying, though I'm not sure why would care given the set of classes could have changed by the time we return them anyway. I think in this case, we might find an ObjArrayKlass without the mirror since the Klass is added to the higher_dimension links before the mirror is created. The lock keeps them both together. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17739#discussion_r1517716370 From coleenp at openjdk.org Fri Mar 8 13:31:03 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 8 Mar 2024 13:31:03 GMT Subject: Integrated: 8308745: ObjArrayKlass::allocate_objArray_klass may call into java while holding a lock In-Reply-To: References: Message-ID: <6P09sTiNw7qF79ZSIn6fOlMFmFMwMXDM_vNahXuTUew=.fc116878-87a3-4d9a-8bfa-8ca7cfc41632@github.com> On Tue, 6 Feb 2024 22:59:04 GMT, Coleen Phillimore wrote: > This change creates a new sort of native recursive lock that can be held during JNI and Java calls, which can be used for synchronization while creating objArrayKlasses at runtime. > > Passes tier1-7. This pull request has now been integrated. 
Changeset: 1877a487 Author: Coleen Phillimore URL: https://git.openjdk.org/jdk/commit/1877a4879598356b777742fe80bdd5fa77ca8e8d Stats: 165 lines in 11 files changed: 87 ins; 43 del; 35 mod 8308745: ObjArrayKlass::allocate_objArray_klass may call into java while holding a lock Reviewed-by: dlong, dholmes, fparain ------------- PR: https://git.openjdk.org/jdk/pull/17739 From fbredberg at openjdk.org Fri Mar 8 14:17:54 2024 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Fri, 8 Mar 2024 14:17:54 GMT Subject: RFR: 8314508: Improve how relativized pointers are printed by frame::describe [v2] In-Reply-To: References: Message-ID: <53LYpOs2rFf5LkZTXG_u8boy499KyYfbxmzRrl4iLdE=.b0eb2b64-9308-460f-8379-9b42b399ab58@github.com> On Thu, 7 Mar 2024 20:45:17 GMT, Coleen Phillimore wrote: >> Fredrik Bredberg has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: >> >> - Merge branch 'master' into 8314508_relativized_ptr_frame_describe >> - Small white space fix >> - 8314508: Improve how relativized pointers are printed by frame::describe > > src/hotspot/share/runtime/frame.cpp line 1634: > >> 1632: st->print_cr(" %s %s %s", spacer, spacer, fv.description); >> 1633: } else { >> 1634: if (*fv.description == '#' && isdigit(fv.description[1])) { > > Is it possible for fv.description to be null? its sort of unclear from the context. There are several pointer dereferences here. > > Can you add a comment about what the format of this is? What is fv.description supposed to be? # > > And fp.location is the relativized offset, if fp.description is NOT #? > > This really does need some comment. And maybe a local variable to say what *fv.location is. > > And after testing the # output, the code still has to see if fv.location is between -100 and 100? Should it not just check fp != nullptr ? 
@coleenp I'll try to answer as well as I can.

**Q1:** Is it possible for `fv.description` to be null? its sort of unclear from the context. There are several pointer dereferences here.

**A1:** If `fv.description` ever had been null (before any of my changes) it would have led to a crash, so in my world it is never null.

**Q2:** Can you add a comment about what the format of this is? What is `fv.description` supposed to be? #

**A2:** If you look at the stack frame in the description of this PR you'll see the contents of the `fv.description` after the `
: ` columns. The only `fv.description` string starting with a '#' is the line for the saved frame pointer. So, `#10 method java.lang.invoke.LambdaForm...` basically means frame 10. This is also how I find the frame pointer.

**Q3:** This really does need some comment. And maybe a local variable to say what `*fv.location` is.

**A3:** `fv.location` contains the `
:` columns value and `*fv.location` contains the `` columns value. However, if the value at an address is relativized like 0x000000409e14cc20 in the example, the real content of address 0x000000409e14cc20 is 0x0000000000000003. So instead of showing the contents as 3 (which we used to do before my patch) we show the real derelativized pointer value which is 0x000000409e14cc88 and also add the text `(relativized: fp+3)` to try to explain to anyone that's confused by the fact that the real content of the stack address is 3.

**Q4:** And after testing the # output, the code still has to see if fv.location is between -100 and 100? Should it not just check `fp != nullptr` ?

**A4:** No. In order for us to believe that `*fv.location` is a fp-relative value it must be a reasonably small positive/negative number. Otherwise we would print all large pointer values as fp-relative. The oddball situation is that I've found frame pointer addresses pointing to a content which is within -100 and 100, with a `fv.description` string starting with a '#'. And those are not relativized values, and therefore should not be printed as such. Hence the `*fv.description != '#'` condition.

Hope this sheds some light on this code.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18102#discussion_r1517771365

From ihse at openjdk.org Fri Mar 8 15:37:00 2024
From: ihse at openjdk.org (Magnus Ihse Bursie)
Date: Fri, 8 Mar 2024 15:37:00 GMT
Subject: RFR: 8327701: Remove the xlc toolchain
Message-ID:

As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. Thus the old xlc toolchain is no longer supported, and should be removed.
------------- Commit messages: - 8327701: Remove the xlc toolchain Changes: https://git.openjdk.org/jdk/pull/18172/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18172&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8327701 Stats: 333 lines in 19 files changed: 25 ins; 267 del; 41 mod Patch: https://git.openjdk.org/jdk/pull/18172.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18172/head:pull/18172 PR: https://git.openjdk.org/jdk/pull/18172 From ihse at openjdk.org Fri Mar 8 15:37:02 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Fri, 8 Mar 2024 15:37:02 GMT Subject: RFR: 8327701: Remove the xlc toolchain In-Reply-To: References: Message-ID: On Fri, 8 Mar 2024 15:29:58 GMT, Magnus Ihse Bursie wrote: > As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. Thus the old xlc toolchain is no longer supported, and should be removed. make/autoconf/flags-cflags.m4 line 687: > 685: PICFLAG="-fPIC" > 686: PIEFLAG="-fPIE" > 687: elif test "x$TOOLCHAIN_TYPE" = xclang && test "x$OPENJDK_TARGET_OS" = xaix; then Just a remark: This code has never been executed, since running with clang on any OS would hit the branch above. Also, the code is syntactically incorrect, missing a trailing `"`. make/autoconf/flags-cflags.m4 line 695: > 693: -D$FLAGS_CPU_LEGACY" > 694: > 695: if test "x$FLAGS_CPU_BITS" = x64; then A wise man said: "If the code and the comments contradict each other, then probably both are wrong". I am here assuming that the comment claiming that this is only needed by xlc is correct. If it turns out that this is needed even with clang on AIX, we'll have to restore the test and update the comment to this fact. 
make/autoconf/flags.m4 line 324: > 322: AC_DEFUN([FLAGS_SETUP_TOOLCHAIN_CONTROL], > 323: [ > 324: # COMPILER_TARGET_BITS_FLAG : option for selecting 32- or 64-bit output COMPILER_TARGET_BITS_FLAG and COMPILER_BINDCMD_FILE_FLAG were introduced just for xlc, so they are not needed anymore. COMPILER_COMMAND_FILE_FLAG was introduced to support ancient versions of gcc, and then used by xlc as well. It is not needed for gcc anymore, so I remove it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18172#discussion_r1517871818 PR Review Comment: https://git.openjdk.org/jdk/pull/18172#discussion_r1517873282 PR Review Comment: https://git.openjdk.org/jdk/pull/18172#discussion_r1517874782 From ihse at openjdk.org Fri Mar 8 15:40:57 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Fri, 8 Mar 2024 15:40:57 GMT Subject: RFR: 8327701: Remove the xlc toolchain In-Reply-To: References: Message-ID: On Fri, 8 Mar 2024 15:29:58 GMT, Magnus Ihse Bursie wrote: > As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. Thus the old xlc toolchain is no longer supported, and should be removed. make/autoconf/toolchain.m4 line 420: > 418: # Remove "Thread model:" and further details from the version string, and > 419: # collapse into a single line > 420: COMPILER_VERSION_STRING=`$ECHO $COMPILER_VERSION_OUTPUT | \ These changes are not strictly needed, but they make printing the clang version nicer, and compensate for the removal of the extra version information that was printed by some of the removed code.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18172#discussion_r1517880241 From ihse at openjdk.org Fri Mar 8 15:48:08 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Fri, 8 Mar 2024 15:48:08 GMT Subject: RFR: 8327701: Remove the xlc toolchain [v2] In-Reply-To: References: Message-ID: > As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. Thus the old xlc toolchain is no longer supported, and should be removed. Magnus Ihse Bursie has updated the pull request incrementally with one additional commit since the last revision: Revert SEARCH_PATH changes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18172/files - new: https://git.openjdk.org/jdk/pull/18172/files/9f4a059d..53a05019 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18172&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18172&range=00-01 Stats: 7 lines in 1 file changed: 0 ins; 5 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/18172.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18172/head:pull/18172 PR: https://git.openjdk.org/jdk/pull/18172 From ihse at openjdk.org Fri Mar 8 15:48:08 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Fri, 8 Mar 2024 15:48:08 GMT Subject: RFR: 8327701: Remove the xlc toolchain [v2] In-Reply-To: References: Message-ID: On Fri, 8 Mar 2024 15:44:48 GMT, Magnus Ihse Bursie wrote: >> As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. Thus the old xlc toolchain is no longer supported, and should be removed. 
> > Magnus Ihse Bursie has updated the pull request incrementally with one additional commit since the last revision: > > Revert SEARCH_PATH changes make/autoconf/toolchain.m4 line 444: > 442: COMPILER_NAME=$2 > 443: SEARCH_LIST="$3" > 444: SEARCH_PATH="$PATH" I am not 100% sure about this; I think the intention of some of the old code amounted to this, but it was not entirely clear. But, then again, I think this is a good idea, and I think we should do this on *all* platforms -- search the toolchain path before the normal path. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18172#discussion_r1517884839 From ihse at openjdk.org Fri Mar 8 15:48:08 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Fri, 8 Mar 2024 15:48:08 GMT Subject: RFR: 8327701: Remove the xlc toolchain [v2] In-Reply-To: References: Message-ID: On Fri, 8 Mar 2024 15:41:16 GMT, Magnus Ihse Bursie wrote: >> Magnus Ihse Bursie has updated the pull request incrementally with one additional commit since the last revision: >> >> Revert SEARCH_PATH changes > > make/autoconf/toolchain.m4 line 444: > >> 442: COMPILER_NAME=$2 >> 443: SEARCH_LIST="$3" >> 444: SEARCH_PATH="$PATH" > > I am not 100% sure about this; I think the intention of some of the old code amounted to this, but it was not entirely clear. > > But, then again, I think this is a good idea, and I think we should do this on *all* platforms -- search the toolchain path before the normal path. Hm, as I write this, it strikes me as odd that we should not do this already. And of course we do, in `TOOLCHAIN_PRE_DETECTION`, so this snippet is actually redundant.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18172#discussion_r1517889315 From ihse at openjdk.org Fri Mar 8 15:55:54 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Fri, 8 Mar 2024 15:55:54 GMT Subject: RFR: 8327701: Remove the xlc toolchain [v2] In-Reply-To: References: Message-ID: On Fri, 8 Mar 2024 15:48:08 GMT, Magnus Ihse Bursie wrote: >> As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. Thus the old xlc toolchain is no longer supported, and should be removed. > > Magnus Ihse Bursie has updated the pull request incrementally with one additional commit since the last revision: > > Revert SEARCH_PATH changes Also, I believe that the `HOTSPOT_TOOLCHAIN_TYPE=xlc` quirk actually is bad. This means that the clang functionality in `compilerWarnings_gcc.hpp` (where the `_gcc` is hotspot-speak for "clang or gcc") is being ignored, and it means that globalDefinitions_xlc.hpp is in large part a direct copy of globalDefinitions_gcc.hpp, but apparently lagging in some fixes that have been made in that file. And it means a lot of lines like this: #if defined(TARGET_COMPILER_gcc) || defined(TARGET_COMPILER_xlc) But cleaning that up is left as an exercise to the AIX team; my goal here is primarily to get rid of the old xlc stuff from the build system.
------------- PR Comment: https://git.openjdk.org/jdk/pull/18172#issuecomment-1985939418 From ihse at openjdk.org Fri Mar 8 16:06:59 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Fri, 8 Mar 2024 16:06:59 GMT Subject: RFR: 8327389: Remove use of HOTSPOT_BUILD_USER In-Reply-To: References: Message-ID: On Wed, 6 Mar 2024 11:57:27 GMT, Andrew John Hughes wrote: > The HotSpot code has inserted the value of `HOTSPOT_BUILD_USER` into the internal version string since before the JDK was open-sourced, so the reasoning for this is not in the public history. See https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/abstract_vm_version.cpp#L284 > > ~~~ > $ /usr/lib/jvm/java-21-openjdk/bin/java -Xinternalversion > OpenJDK 64-Bit Server VM (21.0.2+13) for linux-amd64 JRE (21.0.2+13), built on 2024-01-19T16:39:52Z by "mockbuild" with gcc 8.5.0 20210514 (Red Hat 8.5.0-20) > ~~~ > > It is not clear what value this provides. By default, the value is set to the system username, but it can be set to a static value using `--with-build-user`. > > In trying to compare builds, this is a source of difference if the builds are produced under different usernames and `--with-build-user` is not set to the same value. > > It may also be a source of information leakage from the host system. It is not clear what value it provides to the end user of the build. > > This patch proposes removing it altogether, but at least knowing why it needs to be there would be progress from where we are now. We could also consider a middle ground of only setting it for "adhoc" builds, those without set version information, as is the case with the overall version output. 
> > With this patch: > > ~~~ > $ ~/builder/23/images/jdk/bin/java -Xinternalversion > OpenJDK 64-Bit Server VM (fastdebug 23-internal-adhoc.andrew.jdk) for linux-amd64 JRE (23-internal-adhoc.andrew.jdk), built on 2024-03-06T00:23:37Z with gcc 13.2.1 20230826 > > $ ~/builder/23/images/jdk/bin/java -XX:ErrorHandlerTest=1 > > $ grep '^vm_info' /localhome/andrew/projects/openjdk/upstream/jdk/hs_err_pid1119824.log > vm_info: OpenJDK 64-Bit Server VM (fastdebug 23-internal-adhoc.andrew.jdk) for linux-amd64 JRE (23-internal-adhoc.andrew.jdk), built on 2024-03-06T00:23:37Z with gcc 13.2.1 20230826 > ~~~ > > The above are from a fastdebug build but I also built a product build with no issues. Marked as reviewed by ihse (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18136#pullrequestreview-1925213033 From ihse at openjdk.org Fri Mar 8 16:09:53 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Fri, 8 Mar 2024 16:09:53 GMT Subject: RFR: 8327389: Remove use of HOTSPOT_BUILD_USER In-Reply-To: References: Message-ID: On Wed, 6 Mar 2024 11:57:27 GMT, Andrew John Hughes wrote: > The HotSpot code has inserted the value of `HOTSPOT_BUILD_USER` into the internal version string since before the JDK was open-sourced, so the reasoning for this is not in the public history. See https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/abstract_vm_version.cpp#L284 > > ~~~ > $ /usr/lib/jvm/java-21-openjdk/bin/java -Xinternalversion > OpenJDK 64-Bit Server VM (21.0.2+13) for linux-amd64 JRE (21.0.2+13), built on 2024-01-19T16:39:52Z by "mockbuild" with gcc 8.5.0 20210514 (Red Hat 8.5.0-20) > ~~~ > > It is not clear what value this provides. By default, the value is set to the system username, but it can be set to a static value using `--with-build-user`. > > In trying to compare builds, this is a source of difference if the builds are produced under different usernames and `--with-build-user` is not set to the same value. 
> > It may also be a source of information leakage from the host system. It is not clear what value it provides to the end user of the build. > > This patch proposes removing it altogether, but at least knowing why it needs to be there would be progress from where we are now. We could also consider a middle ground of only setting it for "adhoc" builds, those without set version information, as is the case with the overall version output. > > With this patch: > > ~~~ > $ ~/builder/23/images/jdk/bin/java -Xinternalversion > OpenJDK 64-Bit Server VM (fastdebug 23-internal-adhoc.andrew.jdk) for linux-amd64 JRE (23-internal-adhoc.andrew.jdk), built on 2024-03-06T00:23:37Z with gcc 13.2.1 20230826 > > $ ~/builder/23/images/jdk/bin/java -XX:ErrorHandlerTest=1 > > $ grep '^vm_info' /localhome/andrew/projects/openjdk/upstream/jdk/hs_err_pid1119824.log > vm_info: OpenJDK 64-Bit Server VM (fastdebug 23-internal-adhoc.andrew.jdk) for linux-amd64 JRE (23-internal-adhoc.andrew.jdk), built on 2024-03-06T00:23:37Z with gcc 13.2.1 20230826 > ~~~ > > The above are from a fastdebug build but I also built a product build with no issues. This bug spurred several interesting discussions: 1) Should we store the version string in a text file instead of compiling it into a binary library? 2) Should we store `git describe` information in the version string? I think both are worth discussing (and likely worth implementing, too), but it also seems out of scope for this PR. My main concern here was that tests were looking at the internal string format, but if that is not the case then I see no reason to keep a dummy "unknown" value -- better to get rid of it entirely. So I'd say, just go ahead and integrate this. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18136#issuecomment-1985965596 From jwaters at openjdk.org Fri Mar 8 16:19:53 2024 From: jwaters at openjdk.org (Julian Waters) Date: Fri, 8 Mar 2024 16:19:53 GMT Subject: RFR: 8327701: Remove the xlc toolchain [v2] In-Reply-To: References: Message-ID: On Fri, 8 Mar 2024 15:48:08 GMT, Magnus Ihse Bursie wrote: >> As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. Thus the old xlc toolchain is no longer supported, and should be removed. > > Magnus Ihse Bursie has updated the pull request incrementally with one additional commit since the last revision: > > Revert SEARCH_PATH changes Build changes look ok, but I'm not an AIX developer ------------- Marked as reviewed by jwaters (Committer). PR Review: https://git.openjdk.org/jdk/pull/18172#pullrequestreview-1925243852 From matsaave at openjdk.org Fri Mar 8 16:51:10 2024 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Fri, 8 Mar 2024 16:51:10 GMT Subject: RFR: 8327093: Add truncate function to BitMap API [v4] In-Reply-To: References: Message-ID: > The BitMap does not currently have a "slice" method which can return a subset of the bitmap defined by a start and end bit. This utility can be used to analyze portions of a bitmap or to overwrite the existing map with a shorter, modified map. > > This implementation has two core methods: > `copy_of_range` allocates and returns a new word array as used internally by BitMap > `truncate` overwrites the internal array with the one generated by `copy_of_range` so none of the internal workings are exposed to callers outside of BitMap. > > This change is verified with an included test. 
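To make the two methods in the description above concrete, here is a minimal sketch of how a range copy and an in-place truncate might fit together. This is not the HotSpot `BitMap` code: the class name `SimpleBitMap`, its `std::vector<uint64_t>` storage, and the exact signatures are assumptions made purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a bitmap class; NOT the HotSpot BitMap API.
class SimpleBitMap {
  std::vector<uint64_t> _words;  // backing storage, 64 bits per word
  size_t _size;                  // number of valid bits

  static size_t words_for(size_t bits) { return (bits + 63) / 64; }

  // Build a new word array holding bits [start, end), shifted down so that
  // the old bit `start` becomes bit 0. Kept private so callers never see
  // the internal word layout.
  std::vector<uint64_t> copy_of_range(size_t start, size_t end) const {
    assert(start <= end && end <= _size);
    const size_t new_bits = end - start;
    std::vector<uint64_t> out(words_for(new_bits), 0);
    const size_t word_shift = start / 64;
    const size_t bit_shift = start % 64;
    for (size_t i = 0; i < out.size(); i++) {
      uint64_t lo = _words[word_shift + i] >> bit_shift;
      uint64_t hi = 0;
      // Pull in the low bits of the next source word when the range is
      // not word-aligned (a shift by 64 would be undefined, hence the check).
      if (bit_shift != 0 && word_shift + i + 1 < _words.size()) {
        hi = _words[word_shift + i + 1] << (64 - bit_shift);
      }
      out[i] = lo | hi;
    }
    // Clear any bits past the new logical size in the last word.
    const size_t tail = new_bits % 64;
    if (tail != 0) {
      out.back() &= (uint64_t(1) << tail) - 1;
    }
    return out;
  }

public:
  explicit SimpleBitMap(size_t bits) : _words(words_for(bits), 0), _size(bits) {}

  size_t size() const { return _size; }

  void set_bit(size_t i) {
    assert(i < _size);
    _words[i / 64] |= uint64_t(1) << (i % 64);
  }

  bool at(size_t i) const {
    assert(i < _size);
    return ((_words[i / 64] >> (i % 64)) & 1) != 0;
  }

  // Replace the contents with the sub-range [start, end).
  void truncate(size_t start, size_t end) {
    _words = copy_of_range(start, end);
    _size = end - start;
  }
};
```

In this sketch a caller would only ever use `truncate(start, end)`; keeping `copy_of_range` private mirrors the stated goal that none of the internal workings are exposed outside the bitmap class.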
Matias Saavedra Silva has updated the pull request incrementally with two additional commits since the last revision: - Fixed CV-qualifier style - Axel comments and new test case ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18092/files - new: https://git.openjdk.org/jdk/pull/18092/files/17ac26f2..f02e07fe Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18092&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18092&range=02-03 Stats: 51 lines in 2 files changed: 41 ins; 0 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/18092.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18092/head:pull/18092 PR: https://git.openjdk.org/jdk/pull/18092 From ihse at openjdk.org Fri Mar 8 17:01:54 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Fri, 8 Mar 2024 17:01:54 GMT Subject: RFR: 8327460: Compile tests with the same visibility rules as product code [v2] In-Reply-To: References: Message-ID: On Wed, 6 Mar 2024 13:43:00 GMT, Magnus Ihse Bursie wrote: >> Currently, our symbol visibility handling for tests is sloppy; we only handle it properly on Windows. We need to bring it up to the same levels as product code. This is a prerequisite for [JDK-8327045](https://bugs.openjdk.org/browse/JDK-8327045), which in turn is a building block for Hermetic Java. > > Magnus Ihse Bursie has updated the pull request incrementally with one additional commit since the last revision: > > Update line number for dereference_null in TestDwarf @JornVernee Are you okay with this solution? No JNI in foreign tests.
------------- PR Comment: https://git.openjdk.org/jdk/pull/18135#issuecomment-1986062755 From duke at openjdk.org Fri Mar 8 17:17:19 2024 From: duke at openjdk.org (Srinivas Vamsi Parasa) Date: Fri, 8 Mar 2024 17:17:19 GMT Subject: RFR: 8325991: Accelerate Poly1305 on x86_64 using AVX2 instructions [v12] In-Reply-To: References: Message-ID: > The goal of this PR is to accelerate the Poly1305 algorithm using AVX2 instructions (including IFMA) for x86_64 CPUs. > > This implementation is directly based on the AVX2 Poly1305 hash computation as implemented in Intel(R) Multi-Buffer Crypto for IPsec Library (url: https://github.com/intel/intel-ipsec-mb/blob/main/lib/avx2_t3/poly_fma_avx2.asm) > > This PR shows up to 19x speedup on buffer sizes of 1MB. Srinivas Vamsi Parasa has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 27 commits: - add missing avx_ifma in amd64.java - Merge branch 'master' of https://github.com/vamsi-parasa/jdk into jdk_poly - update asserts for vpmadd52l/hq - Update description of Poly1305 algo - add cpuinfo test for avx_ifma - fix checks for vpmadd52* - fix use_vl to true for vpmadd52* instrs - fix merge issues with avx_ifma - Merge branch 'master' of https://git.openjdk.java.net/jdk into jdk_poly - removed unused merge, faster and, redundant mov - ...
and 17 more: https://git.openjdk.org/jdk/compare/5aae8030...35d39dc5 ------------- Changes: https://git.openjdk.org/jdk/pull/17881/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17881&range=11 Stats: 810 lines in 10 files changed: 800 ins; 0 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/17881.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17881/head:pull/17881 PR: https://git.openjdk.org/jdk/pull/17881 From duke at openjdk.org Fri Mar 8 17:19:58 2024 From: duke at openjdk.org (Srinivas Vamsi Parasa) Date: Fri, 8 Mar 2024 17:19:58 GMT Subject: RFR: 8325991: Accelerate Poly1305 on x86_64 using AVX2 instructions [v11] In-Reply-To: References: Message-ID: On Mon, 4 Mar 2024 21:40:04 GMT, Srinivas Vamsi Parasa wrote: >> The goal of this PR is to accelerate the Poly1305 algorithm using AVX2 instructions (including IFMA) for x86_64 CPUs. >> >> This implementation is directly based on the AVX2 Poly1305 hash computation as implemented in Intel(R) Multi-Buffer Crypto for IPsec Library (url: https://github.com/intel/intel-ipsec-mb/blob/main/lib/avx2_t3/poly_fma_avx2.asm) >> >> This PR shows up to 19x speedup on buffer sizes of 1MB. > > Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: > > update asserts for vpmadd52l/hq Planning to integrate this PR by Monday. Could you please let me know if there are any objections?
------------- PR Comment: https://git.openjdk.org/jdk/pull/17881#issuecomment-1986094404 From jbhateja at openjdk.org Fri Mar 8 17:58:55 2024 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Fri, 8 Mar 2024 17:58:55 GMT Subject: RFR: 8325991: Accelerate Poly1305 on x86_64 using AVX2 instructions [v10] In-Reply-To: <_5z5emOe-VqjE7REHmk72wtJ-X_MUggxilrkXFUjdPo=.e30bafc3-0fc4-4872-a99c-f22e383301e3@github.com> References: <9_AS9fKLjyVQzybg1915BX3xVVFwJRKjsqghjZ99hr0=.13a55d69-fc86-4109-a3be-934d1b8634d0@github.com> <_6dorzq67KAZsTBHBvbQRDi_xW70bFhJudnxbG88m6I=.33e06bd5-d5fc-4ba8-b740-437155d567cf@github.com> <_5z5emOe-VqjE7REHmk72wtJ-X_MUggxilrkXFUjdPo=.e30bafc3-0fc4-4872-a99c-f22e383301e3@github.com> Message-ID: An HTML attachment was scrubbed... URL: From matsaave at openjdk.org Fri Mar 8 18:03:54 2024 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Fri, 8 Mar 2024 18:03:54 GMT Subject: RFR: 8327093: Add truncate function to BitMap API [v4] In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From jvernee at openjdk.org Fri Mar 8 18:15:53 2024 From: jvernee at openjdk.org (Jorn Vernee) Date: Fri, 8 Mar 2024 18:15:53 GMT Subject: RFR: 8327460: Compile tests with the same visibility rules as product code [v2] In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From erikj at openjdk.org Fri Mar 8 18:51:54 2024 From: erikj at openjdk.org (Erik Joelsson) Date: Fri, 8 Mar 2024 18:51:54 GMT Subject: RFR: 8327701: Remove the xlc toolchain [v2] In-Reply-To: References: Message-ID: On Fri, 8 Mar 2024 15:48:08 GMT, Magnus Ihse Bursie wrote: >> As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. Thus the old xlc toolchain is no longer supported, and should be removed. 
> > Magnus Ihse Bursie has updated the pull request incrementally with one additional commit since the last revision: > > Revert SEARCH_PATH changes Build changes look good to me, but they need review and verification from the AIX folks. ------------- Marked as reviewed by erikj (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18172#pullrequestreview-1925621925 From duke at openjdk.org Fri Mar 8 18:56:07 2024 From: duke at openjdk.org (Srinivas Vamsi Parasa) Date: Fri, 8 Mar 2024 18:56:07 GMT Subject: RFR: 8325991: Accelerate Poly1305 on x86_64 using AVX2 instructions [v13] In-Reply-To: References: Message-ID: > The goal of this PR is to accelerate the Poly1305 algorithm using AVX2 instructions (including IFMA) for x86_64 CPUs. > > This implementation is directly based on the AVX2 Poly1305 hash computation as implemented in Intel(R) Multi-Buffer Crypto for IPsec Library (url: https://github.com/intel/intel-ipsec-mb/blob/main/lib/avx2_t3/poly_fma_avx2.asm) > > This PR shows up to 19x speedup on buffer sizes of 1MB.
Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: make vpmadd52l/hq generic ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17881/files - new: https://git.openjdk.org/jdk/pull/17881/files/35d39dc5..4d3e0ebb Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17881&range=12 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17881&range=11-12 Stats: 14 lines in 1 file changed: 14 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/17881.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17881/head:pull/17881 PR: https://git.openjdk.org/jdk/pull/17881 From sspitsyn at openjdk.org Fri Mar 8 19:49:00 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 8 Mar 2024 19:49:00 GMT Subject: Integrated: 8256314: JVM TI GetCurrentContendedMonitor is implemented incorrectly In-Reply-To: <-Yf2Qj_kbyZAA-LDK96IIOcHWMh8__QT1XfnD9jR8kU=.507d5503-1544-44fb-9e81-438ba3458dcd@github.com> References: <-Yf2Qj_kbyZAA-LDK96IIOcHWMh8__QT1XfnD9jR8kU=.507d5503-1544-44fb-9e81-438ba3458dcd@github.com> Message-ID: On Wed, 21 Feb 2024 10:28:49 GMT, Serguei Spitsyn wrote: > The implementation of the JVM TI `GetCurrentContendedMonitor()` does not match the spec. It can sometimes return incorrect information about the contended monitor. Such behavior does not match the function spec. > With this update the `GetCurrentContendedMonitor()` is returning the monitor only when the specified thread is waiting to enter or re-enter the monitor, and the monitor is not returned when the specified thread is waiting in the `java.lang.Object.wait` to be notified. > > The implementations of the JDWP `ThreadReference.CurrentContendedMonitor` command and JDI `ThreadReference.currentContendedMonitor()` method are based on and depend on this JVMTI function. The JDWP command and the JDI method were specified incorrectly and had incorrect behavior. The fix slightly corrects both the JDWP and JDI specs.
The JDWP and JDI implementations have been also fixed because they use this JVM TI update. Please, see and review the related CSR and Release-Note. > > CSR: [8326024](https://bugs.openjdk.org/browse/JDK-8326024): JVM TI GetCurrentContendedMonitor is implemented incorrectly > RN: [8326038](https://bugs.openjdk.org/browse/JDK-8326038): Release Note: JVM TI GetCurrentContendedMonitor is implemented incorrectly > > Testing: > - tested with the mach5 tiers 1-6 This pull request has now been integrated. Changeset: 33aa4b26 Author: Serguei Spitsyn URL: https://git.openjdk.org/jdk/commit/33aa4b26b190927241a81fd1cba6fe262a5b1da0 Stats: 73 lines in 10 files changed: 41 ins; 15 del; 17 mod 8256314: JVM TI GetCurrentContendedMonitor is implemented incorrectly Reviewed-by: dholmes, lmesnik ------------- PR: https://git.openjdk.org/jdk/pull/17944 From daniel.daugherty at oracle.com Fri Mar 8 20:17:07 2024 From: daniel.daugherty at oracle.com (daniel.daugherty at oracle.com) Date: Fri, 8 Mar 2024 15:17:07 -0500 Subject: CFV: New JDK Reviewer: Matias Saveedra Silva In-Reply-To: <17ab75c7-f8a6-4b93-9388-272e0c33ba79@oracle.com> References: <17ab75c7-f8a6-4b93-9388-272e0c33ba79@oracle.com> Message-ID: <62315686-7bbc-4ce4-8b60-7a7d6cea6cfa@oracle.com> Vote: yes Dan On 3/6/24 4:10 PM, coleen.phillimore at oracle.com wrote: > I hereby nominate Matias Saveedra Silva (matsaave) to JDK Reviewer. > > Matias is a member of the HotSpot runtime team at Oracle. He has > rewritten how we store resolution of fields, methods and indy > expressions in the interpreter which will be helpful to the Valhalla > project. He is currently working on enhancements to Class Data Sharing > (CDS) and optimizations in the interpreter to support the Leyden > project. He has contributed over 30 changes [3] to the HotSpot JVM > over the past year, and lower-case-r reviewed many others in the > runtime area. > > Votes are due by March 20, 2024. 
> > Only current JDK Reviewers [1] are eligible to vote on this > nomination. Votes must be cast in the open by replying to this > mailing list. > > For Lazy Consensus voting instructions, see [2]. > > Coleen Phillimore > > [1] https://openjdk.org/census > [2] https://openjdk.org/projects/#reviewer-vote > [3] https://github.com/openjdk/jdk/commits?author=matsaave at openjdk.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas.stuefe at gmail.com Fri Mar 8 21:04:17 2024 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Fri, 8 Mar 2024 22:04:17 +0100 Subject: CFV: New JDK Reviewer: Matias Saveedra Silva In-Reply-To: <17ab75c7-f8a6-4b93-9388-272e0c33ba79@oracle.com> References: <17ab75c7-f8a6-4b93-9388-272e0c33ba79@oracle.com> Message-ID: Vote: yes On Wed, Mar 6, 2024 at 10:10 PM wrote: > I hereby nominate Matias Saveedra Silva (matsaave) to JDK Reviewer. > > Matias is a member of the HotSpot runtime team at Oracle. He has rewritten > how we store resolution of fields, methods and indy expressions in the > interpreter which will be helpful to the Valhalla project. He is currently > working on enhancements to Class Data Sharing (CDS) and optimizations in > the interpreter to support the Leyden project. He has contributed over 30 > changes [3] to the HotSpot JVM over the past year, and lower-case-r > reviewed many others in the runtime area. > > Votes are due by March 20, 2024. > > Only current JDK Reviewers [1] are eligible to vote on this nomination. > Votes must be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [2]. > > Coleen Phillimore > > [1] https://openjdk.org/census > [2] https://openjdk.org/projects/#reviewer-vote > [3] https://github.com/openjdk/jdk/commits?author=matsaave at openjdk.org > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From iklam at openjdk.org Sat Mar 9 00:50:17 2024 From: iklam at openjdk.org (Ioi Lam) Date: Sat, 9 Mar 2024 00:50:17 GMT Subject: RFR: 8327138: Clean up status management in cdsConfig.hpp and CDS.java [v5] In-Reply-To: References: Message-ID: <7zoQLBYjtLjHlhlme4yJ5F0zdpTuUSxMjfflSmSglY4=.a2849b60-b21f-4f42-a317-9cda62e16d9b@github.com> > A few clean ups: > > 1. Rename functions like `is_loading_full_module_graph()` to `is_using_full_module_graph()`. The meaning of "loading" is not clear: it might be interpreted as to cover only the period where the artifact is being loaded, but not the period after the artifact is completely loaded. However, the function is meant to cover both periods, so "using" is a more precise term. > > 2. The cumbersome sounding `disable_loading_full_module_graph()` is changed to `stop_using_full_module_graph()`, etc. > > 3. The status of `is_using_optimized_module_handling()` is moved from metaspaceShared.hpp to cdsConfig.hpp, to be consolidated with other types of CDS status. > > 4. The status of CDS was communicated to the Java class `jdk.internal.misc.CDS` by ad-hoc native methods. This is now changed to a single method, `CDS.getCDSConfigStatus()` that returns a bit field. That way we don't need to add a new native method for each type of status. > > 5. `CDS.isDumpingClassList()` was a misnomer. It's changed to `CDS.isLoggingLambdaFormInvokers()`. Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains five additional commits since the last revision: - Merge branch 'master' into 8327138-clean-up-cdsConfig-and-CDS-java - @dholmes-ora comments -- white spaces and copyright - more alignment - fixed alignments - 8327138: Clean up status management in cdsConfig.hpp and CDS.java ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18095/files - new: https://git.openjdk.org/jdk/pull/18095/files/c8f08f7f..25c19bb4 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18095&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18095&range=03-04 Stats: 82460 lines in 929 files changed: 7471 ins; 72858 del; 2131 mod Patch: https://git.openjdk.org/jdk/pull/18095.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18095/head:pull/18095 PR: https://git.openjdk.org/jdk/pull/18095 From iklam at openjdk.org Sat Mar 9 03:51:01 2024 From: iklam at openjdk.org (Ioi Lam) Date: Sat, 9 Mar 2024 03:51:01 GMT Subject: RFR: 8327138: Clean up status management in cdsConfig.hpp and CDS.java [v5] In-Reply-To: <6FOFK8n6A5Kqpsg6BbEKTaaqiQgzQKfKflbIopBkD8A=.b58cff05-5f6b-46f7-ab55-1075dad877e7@github.com> References: <17levG82EcbT5zXdxDurs1tNJ07sWRXxtZqBphDPGJE=.acf2e6c0-24fd-418d-ac03-96b2047c4837@github.com> <6FOFK8n6A5Kqpsg6BbEKTaaqiQgzQKfKflbIopBkD8A=.b58cff05-5f6b-46f7-ab55-1075dad877e7@github.com> Message-ID: On Thu, 7 Mar 2024 22:59:38 GMT, Calvin Cheung wrote: >> Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: >> >> - Merge branch 'master' into 8327138-clean-up-cdsConfig-and-CDS-java >> - @dholmes-ora comments -- white spaces and copyright >> - more alignment >> - fixed alignments >> - 8327138: Clean up status management in cdsConfig.hpp and CDS.java > > Marked as reviewed by ccheung (Reviewer). 
Thanks @calvinccheung @matias9927 @dholmes-ora for the review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18095#issuecomment-1986716664 From iklam at openjdk.org Sat Mar 9 03:51:01 2024 From: iklam at openjdk.org (Ioi Lam) Date: Sat, 9 Mar 2024 03:51:01 GMT Subject: Integrated: 8327138: Clean up status management in cdsConfig.hpp and CDS.java In-Reply-To: References: Message-ID: On Sat, 2 Mar 2024 01:18:06 GMT, Ioi Lam wrote: > A few clean ups: > > 1. Rename functions like `is_loading_full_module_graph()` to `is_using_full_module_graph()`. The meaning of "loading" is not clear: it might be interpreted as to cover only the period where the artifact is being loaded, but not the period after the artifact is completely loaded. However, the function is meant to cover both periods, so "using" is a more precise term. > > 2. The cumbersome sounding `disable_loading_full_module_graph()` is changed to `stop_using_full_module_graph()`, etc. > > 3. The status of `is_using_optimized_module_handling()` is moved from metaspaceShared.hpp to cdsConfig.hpp, to be consolidated with other types of CDS status. > > 4. The status of CDS was communicated to the Java class `jdk.internal.misc.CDS` by ad-hoc native methods. This is now changed to a single method, `CDS.getCDSConfigStatus()` that returns a bit field. That way we don't need to add a new native method for each type of status. > > 5. `CDS.isDumpingClassList()` was a misnomer. It's changed to `CDS.isLoggingLambdaFormInvokers()`. This pull request has now been integrated.
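Item 4 in the description quoted above replaces several ad-hoc status queries with a single call that returns a bit field. A minimal sketch of that pattern follows. The flag names, their values, the `CDSState` struct, and the `get_cds_config_status` function are all hypothetical; they are not taken from cdsConfig.hpp or CDS.java.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical status flags; names and values are illustrative only.
enum CDSStatusFlag : uint32_t {
  IS_DUMPING_ARCHIVE              = 1u << 0,
  IS_LOGGING_LAMBDA_FORM_INVOKERS = 1u << 1,
  IS_USING_ARCHIVE                = 1u << 2,
};

// Hypothetical snapshot of the VM-side configuration.
struct CDSState {
  bool dumping_archive;
  bool logging_lambda_form_invokers;
  bool using_archive;
};

// Pack every status into one integer, so that a single native entry point
// can serve all queries instead of one native method per status.
inline uint32_t get_cds_config_status(const CDSState& s) {
  uint32_t status = 0;
  if (s.dumping_archive)              status |= IS_DUMPING_ARCHIVE;
  if (s.logging_lambda_form_invokers) status |= IS_LOGGING_LAMBDA_FORM_INVOKERS;
  if (s.using_archive)                status |= IS_USING_ARCHIVE;
  return status;
}

// Caller-side helper: test one flag in the packed status word.
inline bool has_flag(uint32_t status, CDSStatusFlag flag) {
  return (status & flag) != 0;
}
```

With this shape, adding a new kind of status would only require a new flag constant and one more line in the packing function, rather than a new native method, which appears to be the motivation stated in item 4.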
Changeset: 761ed250 Author: Ioi Lam URL: https://git.openjdk.org/jdk/commit/761ed250ec4b0d92d091a0c316b6d5028986a019 Stats: 219 lines in 19 files changed: 58 ins; 58 del; 103 mod 8327138: Clean up status management in cdsConfig.hpp and CDS.java Reviewed-by: ccheung, matsaave ------------- PR: https://git.openjdk.org/jdk/pull/18095 From jbhateja at openjdk.org Sat Mar 9 07:13:55 2024 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Sat, 9 Mar 2024 07:13:55 GMT Subject: RFR: 8325991: Accelerate Poly1305 on x86_64 using AVX2 instructions [v13] In-Reply-To: References: Message-ID: On Fri, 8 Mar 2024 18:56:07 GMT, Srinivas Vamsi Parasa wrote: >> The goal of this PR is to accelerate the Poly1305 algorithm using AVX2 instructions (including IFMA) for x86_64 CPUs. >> >> This implementation is directly based on the AVX2 Poly1305 hash computation as implemented in Intel(R) Multi-Buffer Crypto for IPsec Library (url: https://github.com/intel/intel-ipsec-mb/blob/main/lib/avx2_t3/poly_fma_avx2.asm) >> >> This PR shows up to a 19x speedup on buffer sizes of 1MB. > > Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: > > make vpmadd52l/hq generic Marked as reviewed by jbhateja (Reviewer).
------------- PR Review: https://git.openjdk.org/jdk/pull/17881#pullrequestreview-1926118001 From jbhateja at openjdk.org Sat Mar 9 07:13:55 2024 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Sat, 9 Mar 2024 07:13:55 GMT Subject: RFR: 8325991: Accelerate Poly1305 on x86_64 using AVX2 instructions [v10] In-Reply-To: References: <9_AS9fKLjyVQzybg1915BX3xVVFwJRKjsqghjZ99hr0=.13a55d69-fc86-4109-a3be-934d1b8634d0@github.com> <_6dorzq67KAZsTBHBvbQRDi_xW70bFhJudnxbG88m6I=.33e06bd5-d5fc-4ba8-b740-437155d567cf@github.com> <_5z5emOe-VqjE7REHmk72wtJ-X_MUggxilrkXFUjdPo=.e30bafc3-0fc4-4872-a99c-f22e383301e3@github.com> Message-ID: On Fri, 8 Mar 2024 17:56:30 GMT, Jatin Bhateja wrote: > > [poly1305_spr_validation.patch](https://github.com/openjdk/jdk/files/14496404/poly1305_spr_validation.patch) > > Hi @vamsi-parasa , We do not want EVEX to VEX demotions for these newly added instruction on AVX512_IFMA targets since there are no VEX equivalent versions of these instructions, please pick the relevant fixes for assembler routines from my above patch. As @sviswa7 mentioned we should make these instruction generic. Thanks @vamsi-parasa ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17881#discussion_r1518500977 From andrew at openjdk.org Sat Mar 9 13:32:58 2024 From: andrew at openjdk.org (Andrew John Hughes) Date: Sat, 9 Mar 2024 13:32:58 GMT Subject: RFR: 8327389: Remove use of HOTSPOT_BUILD_USER In-Reply-To: References: Message-ID: On Wed, 6 Mar 2024 11:57:27 GMT, Andrew John Hughes wrote: > The HotSpot code has inserted the value of `HOTSPOT_BUILD_USER` into the internal version string since before the JDK was open-sourced, so the reasoning for this is not in the public history. 
See https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/abstract_vm_version.cpp#L284 > > ~~~ > $ /usr/lib/jvm/java-21-openjdk/bin/java -Xinternalversion > OpenJDK 64-Bit Server VM (21.0.2+13) for linux-amd64 JRE (21.0.2+13), built on 2024-01-19T16:39:52Z by "mockbuild" with gcc 8.5.0 20210514 (Red Hat 8.5.0-20) > ~~~ > > It is not clear what value this provides. By default, the value is set to the system username, but it can be set to a static value using `--with-build-user`. > > In trying to compare builds, this is a source of difference if the builds are produced under different usernames and `--with-build-user` is not set to the same value. > > It may also be a source of information leakage from the host system. It is not clear what value it provides to the end user of the build. > > This patch proposes removing it altogether, but at least knowing why it needs to be there would be progress from where we are now. We could also consider a middle ground of only setting it for "adhoc" builds, those without set version information, as is the case with the overall version output. > > With this patch: > > ~~~ > $ ~/builder/23/images/jdk/bin/java -Xinternalversion > OpenJDK 64-Bit Server VM (fastdebug 23-internal-adhoc.andrew.jdk) for linux-amd64 JRE (23-internal-adhoc.andrew.jdk), built on 2024-03-06T00:23:37Z with gcc 13.2.1 20230826 > > $ ~/builder/23/images/jdk/bin/java -XX:ErrorHandlerTest=1 > > $ grep '^vm_info' /localhome/andrew/projects/openjdk/upstream/jdk/hs_err_pid1119824.log > vm_info: OpenJDK 64-Bit Server VM (fastdebug 23-internal-adhoc.andrew.jdk) for linux-amd64 JRE (23-internal-adhoc.andrew.jdk), built on 2024-03-06T00:23:37Z with gcc 13.2.1 20230826 > ~~~ > > The above are from a fastdebug build but I also built a product build with no issues. Thanks all for the interesting discussion. Integrating this change now. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18136#issuecomment-1986857857 From andrew at openjdk.org Sat Mar 9 13:32:58 2024 From: andrew at openjdk.org (Andrew John Hughes) Date: Sat, 9 Mar 2024 13:32:58 GMT Subject: Integrated: 8327389: Remove use of HOTSPOT_BUILD_USER In-Reply-To: References: Message-ID: On Wed, 6 Mar 2024 11:57:27 GMT, Andrew John Hughes wrote: > The HotSpot code has inserted the value of `HOTSPOT_BUILD_USER` into the internal version string since before the JDK was open-sourced, so the reasoning for this is not in the public history. See https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/abstract_vm_version.cpp#L284 > > ~~~ > $ /usr/lib/jvm/java-21-openjdk/bin/java -Xinternalversion > OpenJDK 64-Bit Server VM (21.0.2+13) for linux-amd64 JRE (21.0.2+13), built on 2024-01-19T16:39:52Z by "mockbuild" with gcc 8.5.0 20210514 (Red Hat 8.5.0-20) > ~~~ > > It is not clear what value this provides. By default, the value is set to the system username, but it can be set to a static value using `--with-build-user`. > > In trying to compare builds, this is a source of difference if the builds are produced under different usernames and `--with-build-user` is not set to the same value. > > It may also be a source of information leakage from the host system. It is not clear what value it provides to the end user of the build. > > This patch proposes removing it altogether, but at least knowing why it needs to be there would be progress from where we are now. We could also consider a middle ground of only setting it for "adhoc" builds, those without set version information, as is the case with the overall version output. 
> > With this patch: > > ~~~ > $ ~/builder/23/images/jdk/bin/java -Xinternalversion > OpenJDK 64-Bit Server VM (fastdebug 23-internal-adhoc.andrew.jdk) for linux-amd64 JRE (23-internal-adhoc.andrew.jdk), built on 2024-03-06T00:23:37Z with gcc 13.2.1 20230826 > > $ ~/builder/23/images/jdk/bin/java -XX:ErrorHandlerTest=1 > > $ grep '^vm_info' /localhome/andrew/projects/openjdk/upstream/jdk/hs_err_pid1119824.log > vm_info: OpenJDK 64-Bit Server VM (fastdebug 23-internal-adhoc.andrew.jdk) for linux-amd64 JRE (23-internal-adhoc.andrew.jdk), built on 2024-03-06T00:23:37Z with gcc 13.2.1 20230826 > ~~~ > > The above are from a fastdebug build but I also built a product build with no issues. This pull request has now been integrated. Changeset: 243cb098 Author: Andrew John Hughes URL: https://git.openjdk.org/jdk/commit/243cb098d48741e9bd6308ef7609c9a4637a5e07 Stats: 10 lines in 5 files changed: 0 ins; 8 del; 2 mod 8327389: Remove use of HOTSPOT_BUILD_USER Reviewed-by: erikj, ihse ------------- PR: https://git.openjdk.org/jdk/pull/18136 From jpai at openjdk.org Sun Mar 10 10:19:54 2024 From: jpai at openjdk.org (Jaikiran Pai) Date: Sun, 10 Mar 2024 10:19:54 GMT Subject: RFR: 8327729: Remove deprecated xxxObject methods from jdk.internal.misc.Unsafe [v2] In-Reply-To: References: <_R4_060_hbsKkeKvbuME6SLIfltbo5lmrQb0dHTpY1Q=.06c07eaa-5135-4dad-b4e1-65ef7a0000bc@github.com> Message-ID: On Sun, 10 Mar 2024 08:14:02 GMT, Eirik Bjørsnøs wrote: >> Please review this PR which removes the 19 deprecated `xxObject*` alias methods from `jdk.internal.misc.Unsafe`. >> >> These methods were added in JDK-8213043 (JDK 12), presumably to allow `jsr166.jar` to be used across JDK versions. This was a follow-up fix after JDK-8207146 had renamed these methods to `xxReference*`. >> >> Since OpenJDK is now the single source of truth for `java.util.concurrent`, time has come to remove these deprecated alias methods.
>> >> This change was initially discussed here: https://mail.openjdk.org/pipermail/core-libs-dev/2024-March/119993.html >> >> Testing: This is a pure deletion of deprecated methods, so the PR includes no test changes and the `noreg-cleanup` label is added in the JBS. I have verified that all `test/jdk/java/util/concurrent/*` tests pass. >> >> Tagging @DougLea and @Martin-Buchholz to verify that this removal is timely. > > Eirik Bjørsnøs has updated the pull request incrementally with one additional commit since the last revision: > > Use getAndSetReference instead of getAndSetObject in test/hotspot/jtreg/gc/shenandoah/compiler/TestUnsafeLoadStoreMergedHeapStableTests.java Hello Eirik, I've triggered internal CI testing of the current state of this PR to run tier1, tier2 and tier3. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18176#issuecomment-1987173153 From alanb at openjdk.org Sun Mar 10 12:20:53 2024 From: alanb at openjdk.org (Alan Bateman) Date: Sun, 10 Mar 2024 12:20:53 GMT Subject: RFR: 8327729: Remove deprecated xxxObject methods from jdk.internal.misc.Unsafe [v2] In-Reply-To: References: <_R4_060_hbsKkeKvbuME6SLIfltbo5lmrQb0dHTpY1Q=.06c07eaa-5135-4dad-b4e1-65ef7a0000bc@github.com> Message-ID: On Sun, 10 Mar 2024 08:24:42 GMT, Eirik Bjørsnøs wrote: > GHA revealed two call sites for `getAndSetObject` in the test `test/hotspot/jtreg/gc/shenandoah/compiler/TestUnsafeLoadStoreMergedHeapStableTests.java`. > > I have replaced these with the `getAndSetReference`, grepped for any remaining uses without finding anything. Let's see what GHA says. Yes, you'll need to run all tests to make sure that there aren't any others, e.g. test/hotspot/jtreg/compiler/stable/TestUnstableStable.java.
------------- PR Comment: https://git.openjdk.org/jdk/pull/18176#issuecomment-1987207527 From eirbjo at openjdk.org Sun Mar 10 13:47:06 2024 From: eirbjo at openjdk.org (Eirik Bjørsnøs) Date: Sun, 10 Mar 2024 13:47:06 GMT Subject: RFR: 8327729: Remove deprecated xxxObject methods from jdk.internal.misc.Unsafe [v3] In-Reply-To: <_R4_060_hbsKkeKvbuME6SLIfltbo5lmrQb0dHTpY1Q=.06c07eaa-5135-4dad-b4e1-65ef7a0000bc@github.com> References: <_R4_060_hbsKkeKvbuME6SLIfltbo5lmrQb0dHTpY1Q=.06c07eaa-5135-4dad-b4e1-65ef7a0000bc@github.com> Message-ID: <4Z6CqKTSgNaTYilIvNjnGudJOs8qebTs4OhP5q7xShE=.e71ecd8c-5563-4daf-bf9c-56850d69d7ec@github.com> > Please review this PR which removes the 19 deprecated `xxObject*` alias methods from `jdk.internal.misc.Unsafe`. > > These methods were added in JDK-8213043 (JDK 12), presumably to allow `jsr166.jar` to be used across JDK versions. This was a follow-up fix after JDK-8207146 had renamed these methods to `xxReference*`. > > Since OpenJDK is now the single source of truth for `java.util.concurrent`, time has come to remove these deprecated alias methods. > > This change was initially discussed here: https://mail.openjdk.org/pipermail/core-libs-dev/2024-March/119993.html > > Testing: This is a pure deletion of deprecated methods, so the PR includes no test changes and the `noreg-cleanup` label is added in the JBS. I have verified that all `test/jdk/java/util/concurrent/*` tests pass. > > Tagging @DougLea and @Martin-Buchholz to verify that this removal is timely.
Eirik Bjørsnøs has updated the pull request incrementally with two additional commits since the last revision: - Replace calls to Unsafe.putObject with Unsafe.putReference - Update a code comment referencing Unsafe.compareAndSetObject to reference Unsafe.compareAndSetReference instead ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18176/files - new: https://git.openjdk.org/jdk/pull/18176/files/a333ed30..e199f6b0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18176&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18176&range=01-02 Stats: 5 lines in 2 files changed: 0 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/18176.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18176/head:pull/18176 PR: https://git.openjdk.org/jdk/pull/18176 From eirbjo at openjdk.org Sun Mar 10 13:50:54 2024 From: eirbjo at openjdk.org (Eirik Bjørsnøs) Date: Sun, 10 Mar 2024 13:50:54 GMT Subject: RFR: 8327729: Remove deprecated xxxObject methods from jdk.internal.misc.Unsafe [v2] In-Reply-To: References: <_R4_060_hbsKkeKvbuME6SLIfltbo5lmrQb0dHTpY1Q=.06c07eaa-5135-4dad-b4e1-65ef7a0000bc@github.com> Message-ID: On Sun, 10 Mar 2024 12:18:02 GMT, Alan Bateman wrote: > Yes, you'll need to run all tests to make sure that there aren't any others, e.g. test/hotspot/jtreg/compiler/stable/TestUnstableStable.java. I've updated `TestUnstableStable` to use `putReference` and also fixed a stray code comment reference to `Unsafe.compareAndSetObject` in `BufferedInputStream`. @jaikiran Expect `TestUnstableStable` above to fail your CI run.
------------- PR Comment: https://git.openjdk.org/jdk/pull/18176#issuecomment-1987238716 From dl at openjdk.org Sun Mar 10 14:21:54 2024 From: dl at openjdk.org (Doug Lea) Date: Sun, 10 Mar 2024 14:21:54 GMT Subject: RFR: 8327729: Remove deprecated xxxObject methods from jdk.internal.misc.Unsafe [v3] In-Reply-To: <4Z6CqKTSgNaTYilIvNjnGudJOs8qebTs4OhP5q7xShE=.e71ecd8c-5563-4daf-bf9c-56850d69d7ec@github.com> References: <_R4_060_hbsKkeKvbuME6SLIfltbo5lmrQb0dHTpY1Q=.06c07eaa-5135-4dad-b4e1-65ef7a0000bc@github.com> <4Z6CqKTSgNaTYilIvNjnGudJOs8qebTs4OhP5q7xShE=.e71ecd8c-5563-4daf-bf9c-56850d69d7ec@github.com> Message-ID: On Sun, 10 Mar 2024 13:47:06 GMT, Eirik Bjørsnøs wrote: >> Please review this PR which removes the 19 deprecated `xxObject*` alias methods from `jdk.internal.misc.Unsafe`. >> >> These methods were added in JDK-8213043 (JDK 12), presumably to allow `jsr166.jar` to be used across JDK versions. This was a follow-up fix after JDK-8207146 had renamed these methods to `xxReference*`. >> >> Since OpenJDK is now the single source of truth for `java.util.concurrent`, time has come to remove these deprecated alias methods. >> >> This change was initially discussed here: https://mail.openjdk.org/pipermail/core-libs-dev/2024-March/119993.html >> >> Testing: This is a pure deletion of deprecated methods, so the PR includes no test changes and the `noreg-cleanup` label is added in the JBS. I have verified that all `test/jdk/java/util/concurrent/*` tests pass. >> >> Tagging @DougLea and @Martin-Buchholz to verify that this removal is timely. > > Eirik Bjørsnøs has updated the pull request incrementally with two additional commits since the last revision: > > - Replace calls to Unsafe.putObject with Unsafe.putReference > - Update a code comment referencing Unsafe.compareAndSetObject to reference Unsafe.compareAndSetReference instead OK with me.
------------- PR Comment: https://git.openjdk.org/jdk/pull/18176#issuecomment-1987248843 From jpai at openjdk.org Sun Mar 10 14:40:55 2024 From: jpai at openjdk.org (Jaikiran Pai) Date: Sun, 10 Mar 2024 14:40:55 GMT Subject: RFR: 8327729: Remove deprecated xxxObject methods from jdk.internal.misc.Unsafe [v2] In-Reply-To: References: <_R4_060_hbsKkeKvbuME6SLIfltbo5lmrQb0dHTpY1Q=.06c07eaa-5135-4dad-b4e1-65ef7a0000bc@github.com> Message-ID: <286GDeeFd02bderE-7wTOh0uKX_VDz7b3uu0Hc-Wqv0=.770a55b9-8d0b-4f59-8c3c-e2e28def620b@github.com> On Sun, 10 Mar 2024 10:17:23 GMT, Jaikiran Pai wrote: >> Eirik Bjørsnøs has updated the pull request incrementally with one additional commit since the last revision: >> >> Use getAndSetReference instead of getAndSetObject in test/hotspot/jtreg/gc/shenandoah/compiler/TestUnsafeLoadStoreMergedHeapStableTests.java > > Hello Eirik, I've triggered internal CI testing of the current state of this PR to run tier1, tier2 and tier3. > @jaikiran Expect TestUnstableStable above to fail your CI run. Right, that's the only test that failed (with a compilation error) on all platforms. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18176#issuecomment-1987254844 From david.holmes at oracle.com Mon Mar 11 01:23:41 2024 From: david.holmes at oracle.com (David Holmes) Date: Mon, 11 Mar 2024 11:23:41 +1000 Subject: CFV: New JDK Reviewer: Matias Saveedra Silva In-Reply-To: <17ab75c7-f8a6-4b93-9388-272e0c33ba79@oracle.com> References: <17ab75c7-f8a6-4b93-9388-272e0c33ba79@oracle.com> Message-ID: <3a2ceb4c-8be6-49a2-b8be-11b89e2a06c0@oracle.com> Vote: yes David On 7/03/2024 7:10 am, coleen.phillimore at oracle.com wrote: > I hereby nominate Matias Saavedra Silva (matsaave) to JDK Reviewer. > > Matias is a member of the HotSpot runtime team at Oracle. He has > rewritten how we store resolution of fields, methods and indy > expressions in the interpreter which will be helpful to the Valhalla > project.
He is currently working on enhancements to Class Data Sharing > (CDS) and optimizations in the interpreter to support the Leyden > project. He has contributed over 30 changes [3] to the HotSpot JVM over > the past year, and lower-case-r reviewed many others in the runtime area. > > Votes are due by March 20, 2024. > > Only current JDK Reviewers [1] are eligible to vote on this nomination. > Votes must be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [2]. > > Coleen Phillimore > > [1] https://openjdk.org/census > [2] https://openjdk.org/projects/#reviewer-vote > [3] https://github.com/openjdk/jdk/commits?author=matsaave at openjdk.org From dholmes at openjdk.org Mon Mar 11 02:24:58 2024 From: dholmes at openjdk.org (David Holmes) Date: Mon, 11 Mar 2024 02:24:58 GMT Subject: RFR: 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage [v23] In-Reply-To: References: Message-ID: On Fri, 8 Mar 2024 06:11:23 GMT, Serguei Spitsyn wrote: >> The implementation of the JVM TI `GetObjectMonitorUsage` does not match the spec. >> The function returns the following structure: >> >> >> typedef struct { >> jthread owner; >> jint entry_count; >> jint waiter_count; >> jthread* waiters; >> jint notify_waiter_count; >> jthread* notify_waiters; >> } jvmtiMonitorUsage; >> >> >> The following four fields are defined this way: >> >> waiter_count [jint] The number of threads waiting to own this monitor >> waiters [jthread*] The waiter_count waiting threads >> notify_waiter_count [jint] The number of threads waiting to be notified by this monitor >> notify_waiters [jthread*] The notify_waiter_count threads waiting to be notified >> >> The `waiters` has to include all threads waiting to enter the monitor or to re-enter it in `Object.wait()`. >> The implementation also includes the threads waiting to be notified in `Object.wait()` which is wrong. 
>> The `notify_waiters` has to include all threads waiting to be notified in `Object.wait()`. >> The implementation also includes the threads waiting to re-enter the monitor in `Object.wait()` which is wrong. >> This update makes it right. >> >> The implementation of the JDWP command `ObjectReference.MonitorInfo (5)` is based on the JVM TI `GetObjectMonitorUsage()`. This update has a tweak to keep the existing behavior of this command. >> >> The following JVMTI vmTestbase tests are fixed to adapt to the correct `GetObjectMonitorUsage()` behavior: >> >> jvmti/GetObjectMonitorUsage/objmonusage001 >> jvmti/GetObjectMonitorUsage/objmonusage003 >> >> >> The following JVMTI JCK tests have to be fixed to adapt to the correct behavior: >> >> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00101/gomu00101.html >> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00101/gomu00101a.html >> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00102/gomu00102.html >> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00102/gomu00102a.html >> >> >> >> A JCK bug will be filed and the tests have to be added into the JCK problem list located in the closed repository. >> >> Also, please see and review the related CSR: >> [8324677](https://bugs.openjdk.org/browse/JDK-8324677): incorrect implementation of JVM TI GetObjectMonitorUsage >> >> The Release-Note is: >> [8325314](https://bugs.openjdk.org/browse/JDK-8325314): Release Note: incorrect implementation of JVM TI GetObjectMonitorUsage >> >> Testing: >> - tested with mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains 26 commits: > > - Merge > - review: minor tweak in test description of ObjectMonitorUsage.java > - review: addressed more comments on the fix and new test > - rename after merge: jvmti_common.h to jvmti_common.hpp > - Merge > - review: update comment in threads.hpp > - fix deadlock with carrier threads starvation in ObjectMonitorUsage test > - resolve merge conflict for deleted file objmonusage003.cpp > - fix a typo in libObjectMonitorUsage.cpp > - fix potential sync gap in the test ObjectMonitorUsage > - ... and 16 more: https://git.openjdk.org/jdk/compare/de428daf...b97b8205 Looks good. Thanks. Sorry it has taken me so long to work through this one. It seemed so simple when I filed the bug :) ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17680#pullrequestreview-1926797746 From dholmes at openjdk.org Mon Mar 11 02:24:58 2024 From: dholmes at openjdk.org (David Holmes) Date: Mon, 11 Mar 2024 02:24:58 GMT Subject: RFR: 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage [v22] In-Reply-To: <5lsZBFSbWmDQVJe9tGr2Rll00SWkbU7HBuRpEC_-Duo=.167e62b6-ca47-4a93-b8ad-7c22b379221d@github.com> References: <4XAkV4CDJUTXFeKCCkAQlD_HntnLWwd_AqcTEMowS4k=.f5d2572f-3061-4a56-bf9d-e97bda57e3a0@github.com> <5lsZBFSbWmDQVJe9tGr2Rll00SWkbU7HBuRpEC_-Duo=.167e62b6-ca47-4a93-b8ad-7c22b379221d@github.com> Message-ID: <_uXZSNczXm2ODdcowDzJZleDqNWUL6o97XAX10pWnBA=.9641d2c4-2738-422a-b882-0c3c9b408ad1@github.com> On Tue, 5 Mar 2024 09:04:08 GMT, Serguei Spitsyn wrote: >> The loop is endless without this extra condition, so we are getting a test execution timeout. 
>> The `waiters` seems to be `circular doubly linked list` as we can see below: >> >> inline void ObjectMonitor::AddWaiter(ObjectWaiter* node) { >> assert(node != nullptr, "should not add null node"); >> assert(node->_prev == nullptr, "node already in list"); >> assert(node->_next == nullptr, "node already in list"); >> // put node at end of queue (circular doubly linked list) >> if (_WaitSet == nullptr) { >> _WaitSet = node; >> node->_prev = node; >> node->_next = node; >> } else { >> ObjectWaiter* head = _WaitSet; >> ObjectWaiter* tail = head->_prev; >> assert(tail->_next == head, "invariant check"); >> tail->_next = node; >> head->_prev = node; >> node->_next = head; >> node->_prev = tail; >> } >> } >> >> I'll make sure nothing is missed and check the old code again. > > Okay. Now I see why the old code did not have endless loops. Apologies I should have checked the code - I'd forgotten the list is circular to avoid needing an explicit tail pointer. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1519058800 From dholmes at openjdk.org Mon Mar 11 02:24:59 2024 From: dholmes at openjdk.org (David Holmes) Date: Mon, 11 Mar 2024 02:24:59 GMT Subject: RFR: 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage [v21] In-Reply-To: References: Message-ID: On Mon, 4 Mar 2024 10:09:10 GMT, Serguei Spitsyn wrote: >> test/hotspot/jtreg/serviceability/jvmti/ObjectMonitorUsage/ObjectMonitorUsage.java line 319: >> >>> 317: try { >>> 318: ready = true; >>> 319: lockCheck.wait(); >> >> Spurious wakeups will break this test. They shouldn't happen but ... > > I have an idea how to fix it but it will add extra complexity. Not sure it is worth it. > It is possible to identify there was a spurious wakeup and invalidate the sub-test result in such a case. > Not sure if it is important to rerun the sub-test then. > What do you think? Sorry I missed this response. 
I can't see a way to address spurious wakeups in this case as it needs to be a per-thread flag (so that each thread knows it was notified) but you don't know which thread will be notified in any given call to `notify()`. I also can't see how you can detect a spurious wakeup in this code. If they happen then a subtest may fail due to an unexpected number of re-entering threads. I think we will just have to see how stable the test is in practice. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1519071056 From dholmes at openjdk.org Mon Mar 11 02:39:55 2024 From: dholmes at openjdk.org (David Holmes) Date: Mon, 11 Mar 2024 02:39:55 GMT Subject: RFR: 8327460: Compile tests with the same visibility rules as product code [v2] In-Reply-To: References: Message-ID: On Wed, 6 Mar 2024 13:43:00 GMT, Magnus Ihse Bursie wrote: >> Currently, our symbol visibility handling for tests are sloppy; we only handle it properly on Windows. We need to bring it up to the same levels as product code. This is a prerequisite for [JDK-8327045](https://bugs.openjdk.org/browse/JDK-8327045), which in turn is a building block for Hermetic Java. > > Magnus Ihse Bursie has updated the pull request incrementally with one additional commit since the last revision: > > Update line number for dereference_null in TestDwarf Looks like a nice cleanup - great to see all the duplicated EXPORT code gone! test/hotspot/jtreg/runtime/ErrorHandling/libTestDwarfHelper.h line 24: > 22: */ > 23: > 24: #include Seems unneeded. test/hotspot/jtreg/runtime/ErrorHandling/libTestDwarfHelper.h line 27: > 25: > 26: #include "export.h" > 27: #include "jni.h" Seems unneeded. ------------- Marked as reviewed by dholmes (Reviewer). 
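Relating to the JDK-8247972 exchange earlier in this digest about the circular wait set: the sketch below shows, under stated assumptions, why a walk over a circular doubly linked list needs an explicit "back at the head" stop condition, since no node ever has a null `next`. The `Waiter` type and both functions are hypothetical stand-ins, not HotSpot code; the append logic only mirrors the shape of the `AddWaiter` snippet quoted in that review.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a wait-set node; not the HotSpot ObjectWaiter.
struct Waiter {
  Waiter* prev = nullptr;
  Waiter* next = nullptr;
};

// Append a node at the tail of a circular doubly linked list. The head's
// prev pointer doubles as the tail pointer, so no separate tail is stored.
inline void add_waiter(Waiter*& head, Waiter* node) {
  assert(node != nullptr && node->prev == nullptr && node->next == nullptr);
  if (head == nullptr) {
    head = node;
    node->prev = node;
    node->next = node;
  } else {
    Waiter* tail = head->prev;
    tail->next = node;
    head->prev = node;
    node->next = head;
    node->prev = tail;
  }
}

// Count the nodes. Testing for nullptr would never terminate on a non-empty
// list, so the loop stops when the cursor wraps around to the head again.
inline std::size_t count_waiters(const Waiter* head) {
  if (head == nullptr) return 0;
  std::size_t n = 0;
  const Waiter* w = head;
  do {
    ++n;
    w = w->next;
  } while (w != head);
  return n;
}
```

A traversal written as `while (w != nullptr)` over such a list is the kind of loop that would spin forever, which matches the test timeout described in that review.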
PR Review: https://git.openjdk.org/jdk/pull/18135#pullrequestreview-1926822024 PR Review Comment: https://git.openjdk.org/jdk/pull/18135#discussion_r1519075894 PR Review Comment: https://git.openjdk.org/jdk/pull/18135#discussion_r1519076039 From aboldtch at openjdk.org Mon Mar 11 06:47:55 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Mon, 11 Mar 2024 06:47:55 GMT Subject: RFR: 8327093: Add truncate function to BitMap API [v4] In-Reply-To: References: Message-ID: On Fri, 8 Mar 2024 16:51:10 GMT, Matias Saavedra Silva wrote: >> The BitMap does not currently have a "slice" method which can return a subset of the bitmap defined by a start and end bit. This utility can be used to analyze portions of a bitmap or to overwrite the existing map with a shorter, modified map. >> >> This implementation has two core methods: >> `copy_of_range` allocates and returns a new word array as used internally by BitMap >> `truncate` overwrites the internal array with the one generated by `copy_of_range` so none of the internal workings are exposed to callers outside of BitMap. >> >> This change is verified with an included test. > > Matias Saavedra Silva has updated the pull request incrementally with two additional commits since the last revision: > > - Fixed CV-qualifier style > - Axel comments and new test case lgtm. ------------- Marked as reviewed by aboldtch (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18092#pullrequestreview-1927027504 From alanb at openjdk.org Mon Mar 11 07:00:56 2024 From: alanb at openjdk.org (Alan Bateman) Date: Mon, 11 Mar 2024 07:00:56 GMT Subject: RFR: 8327460: Compile tests with the same visibility rules as product code In-Reply-To: References: Message-ID: <_YFd8mgBis8mgRwfpafttLS2y1_-Aiy5aIdLr2eCXv8=.4890fb5d-cc8a-4cd8-a53d-3f18aa1dfbe6@github.com> On Wed, 6 Mar 2024 12:14:10 GMT, Magnus Ihse Bursie wrote: > * `test/jdk/java/lang/Thread/jni/AttachCurrentThread/libImplicitAttach.c` was missing an export. 
This had not been discovered before since that file was not compiled on Windows. It's a Linux/macOS specific test so it wasn't needed. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18135#issuecomment-1987743188 From alanb at openjdk.org Mon Mar 11 07:00:56 2024 From: alanb at openjdk.org (Alan Bateman) Date: Mon, 11 Mar 2024 07:00:56 GMT Subject: RFR: 8327460: Compile tests with the same visibility rules as product code [v2] In-Reply-To: References: Message-ID: <-9V3vdX_4NWhrgoEYdqy9W-PyT8HBwjKyMwF-IzM9Uo=.c5f3e91a-017e-42c1-abc3-ab90c94f3b75@github.com> On Wed, 6 Mar 2024 13:43:00 GMT, Magnus Ihse Bursie wrote: >> Currently, our symbol visibility handling for tests are sloppy; we only handle it properly on Windows. We need to bring it up to the same levels as product code. This is a prerequisite for [JDK-8327045](https://bugs.openjdk.org/browse/JDK-8327045), which in turn is a building block for Hermetic Java. > > Magnus Ihse Bursie has updated the pull request incrementally with one additional commit since the last revision: > > Update line number for dereference_null in TestDwarf Good cleanup. ------------- Marked as reviewed by alanb (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18135#pullrequestreview-1927041103 From mbaesken at openjdk.org Mon Mar 11 07:53:53 2024 From: mbaesken at openjdk.org (Matthias Baesken) Date: Mon, 11 Mar 2024 07:53:53 GMT Subject: RFR: 8327701: Remove the xlc toolchain [v2] In-Reply-To: References: Message-ID: On Fri, 8 Mar 2024 15:48:08 GMT, Magnus Ihse Bursie wrote: >> As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. Thus the old xlc toolchain is no longer supported, and should be removed. 
> > Magnus Ihse Bursie has updated the pull request incrementally with one additional commit since the last revision: > > Revert SEARCH_PATH changes Hi, thanks for doing this cleanup change. I put it into our build/test queue to see the results on AIX. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18172#issuecomment-1987807579 From jkern at openjdk.org Mon Mar 11 09:23:57 2024 From: jkern at openjdk.org (Joachim Kern) Date: Mon, 11 Mar 2024 09:23:57 GMT Subject: RFR: 8327701: Remove the xlc toolchain [v2] In-Reply-To: References: Message-ID: On Fri, 8 Mar 2024 15:48:08 GMT, Magnus Ihse Bursie wrote: >> As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. Thus the old xlc toolchain is no longer supported, and should be removed. > > Magnus Ihse Bursie has updated the pull request incrementally with one additional commit since the last revision: > > Revert SEARCH_PATH changes Changes requested by jkern (Author). doc/building.html line 679: > 677:

IBM Open XL C/C++ > 678:
The minimum accepted version of Open XL is 17.1.1.4. This is in > 679: essence clang 13, and will be treated as such by the OpenJDK build It is clang 15 not 13. Clang 13 was in 17.1.0 doc/building.md line 493: > 491: > 492: The minimum accepted version of Open XL is 17.1.1.4. This is in essence clang > 493: 13, and will be treated as such by the OpenJDK build system. clang 15 not 13 make/autoconf/toolchain.m4 line 285: > 283: XLC_TEST_PATH=${TOOLCHAIN_PATH}/ > 284: fi > 285: if test "x$TOOLCHAIN_TYPE" = xclang; then Why did you also remove the test for the clang compiler? Ah I see, you moved the clang compiler check down below make/autoconf/toolchain.m4 line 409: > 407: # Target: powerpc-ibm-aix7.2.0.0 > 408: # Thread model: posix > 409: # InstalledDir: /opt/IBM/openxlC/17.1.0/bin Please correct: IBM Open XL C/C++ for AIX 17.1.**1** (xxxxxxxxxxxxxxx), clang version 1**5**.0.0 # Target: **powerpc-ibm-aix7.2.**5**.7** # Thread model: posix # InstalledDir: /opt/IBM/openxlC/17.1.**1**/bin ------------- PR Review: https://git.openjdk.org/jdk/pull/18172#pullrequestreview-1927198641 PR Review Comment: https://git.openjdk.org/jdk/pull/18172#discussion_r1519321337 PR Review Comment: https://git.openjdk.org/jdk/pull/18172#discussion_r1519321980 PR Review Comment: https://git.openjdk.org/jdk/pull/18172#discussion_r1519358938 PR Review Comment: https://git.openjdk.org/jdk/pull/18172#discussion_r1519365934 From jkern at openjdk.org Mon Mar 11 09:23:58 2024 From: jkern at openjdk.org (Joachim Kern) Date: Mon, 11 Mar 2024 09:23:58 GMT Subject: RFR: 8327701: Remove the xlc toolchain [v2] In-Reply-To: References: Message-ID: On Fri, 8 Mar 2024 15:31:18 GMT, Magnus Ihse Bursie wrote: >> Magnus Ihse Bursie has updated the pull request incrementally with one additional commit since the last revision: >> >> Revert SEARCH_PATH changes > > make/autoconf/flags-cflags.m4 line 687: > >> 685: PICFLAG="-fPIC" >> 686: PIEFLAG="-fPIE" >> 687: elif test "x$TOOLCHAIN_TYPE" = xclang && test 
"x$OPENJDK_TARGET_OS" = xaix; then > > Just a remark: This code has never been executed, since running with clang on any OS would hit the branch above. Also, the code is syntactically incorrect, missing a trailing `"`. OK this was a flaw in my introduction of clang toolchain for AIX. The intention was to keep the xlc options in form of their clang counterparts. I will try with a corrected version for clang on AIX and will come back to you. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18172#discussion_r1519347031 From jwaters at openjdk.org Mon Mar 11 09:32:58 2024 From: jwaters at openjdk.org (Julian Waters) Date: Mon, 11 Mar 2024 09:32:58 GMT Subject: RFR: 8327701: Remove the xlc toolchain [v2] In-Reply-To: References: Message-ID: On Mon, 11 Mar 2024 08:38:53 GMT, Joachim Kern wrote: >> Magnus Ihse Bursie has updated the pull request incrementally with one additional commit since the last revision: >> >> Revert SEARCH_PATH changes > > doc/building.html line 679: > >> 677:

IBM Open XL C/C++

>> 678:

The minimum accepted version of Open XL is 17.1.1.4. This is in >> 679: essence clang 13, and will be treated as such by the OpenJDK build > > It is clang 15 not 13. Clang 13 was in 17.1.0 Is Open XL C/C++ considered a compiler or more akin to a development environment like Xcode is for macOS? Depending on which, we could just say clang is the compiler for AIX without needing to say that Open XL is treated like clang, etc Also, why did this remove the link to the Supported Build Platforms page? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18172#discussion_r1519394251 From sspitsyn at openjdk.org Mon Mar 11 10:26:59 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Mon, 11 Mar 2024 10:26:59 GMT Subject: RFR: 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage [v23] In-Reply-To: References: Message-ID: <72l1Ns-4YP_MMpGTjK-jTuakSs2-G4HuntRfrgc3Ls0=.85d5c80c-8db4-4d7d-a798-f7050cddfa2c@github.com> On Fri, 8 Mar 2024 06:11:23 GMT, Serguei Spitsyn wrote: >> The implementation of the JVM TI `GetObjectMonitorUsage` does not match the spec. >> The function returns the following structure: >> >> >> typedef struct { >> jthread owner; >> jint entry_count; >> jint waiter_count; >> jthread* waiters; >> jint notify_waiter_count; >> jthread* notify_waiters; >> } jvmtiMonitorUsage; >> >> >> The following four fields are defined this way: >> >> waiter_count [jint] The number of threads waiting to own this monitor >> waiters [jthread*] The waiter_count waiting threads >> notify_waiter_count [jint] The number of threads waiting to be notified by this monitor >> notify_waiters [jthread*] The notify_waiter_count threads waiting to be notified >> >> The `waiters` has to include all threads waiting to enter the monitor or to re-enter it in `Object.wait()`. >> The implementation also includes the threads waiting to be notified in `Object.wait()` which is wrong. 
>> The `notify_waiters` has to include all threads waiting to be notified in `Object.wait()`. >> The implementation also includes the threads waiting to re-enter the monitor in `Object.wait()` which is wrong. >> This update makes it right. >> >> The implementation of the JDWP command `ObjectReference.MonitorInfo (5)` is based on the JVM TI `GetObjectMonitorInfo()`. This update has a tweak to keep the existing behavior of this command. >> >> The following JVMTI vmTestbase tests are fixed to adapt to the correct behavior of `GetObjectMonitorUsage()`: >> >> jvmti/GetObjectMonitorUsage/objmonusage001 >> jvmti/GetObjectMonitorUsage/objmonusage003 >> >> >> The following JVMTI JCK tests have to be fixed to adapt to the correct behavior: >> >> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00101/gomu00101.html >> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00101/gomu00101a.html >> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00102/gomu00102.html >> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00102/gomu00102a.html >> >> >> >> A JCK bug will be filed and the tests have to be added into the JCK problem list located in the closed repository. >> >> Also, please see and review the related CSR: >> [8324677](https://bugs.openjdk.org/browse/JDK-8324677): incorrect implementation of JVM TI GetObjectMonitorUsage >> >> The Release-Note is: >> [8325314](https://bugs.openjdk.org/browse/JDK-8325314): Release Note: incorrect implementation of JVM TI GetObjectMonitorUsage >> >> Testing: >> - tested with mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains 26 commits: > > - Merge > - review: minor tweak in test description of ObjectMonitorUsage.java > - review: addressed more comments on the fix and new test > - rename after merge: jvmti_common.h to jvmti_common.hpp > - Merge > - review: update comment in threads.hpp > - fix deadlock with carrier threads starvation in ObjectMonitorUsage test > - resolve merge conflict for deleted file objmonusage003.cpp > - fix a typo in libObjectMonitorUsage.cpp > - fix potential sync gap in the test ObjectMonitorUsage > - ... and 16 more: https://git.openjdk.org/jdk/compare/de428daf...b97b8205 > Sorry I missed this response. I can't see a way to address spurious wakeups in this case as it needs to be a per-thread flag (so that each thread knows it was notified) but you don't know which thread will be notified in any given call to notify(). I also can't see how you can detect a spurious wakeup in this code. If they happen then a subtest may fail due to an unexpected number of re-entering threads. I think we will just have to see how stable the test is in practice. Okay, thanks! Let's see if we ever encounter any spurious wakeup in this test. Thank you a lot for the thorough review, David! ------------- PR Comment: https://git.openjdk.org/jdk/pull/17680#issuecomment-1988074775 PR Comment: https://git.openjdk.org/jdk/pull/17680#issuecomment-1988076636 From jkern at openjdk.org Mon Mar 11 12:29:55 2024 From: jkern at openjdk.org (Joachim Kern) Date: Mon, 11 Mar 2024 12:29:55 GMT Subject: RFR: 8327701: Remove the xlc toolchain [v2] In-Reply-To: References: Message-ID: <0yRLC1SM9um8MIZYJP6sf8OOMeOCbzCc1LF_JQLxZHg=.e3391f4e-ab9e-423c-880c-e84aba29e87b@github.com> On Fri, 8 Mar 2024 15:48:08 GMT, Magnus Ihse Bursie wrote: >> As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build.
Thus the old xlc toolchain is no longer supported, and should be removed. > > Magnus Ihse Bursie has updated the pull request incrementally with one additional commit since the last revision: > > Revert SEARCH_PATH changes make/autoconf/toolchain.m4 line 940: > 938: if test "x$OPENJDK_TARGET_OS" = xaix; then > 939: # Make sure we have the Open XL version of clang on AIX > 940: $ECHO "$CC_VERSION_OUTPUT" | $GREP "IBM Open XL C/C++ for AIX" > /dev/null This does not work since $CC_VERSION_OUTPUT is unset. We need CC_VERSION_OUTPUT=`${XLC_TEST_PATH}ibm-clang++_r --version 2>&1 | $HEAD -n 1` before, as in the previous code some lines above which you removed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18172#discussion_r1519634810 From mbaesken at openjdk.org Mon Mar 11 12:47:57 2024 From: mbaesken at openjdk.org (Matthias Baesken) Date: Mon, 11 Mar 2024 12:47:57 GMT Subject: RFR: 8327701: Remove the xlc toolchain [v2] In-Reply-To: References: Message-ID: On Fri, 8 Mar 2024 15:48:08 GMT, Magnus Ihse Bursie wrote: >> As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. Thus the old xlc toolchain is no longer supported, and should be removed. > > Magnus Ihse Bursie has updated the pull request incrementally with one additional commit since the last revision: > > Revert SEARCH_PATH changes With this change added, currently configure fails checking for ibm-llvm-cxxfilt... /opt/IBM/openxlC/17.1.1/tools/ibm-llvm-cxxfilt configure: error: ibm-clang_r version output check failed, output: configure exiting with result code maybe related to what Joachim pointed out ? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18172#issuecomment-1988358518 From jkern at openjdk.org Mon Mar 11 13:48:56 2024 From: jkern at openjdk.org (Joachim Kern) Date: Mon, 11 Mar 2024 13:48:56 GMT Subject: RFR: 8327701: Remove the xlc toolchain [v2] In-Reply-To: References: Message-ID: On Fri, 8 Mar 2024 15:48:08 GMT, Magnus Ihse Bursie wrote: >> As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. Thus the old xlc toolchain is no longer supported, and should be removed. > > Magnus Ihse Bursie has updated the pull request incrementally with one additional commit since the last revision: > > Revert SEARCH_PATH changes Changes requested by jkern (Author). ------------- PR Review: https://git.openjdk.org/jdk/pull/18172#pullrequestreview-1927875301 From jkern at openjdk.org Mon Mar 11 13:48:56 2024 From: jkern at openjdk.org (Joachim Kern) Date: Mon, 11 Mar 2024 13:48:56 GMT Subject: RFR: 8327701: Remove the xlc toolchain [v2] In-Reply-To: References: Message-ID: On Mon, 11 Mar 2024 08:59:03 GMT, Joachim Kern wrote: >> make/autoconf/flags-cflags.m4 line 687: >> >>> 685: PICFLAG="-fPIC" >>> 686: PIEFLAG="-fPIE" >>> 687: elif test "x$TOOLCHAIN_TYPE" = xclang && test "x$OPENJDK_TARGET_OS" = xaix; then >> >> Just a remark: This code has never been executed, since running with clang on any OS would hit the branch above. Also, the code is syntactically incorrect, missing a trailing `"`. > > OK this was a flaw in my introduction of clang toolchain for AIX. The intention was to keep the xlc options in form of their clang counterparts. I will try with a corrected version for clang on AIX and will come back to you. OK, the `-Wl,-bbigtoc` is not a compiler option but a linker option and it is already set in the linker options. 
But the `-fpic -mcmodel=large` should be set to avoid creating a jump to out-of-order code. So we can replace the JVM_PICFLAG="$PICFLAG" JDK_PICFLAG="$PICFLAG" code some lines below by if test "x$TOOLCHAIN_TYPE" = xclang && test "x$OPENJDK_TARGET_OS" = xaix; then JVM_PICFLAG="-fpic -mcmodel=large" else JVM_PICFLAG="$PICFLAG" fi JDK_PICFLAG="$PICFLAG" ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18172#discussion_r1519747481 From eirbjo at openjdk.org Mon Mar 11 16:32:19 2024 From: eirbjo at openjdk.org (Eirik Bjørsnøs) Date: Mon, 11 Mar 2024 16:32:19 GMT Subject: RFR: 8327729: Remove deprecated xxxObject methods from jdk.internal.misc.Unsafe [v2] In-Reply-To: References: <_R4_060_hbsKkeKvbuME6SLIfltbo5lmrQb0dHTpY1Q=.06c07eaa-5135-4dad-b4e1-65ef7a0000bc@github.com> Message-ID: <2QMf5NUmk6LlLOdbIpIQdBJqWH7jYfqafG9zZPSZVTM=.45e54122-25b0-4df5-9ec8-2e6f7c71709b@github.com> On Sun, 10 Mar 2024 13:47:43 GMT, Eirik Bjørsnøs wrote: > Yes, you'll need to run all tests to make sure that there aren't any others, e.g. test/hotspot/jtreg/compiler/stable/TestUnstableStable.java. Alan, With @jaikiran running Oracle CI tier1, tier2, tier3 and observing only the now-fixed `TestUnstableStable.java` test fail, would you consider re-approving the current state of this PR? ------------- PR Comment: https://git.openjdk.org/jdk/pull/18176#issuecomment-1988559676 From matsaave at openjdk.org Mon Mar 11 16:32:20 2024 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Mon, 11 Mar 2024 16:32:20 GMT Subject: RFR: 8327093: Add truncate function to BitMap API [v3] In-Reply-To: References: Message-ID: On Thu, 7 Mar 2024 03:55:14 GMT, Ioi Lam wrote: >> LGTM > >> LGTM > > BTW, the title of the PR should be changed to drop the word "slice", as it's not used in the code at all. Also, we are not returning a copy of a bitmap, but rather truncating the contents of the current bitmap. Thanks for the reviews @iklam and @xmas92!
------------- PR Comment: https://git.openjdk.org/jdk/pull/18092#issuecomment-1988576234 From stefank at openjdk.org Mon Mar 11 16:32:25 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 11 Mar 2024 16:32:25 GMT Subject: RFR: JDK-8326974: ODR violation in macroAssembler_aarch64.cpp [v3] In-Reply-To: References: Message-ID: On Thu, 29 Feb 2024 17:16:04 GMT, Andrew Haley wrote: >> This is a slightly different patch from the one suggested by the bug reporter. It doesn't make any sense to export those classes globally, so I've given them internal linkage. > > Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: > > JDK-8326974: ODR violation in macroAssembler_aarch64.cpp > Egad! the action failure messages on Windows include this: > > Linking jvm.dll jvm.exp : error LNK2001: unresolved external symbol "const `anonymous namespace'::Decoder::`vftable'" (??_7Decoder@?A0xe77b3496@@6b@) jvm.exp : error LNK2001: unresolved external symbol "const `anonymous namespace'::Patcher::`vftable'" (??_7Patcher@?A0xe77b3496@@6b@) > > So, clearly something in the guts of the (Windows) build does not like anonymous name spaces. FWIW, the make files strip these symbols on Windows (for some reason). I've hit a similar problem when we started to use lambdas in hotspot: https://github.com/openjdk/jdk/commit/09f5235c65de546640d5f923fa9369e28643c6ed ------------- PR Comment: https://git.openjdk.org/jdk/pull/18056#issuecomment-1988578345 From matsaave at openjdk.org Mon Mar 11 16:38:03 2024 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Mon, 11 Mar 2024 16:38:03 GMT Subject: Integrated: 8327093: Add truncate function to BitMap API In-Reply-To: References: Message-ID: On Fri, 1 Mar 2024 23:14:18 GMT, Matias Saavedra Silva wrote: > The BitMap does not currently have a "slice" method which can return a subset of the bitmap defined by a start and end bit. 
This utility can be used to analyze portions of a bitmap or to overwrite the existing map with a shorter, modified map. > > This implementation has two core methods: > `copy_of_range` allocates and returns a new word array as used internally by BitMap > `truncate` overwrites the internal array with the one generated by `copy_of_range` so none of the internal workings are exposed to callers outside of BitMap. > > This change is verified with an included test. This pull request has now been integrated. Changeset: d74b907d Author: Matias Saavedra Silva URL: https://git.openjdk.org/jdk/commit/d74b907d206073243437771486c1d62240de3d81 Stats: 289 lines in 3 files changed: 287 ins; 0 del; 2 mod 8327093: Add truncate function to BitMap API Reviewed-by: aboldtch, iklam ------------- PR: https://git.openjdk.org/jdk/pull/18092 From duke at openjdk.org Mon Mar 11 16:45:03 2024 From: duke at openjdk.org (Srinivas Vamsi Parasa) Date: Mon, 11 Mar 2024 16:45:03 GMT Subject: Integrated: 8325991: Accelerate Poly1305 on x86_64 using AVX2 instructions In-Reply-To: References: Message-ID: On Thu, 15 Feb 2024 18:42:49 GMT, Srinivas Vamsi Parasa wrote: > The goal of this PR is to accelerate the Poly1305 algorithm using AVX2 instructions (including IFMA) for x86_64 CPUs. > > This implementation is directly based on the AVX2 Poly1305 hash computation as implemented in Intel(R) Multi-Buffer Crypto for IPsec Library (url: https://github.com/intel/intel-ipsec-mb/blob/main/lib/avx2_t3/poly_fma_avx2.asm) > > This PR shows up to 19x speedup on buffer sizes of 1MB. This pull request has now been integrated.
Changeset: 18de9321 Author: vamsi-parasa Committer: Sandhya Viswanathan URL: https://git.openjdk.org/jdk/commit/18de9321ce8722f244594b1ed3b62cd1421a7994 Stats: 824 lines in 10 files changed: 814 ins; 0 del; 10 mod 8325991: Accelerate Poly1305 on x86_64 using AVX2 instructions Reviewed-by: sviswanathan, jbhateja ------------- PR: https://git.openjdk.org/jdk/pull/17881 From dcubed at openjdk.org Mon Mar 11 17:40:02 2024 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Mon, 11 Mar 2024 17:40:02 GMT Subject: RFR: 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage [v23] In-Reply-To: References: Message-ID: On Fri, 8 Mar 2024 06:11:23 GMT, Serguei Spitsyn wrote: >> The implementation of the JVM TI `GetObjectMonitorUsage` does not match the spec. >> The function returns the following structure: >> >> >> typedef struct { >> jthread owner; >> jint entry_count; >> jint waiter_count; >> jthread* waiters; >> jint notify_waiter_count; >> jthread* notify_waiters; >> } jvmtiMonitorUsage; >> >> >> The following four fields are defined this way: >> >> waiter_count [jint] The number of threads waiting to own this monitor >> waiters [jthread*] The waiter_count waiting threads >> notify_waiter_count [jint] The number of threads waiting to be notified by this monitor >> notify_waiters [jthread*] The notify_waiter_count threads waiting to be notified >> >> The `waiters` has to include all threads waiting to enter the monitor or to re-enter it in `Object.wait()`. >> The implementation also includes the threads waiting to be notified in `Object.wait()` which is wrong. >> The `notify_waiters` has to include all threads waiting to be notified in `Object.wait()`. >> The implementation also includes the threads waiting to re-enter the monitor in `Object.wait()` which is wrong. >> This update makes it right. >> >> The implementation of the JDWP command `ObjectReference.MonitorInfo (5)` is based on the JVM TI `GetObjectMonitorInfo()`. 
This update has a tweak to keep the existing behavior of this command. >> >> The following JVMTI vmTestbase tests are fixed to adapt to the correct behavior of `GetObjectMonitorUsage()`: >> >> jvmti/GetObjectMonitorUsage/objmonusage001 >> jvmti/GetObjectMonitorUsage/objmonusage003 >> >> >> The following JVMTI JCK tests have to be fixed to adapt to the correct behavior: >> >> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00101/gomu00101.html >> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00101/gomu00101a.html >> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00102/gomu00102.html >> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00102/gomu00102a.html >> >> >> >> A JCK bug will be filed and the tests have to be added into the JCK problem list located in the closed repository. >> >> Also, please see and review the related CSR: >> [8324677](https://bugs.openjdk.org/browse/JDK-8324677): incorrect implementation of JVM TI GetObjectMonitorUsage >> >> The Release-Note is: >> [8325314](https://bugs.openjdk.org/browse/JDK-8325314): Release Note: incorrect implementation of JVM TI GetObjectMonitorUsage >> >> Testing: >> - tested with mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 26 commits: > > - Merge > - review: minor tweak in test description of ObjectMonitorUsage.java > - review: addressed more comments on the fix and new test > - rename after merge: jvmti_common.h to jvmti_common.hpp > - Merge > - review: update comment in threads.hpp > - fix deadlock with carrier threads starvation in ObjectMonitorUsage test > - resolve merge conflict for deleted file objmonusage003.cpp > - fix a typo in libObjectMonitorUsage.cpp > - fix potential sync gap in the test ObjectMonitorUsage > - ... and 16 more: https://git.openjdk.org/jdk/compare/de428daf...b97b8205 I last reviewed v02 of this PR so I've jumped forward to the full v22 webrev. It may take me a bit of time to redo this review.
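To make the quoted definitions of `waiters` and `notify_waiters` concrete, here is a minimal, self-contained C++ sketch of the partition the spec asks for. It is a toy model and not HotSpot or JVM TI code: `ThreadState`, `ToyThread`, `ToyMonitorUsage`, and `classify` are hypothetical names invented for illustration, and plain integer ids stand in for `jthread` handles.

```cpp
#include <cassert>
#include <vector>

// Hypothetical thread states relative to one monitor (not HotSpot types).
enum class ThreadState {
  Owner,               // currently owns the monitor
  BlockedOnEnter,      // contending to enter the monitor
  WaitingForNotify,    // in Object.wait(), still waiting to be notified
  ReenteringAfterWait  // in Object.wait(), contending to re-enter the monitor
};

struct ToyThread {
  int id;  // stand-in for a jthread handle
  ThreadState state;
};

// Toy counterpart of the waiters / notify_waiters part of jvmtiMonitorUsage.
struct ToyMonitorUsage {
  std::vector<int> waiters;         // waiting to enter or re-enter
  std::vector<int> notify_waiters;  // waiting to be notified
};

// Partition the threads as the quoted description requires: a thread
// appears in at most one of the two lists, and the owner in neither.
ToyMonitorUsage classify(const std::vector<ToyThread>& threads) {
  ToyMonitorUsage usage;
  for (const ToyThread& t : threads) {
    switch (t.state) {
      case ThreadState::BlockedOnEnter:
      case ThreadState::ReenteringAfterWait:
        usage.waiters.push_back(t.id);
        break;
      case ThreadState::WaitingForNotify:
        usage.notify_waiters.push_back(t.id);
        break;
      case ThreadState::Owner:
        break;
    }
  }
  return usage;
}
```

In this model the behavior the PR description calls wrong would correspond to putting a `WaitingForNotify` thread into `waiters`, or a `ReenteringAfterWait` thread into `notify_waiters`.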
------------- PR Comment: https://git.openjdk.org/jdk/pull/17680#issuecomment-1989046765 From sgehwolf at openjdk.org Mon Mar 11 17:45:06 2024 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 11 Mar 2024 17:45:06 GMT Subject: RFR: 8261242: [Linux] OSContainer::is_containerized() returns true when run outside a container Message-ID: Please review this enhancement to the container detection code which allows it to figure out whether the JVM is actually running inside a container (`podman`, `docker`, `crio`), or with some other means that enforces memory/cpu limits by means of the cgroup filesystem. If neither of those conditions holds, the JVM runs in non-containerized mode, addressing the issue described in the JBS tracker. For example, on my Linux system `is_containerized() == false` is being indicated with the following trace log line: [0.001s][debug][os,container] OSContainer::init: is_containerized() = false because no cpu or memory limit is present This state is being exposed by the Java `Metrics` API class using the new (still JDK internal) `isContainerized()` method. Example: java -XshowSettings:system --version Operating System Metrics: Provider: cgroupv1 System not containerized. openjdk 23-internal 2024-09-17 OpenJDK Runtime Environment (fastdebug build 23-internal-adhoc.sgehwolf.jdk-jdk) OpenJDK 64-Bit Server VM (fastdebug build 23-internal-adhoc.sgehwolf.jdk-jdk, mixed mode, sharing) The basic property this is being built on is the observation that the cgroup controllers typically get mounted read only into containers. Note that the current container tests assert that `OSContainer::is_containerized() == true` in various tests. Therefore, using the heuristic of "is any memory or cpu limit present" isn't sufficient. I had considered that in an earlier iteration, but many container tests failed.
Overall, I think, with this patch we improve the current situation of claiming a containerized system being present when it's actually just a regular Linux system. Testing: - [x] GHA (risc-v failure seems infra related) - [x] Container tests on Linux x86_64 of cgroups v1 and cgroups v2 (including gtests) - [x] Some manual testing using cri-o Thoughts? ------------- Commit messages: - jcheck fixes - Fix tests - Implement Metrics.isContainerized() - Some clean-up - Drop cgroups testing on plain Linux - Implement fall-back logic for non-ro controller mounts - Make find_ro static and local to compilation unit - 8261242: [Linux] OSContainer::is_containerized() returns true Changes: https://git.openjdk.org/jdk/pull/18201/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18201&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8261242 Stats: 360 lines in 20 files changed: 258 ins; 78 del; 24 mod Patch: https://git.openjdk.org/jdk/pull/18201.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18201/head:pull/18201 PR: https://git.openjdk.org/jdk/pull/18201 From sspitsyn at openjdk.org Mon Mar 11 18:10:58 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Mon, 11 Mar 2024 18:10:58 GMT Subject: RFR: 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage [v21] In-Reply-To: References: Message-ID: On Mon, 11 Mar 2024 02:18:57 GMT, David Holmes wrote: >> I have an idea how to fix it but it will add extra complexity. Not sure it is worth it. >> It is possible to identify there was a spurious wakeup and invalidate the sub-test result in such a case. >> Not sure if it is important to rerun the sub-test then. >> What do you think? > > Sorry I missed this response. I can't see a way to address spurious wakeups in this case as it needs to be a per-thread flag (so that each thread knows it was notified) but you don't know which thread will be notified in any given call to `notify()`. I also can't see how you can detect a spurious wakeup in this code. 
If they happen then a subtest may fail due to an unexpected number of re-entering threads. > I think we will just have to see how stable the test is in practice. Okay, thanks! I'll add some diagnostic code to catch spurious wakeups. Let's see if we ever encounter any spurious wakeup in this test. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1520204046 From dcubed at openjdk.org Mon Mar 11 18:56:28 2024 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Mon, 11 Mar 2024 18:56:28 GMT Subject: RFR: 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage [v23] In-Reply-To: References: Message-ID: On Fri, 8 Mar 2024 06:11:23 GMT, Serguei Spitsyn wrote: >> The implementation of the JVM TI `GetObjectMonitorUsage` does not match the spec. >> The function returns the following structure: >> >> >> typedef struct { >> jthread owner; >> jint entry_count; >> jint waiter_count; >> jthread* waiters; >> jint notify_waiter_count; >> jthread* notify_waiters; >> } jvmtiMonitorUsage; >> >> >> The following four fields are defined this way: >> >> waiter_count [jint] The number of threads waiting to own this monitor >> waiters [jthread*] The waiter_count waiting threads >> notify_waiter_count [jint] The number of threads waiting to be notified by this monitor >> notify_waiters [jthread*] The notify_waiter_count threads waiting to be notified >> >> The `waiters` has to include all threads waiting to enter the monitor or to re-enter it in `Object.wait()`. >> The implementation also includes the threads waiting to be notified in `Object.wait()` which is wrong. >> The `notify_waiters` has to include all threads waiting to be notified in `Object.wait()`. >> The implementation also includes the threads waiting to re-enter the monitor in `Object.wait()` which is wrong. >> This update makes it right. >> >> The implementation of the JDWP command `ObjectReference.MonitorInfo (5)` is based on the JVM TI `GetObjectMonitorInfo()`.
This update has a tweak to keep the existing behavior of this command. >> >> The following JVMTI vmTestbase tests are fixed to adapt to the correct behavior of `GetObjectMonitorUsage()`: >> >> jvmti/GetObjectMonitorUsage/objmonusage001 >> jvmti/GetObjectMonitorUsage/objmonusage003 >> >> >> The following JVMTI JCK tests have to be fixed to adapt to the correct behavior: >> >> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00101/gomu00101.html >> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00101/gomu00101a.html >> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00102/gomu00102.html >> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00102/gomu00102a.html >> >> >> >> A JCK bug will be filed and the tests have to be added into the JCK problem list located in the closed repository. >> >> Also, please see and review the related CSR: >> [8324677](https://bugs.openjdk.org/browse/JDK-8324677): incorrect implementation of JVM TI GetObjectMonitorUsage >> >> The Release-Note is: >> [8325314](https://bugs.openjdk.org/browse/JDK-8325314): Release Note: incorrect implementation of JVM TI GetObjectMonitorUsage >> >> Testing: >> - tested with mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 26 commits: > > - Merge > - review: minor tweak in test description of ObjectMonitorUsage.java > - review: addressed more comments on the fix and new test > - rename after merge: jvmti_common.h to jvmti_common.hpp > - Merge > - review: update comment in threads.hpp > - fix deadlock with carrier threads starvation in ObjectMonitorUsage test > - resolve merge conflict for deleted file objmonusage003.cpp > - fix a typo in libObjectMonitorUsage.cpp > - fix potential sync gap in the test ObjectMonitorUsage > - ... and 16 more: https://git.openjdk.org/jdk/compare/de428daf...b97b8205 Thumbs up on the v22 version of this change.
Of course, I'm assuming that the updated tests pass and that there are no new failures in the usual Mach5 testing. Sorry for the delay in getting back to this review. A combination of vacation followed by illness delayed me getting back to this PR. Please note that I did not try to compare the (deleted) objmonusage003 version of the test with the (new) ObjectMonitorUsage version. I didn't think that would be useful. src/hotspot/share/prims/jvmtiEnvBase.cpp line 1483: > 1481: markWord mark = hobj->mark(); > 1482: ResourceMark rm(current_thread); > 1483: GrowableArray* wantList = nullptr; Thanks for refactoring `JvmtiEnvBase::get_object_monitor_usage()` to be much more clear. It was difficult to review in GitHub so I switched to the webrev frames view so that I could scroll the two sides back and forth and determine equivalence. src/hotspot/share/runtime/threads.cpp line 1186: > 1184: > 1185: #if INCLUDE_JVMTI > 1186: // Get count Java threads that are waiting to enter or re-enter the specified monitor. nit typo: s/count Java/count of Java/ src/hotspot/share/runtime/threads.hpp line 134: > 132: static unsigned print_threads_compiling(outputStream* st, char* buf, int buflen, bool short_form = false); > 133: > 134: // Get count Java threads that are waiting to enter or re-enter the specified monitor. nit typo: s/count Java/count of Java/ Also this file needs a copyright year update. test/hotspot/jtreg/TEST.quick-groups line 972: > 970: vmTestbase/nsk/jvmti/GetObjectMonitorUsage/objmonusage001/TestDescription.java \ > 971: vmTestbase/nsk/jvmti/GetObjectMonitorUsage/objmonusage002/TestDescription.java \ > 972: vmTestbase/nsk/jvmti/GetObjectMonitorUsage/objmonusage003/TestDescription.java \ This file needs a copyright year update.
test/hotspot/jtreg/serviceability/jvmti/ObjectMonitorUsage/libObjectMonitorUsage.cpp line 79: > 77: waits_to_be_notified--; > 78: } > 79: } I like these for getting rid of the raciness of knowing when a thread is blocked on contention, finished entering, blocked on wait or finished waiting. Nice! test/hotspot/jtreg/serviceability/jvmti/ObjectMonitorUsage/libObjectMonitorUsage.cpp line 202: > 200: LOG("(%d) notify_waiter_count expected: %d, actually: %d\n", > 201: check_idx, notifyWaiterCount, inf.notify_waiter_count); > 202: result = STATUS_FAILED; Please consider prefixing these LOG output lines with "FAILED:" so that it is easier to identify the log output where failure info is shown. test/hotspot/jtreg/serviceability/jvmti/ObjectMonitorUsage/libObjectMonitorUsage.cpp line 215: > 213: if (tested_monitor != nullptr) { > 214: jni->DeleteGlobalRef(tested_monitor); > 215: } Since this is at the beginning of `setTestedMonitor` does it mean that we "leak" the last global ref? test/hotspot/jtreg/vmTestbase/nsk/jvmti/GetObjectMonitorUsage/objmonusage001/TestDescription.java line 46: > 44: * @library /vmTestbase > 45: * /test/lib > 46: * @run main/othervm/native -agentlib:objmonusage001=printdump nsk.jvmti.GetObjectMonitorUsage.objmonusage001 I only included this `=printdump` for debug when I was updating the test. You don't have to keep it. ------------- Marked as reviewed by dcubed (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/17680#pullrequestreview-1928679264 PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1520200433 PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1520203806 PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1520208119 PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1520213205 PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1520271891 PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1520264870 PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1520266194 PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1520235253 From dcubed at openjdk.org Mon Mar 11 18:56:29 2024 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Mon, 11 Mar 2024 18:56:29 GMT Subject: RFR: 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage [v21] In-Reply-To: References: Message-ID: On Mon, 11 Mar 2024 18:08:14 GMT, Serguei Spitsyn wrote: >> Sorry I missed this response. I can't see a way to address spurious wakeups in this case as it needs to be a per-thread flag (so that each thread knows it was notified) but you don't know which thread will be notified in any given call to `notify()`. I also can't see how you can detect a spurious wakeup in this code. If they happen then a subtest may fail due to an unexpected number of re-entering threads. >> I think we will just have to see how stable the test is in practice. > > Okay, thanks! I'll add some diagnostic code to catch spurious wakeups. > Let's see if we ever encounter any spurious wakeup in this test. I think some sort of comment needs to be added here to document the possibility of this code path being affected by a spurious wakeup.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1520260059 From coleenp at openjdk.org Mon Mar 11 19:15:30 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 11 Mar 2024 19:15:30 GMT Subject: RFR: 8314508: Improve how relativized pointers are printed by frame::describe [v2] In-Reply-To: References: Message-ID: <7ArS4nV6fAMDwqKUQt3yhd3vM3b0zvXWZJAZB3XQnQs=.dcf22bac-ccf2-4c56-8f44-7d722ba5c241@github.com> On Tue, 5 Mar 2024 13:07:02 GMT, Fredrik Bredberg wrote: >> The output from frame::describe has been improved so that it now prints the actual derelativized pointer value, regardless if it's on the stack or on the heap. It also clearly shows which members of the fixed stackframe are relativized. See sample output of a riscv64 stackframe below: >> >> >> [6.693s][trace][continuations] 0x000000409e14cc88: 0x00000000eeceaf58 locals for #10 >> [6.693s][trace][continuations] local 0 >> [6.693s][trace][continuations] oop for #10 >> [6.693s][trace][continuations] 0x000000409e14cc80: 0x00000000eec34548 local 1 >> [6.693s][trace][continuations] oop for #10 >> [6.693s][trace][continuations] 0x000000409e14cc78: 0x00000000eeceac58 local 2 >> [6.693s][trace][continuations] oop for #10 >> [6.693s][trace][continuations] 0x000000409e14cc70: 0x00000000eee03eb0 #10 method java.lang.invoke.LambdaForm$DMH/0x0000000050001000.invokeVirtual(Ljava/lang/Object;Ljava/lang/Object;)V @ 10 >> [6.693s][trace][continuations] - 3 locals 3 max stack >> [6.693s][trace][continuations] - codelet: return entry points >> [6.693s][trace][continuations] saved fp >> [6.693s][trace][continuations] 0x000000409e14cc68: 0x00000040137dc790 return address >> [6.693s][trace][continuations] 0x000000409e14cc60: 0x000000409e14ccf0 >> [6.693s][trace][continuations] 0x000000409e14cc58: 0x000000409e14cc60 interpreter_frame_sender_sp >> [6.693s][trace][continuations] 0x000000409e14cc50: 0x000000409e14cc00 interpreter_frame_last_sp (relativized: fp-14) >> 
[6.693s][trace][continuations] 0x000000409e14cc48: 0x0000004050401a88 interpreter_frame_method >> [6.693s][trace][continuations] 0x000000409e14cc40: 0x0000000000000000 interpreter_frame_mdp >> [6.693s][trace][continuations] 0x000000409e14cc38: 0x000000409e14cbe0 interpreter_frame_extended_sp (relativized: fp-18) >> [6.693s][trace][continuations] 0x000000409e14cc30: 0x00000000ef093110 interpreter_frame_mirror >> [6.693s][trace][continuations] oop for #10 >> [6.693s][trace][continuations] 0x000000409e14cc28: 0x0000004050401c68 interpreter_frame_cache >> [6.693s][trace][continuations] 0x000000409e14cc20: 0x000000409e14cc88 interpreter_frame_locals (relativized: fp+3) >> [6.693s][trace][continuatio... > > Fredrik Bredberg has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Merge branch 'master' into 8314508_relativized_ptr_frame_describe > - Small white space fix > - 8314508: Improve how relativized pointers are printed by frame::describe Ok, this looks good. Thank you for the explanation. I had a couple of comment ideas that might make it make sense without reading the whole file. ------------- Marked as reviewed by coleenp (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18102#pullrequestreview-1928895074 From coleenp at openjdk.org Mon Mar 11 19:15:30 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 11 Mar 2024 19:15:30 GMT Subject: RFR: 8314508: Improve how relativized pointers are printed by frame::describe [v2] In-Reply-To: <53LYpOs2rFf5LkZTXG_u8boy499KyYfbxmzRrl4iLdE=.b0eb2b64-9308-460f-8379-9b42b399ab58@github.com> References: <53LYpOs2rFf5LkZTXG_u8boy499KyYfbxmzRrl4iLdE=.b0eb2b64-9308-460f-8379-9b42b399ab58@github.com> Message-ID: On Fri, 8 Mar 2024 14:15:03 GMT, Fredrik Bredberg wrote: >> src/hotspot/share/runtime/frame.cpp line 1634: >> >>> 1632: st->print_cr(" %s %s %s", spacer, spacer, fv.description); >>> 1633: } else { >>> 1634: if (*fv.description == '#' && isdigit(fv.description[1])) { >> >> Is it possible for fv.description to be null? its sort of unclear from the context. There are several pointer dereferences here. >> >> Can you add a comment about what the format of this is? What is fv.description supposed to be? # >> >> And fp.location is the relativized offset, if fp.description is NOT #? >> >> This really does need some comment. And maybe a local variable to say what *fv.location is. >> >> And after testing the # output, the code still has to see if fv.location is between -100 and 100? Should it not just check fp != nullptr ? > > @coleenp > > I'll try to answer as good as I can. > > **Q1:** Is it possible for `fv.description` to be null? its sort of unclear from the context. There are several pointer dereferences here. > **A1:** If `fv.description` ever had been null (before any of my changes) it would have lead to a crash, so in my world it is never null. > > **Q2:** Can you add a comment about what the format of this is? What is `fv.description` supposed to be? # > **A2:** If you look at the stack frame in the description of this PR you'll see the contents of the `fv.description` after the `

: ` columns. > The only `fv.description` string starting with a '#' is the line for the saved frame pointer. So, `#10 method java.lang.invoke.LambdaForm...` basically means frame 10. > This is also how I find the frame pointer. > > **Q3:** This really does need some comment. And maybe a local variable to say what `*fv.location` is. > **A3:** `fv.location` contains the `
:` columns value and `*fv.location` contains the `` columns value. > > However if the value at an address is relativized like 0x000000409e14cc20 in the example, the real content of address 0x000000409e14cc20 is 0x0000000000000003. So instead of showing the contents as 3 (which we used to do before my patch) we show the real derelativized pointer value which is 0x000000409e14cc88 and also add the text `(relativized: fp+3)` to try to explain to anyone that's confused by the fact that the real content of the stack address is 3. > > **Q4:** And after testing the # output, the code still has to see if fv.location is between -100 and 100? Should it not just check `fp != nullptr` ? > **A4:** No. In order for us to believe that `*fv.location` is an fp-relative value it must be a reasonably small positive/negative number. Otherwise we would print all large pointer values as fp-relative. > > The oddball situation is that I've found frame pointer addresses pointing to content which is within -100 and 100, with a `fv.description` string starting with a '#'. And those are not relativized values, and therefore should not be printed as such. Hence the `*fv.description != '#'` condition. > > Hope this sheds some light on this code. Ok, your answers make sense studying the code and the output together. None of these values are null and I don't think adding a null check is helpful now. [6.693s][trace][continuations] 0x000000409e14cc50: 0x000000409e14cc00 interpreter_frame_last_sp (relativized: fp-14) Maybe adding a comment before this line like: // The fv.description string starting with a '#' is the line for the saved frame pointer eg. // So, #10 method java.lang.invoke.LambdaForm... basically means frame 10. Then say later before line 1638, what you said in this comment. // Some frame pointer addresses point to content which is within -100 and 100, with a fv.description // string starting with a '#'. 
And those are not relativized values, and therefore should not be printed as such. Then reading the code again, you or someone else will have an idea of why. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18102#discussion_r1520297192 From dcubed at openjdk.org Mon Mar 11 19:36:20 2024 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Mon, 11 Mar 2024 19:36:20 GMT Subject: RFR: 8256314: JVM TI GetCurrentContendedMonitor is implemented incorrectly [v8] In-Reply-To: References: <-Yf2Qj_kbyZAA-LDK96IIOcHWMh8__QT1XfnD9jR8kU=.507d5503-1544-44fb-9e81-438ba3458dcd@github.com> Message-ID: On Fri, 8 Mar 2024 06:05:09 GMT, Serguei Spitsyn wrote: >> The implementation of the JVM TI `GetCurrentContendedMonitor()` does not match the spec. It can sometimes return an incorrect information about the contended monitor. Such a behavior does not match the function spec. >> With this update the `GetCurrentContendedMonitor()` is returning the monitor only when the specified thread is waiting to enter or re-enter the monitor, and the monitor is not returned when the specified thread is waiting in the `java.lang.Object.wait` to be notified. >> >> The implementations of the JDWP `ThreadReference.CurrentContendedMonitor` command and JDI `ThreadReference.currentContendedMonitor()` method are based and depend on this JVMTI function. The JDWP command and the JDI method were specified incorrectly and had incorrect behavior. The fix slightly corrects both the JDWP and JDI specs. The JDWP and JDI implementations have been also fixed because they use this JVM TI update. Please, see and review the related CSR and Release-Note. 
>> >> CSR: [8326024](https://bugs.openjdk.org/browse/JDK-8326024): JVM TI GetCurrentContendedMonitor is implemented incorrectly >> RN: [8326038](https://bugs.openjdk.org/browse/JDK-8326038): Release Note: JVM TI GetCurrentContendedMonitor is implemented incorrectly >> >> Testing: >> - tested with the mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision: > > - Merge > - review: updated test desciption for two modified tests > - review: added new internal function JvmtiEnvBase::get_thread_or_vthread_state > - review: replaced timed with timed-out in JDWP and JDI spec > - Merge > - Merge > - fix specs: JDWP ThreadReference.CurrentContendedMonitor command, JDI ThreadReference.currentContendedMonitor method > - 8256314: JVM TI GetCurrentContendedMonitor is implemented incorrectly Thumbs up. I'm not sure how I missed reviewing this change so I did a post-integration review. ------------- PR Review: https://git.openjdk.org/jdk/pull/17944#pullrequestreview-1928978384 From mchung at openjdk.org Mon Mar 11 19:37:20 2024 From: mchung at openjdk.org (Mandy Chung) Date: Mon, 11 Mar 2024 19:37:20 GMT Subject: RFR: 8327729: Remove deprecated xxxObject methods from jdk.internal.misc.Unsafe [v3] In-Reply-To: <4Z6CqKTSgNaTYilIvNjnGudJOs8qebTs4OhP5q7xShE=.e71ecd8c-5563-4daf-bf9c-56850d69d7ec@github.com> References: <_R4_060_hbsKkeKvbuME6SLIfltbo5lmrQb0dHTpY1Q=.06c07eaa-5135-4dad-b4e1-65ef7a0000bc@github.com> <4Z6CqKTSgNaTYilIvNjnGudJOs8qebTs4OhP5q7xShE=.e71ecd8c-5563-4daf-bf9c-56850d69d7ec@github.com> Message-ID: On Sun, 10 Mar 2024 13:47:06 GMT, Eirik Bjørsnæs wrote: >> Please review this PR which removes the 19 deprecated `xxObject*` alias methods from `jdk.internal.misc.Unsafe`. 
>> >> These methods were added in JDK-8213043 (JDK 12), presumably to allow `jsr166.jar` to be used across JDK versions. This was a follow-up fix after JDK-8207146 had renamed these methods to `xxReference*`. >> >> Since OpenJDK is now the single source of truth for `java.util.concurrent`, the time has come to remove these deprecated alias methods. >> >> This change was initially discussed here: https://mail.openjdk.org/pipermail/core-libs-dev/2024-March/119993.html >> >> Testing: This is a pure deletion of deprecated methods, so the PR includes no test changes and the `noreg-cleanup` label is added in the JBS. I have verified that all `test/jdk/java/util/concurrent/*` tests pass. >> >> Tagging @DougLea and @Martin-Buchholz to verify that this removal is timely. > > Eirik Bjørsnæs has updated the pull request incrementally with two additional commits since the last revision: > > - Replace calls to Unsafe.putObject with Unsafe.putReference > - Update a code comment referencing Unsafe.compareAndSetObject to reference Unsafe.compareAndSetReference instead This should wait for @jaikiran's approval confirming the test result is good. ------------- Marked as reviewed by mchung (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18176#pullrequestreview-1928982510 From dcubed at openjdk.org Mon Mar 11 19:39:23 2024 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Mon, 11 Mar 2024 19:39:23 GMT Subject: RFR: 8256314: JVM TI GetCurrentContendedMonitor is implemented incorrectly [v8] In-Reply-To: References: <-Yf2Qj_kbyZAA-LDK96IIOcHWMh8__QT1XfnD9jR8kU=.507d5503-1544-44fb-9e81-438ba3458dcd@github.com> Message-ID: <0U6ldrYqLu2-H3Lzx35FxXgY9fb7FH3GqFts2bVFWeo=.a9f4488b-47a0-4194-bf76-c917571d6b6b@github.com> On Fri, 8 Mar 2024 06:05:09 GMT, Serguei Spitsyn wrote: >> The implementation of the JVM TI `GetCurrentContendedMonitor()` does not match the spec. It can sometimes return an incorrect information about the contended monitor. 
Such a behavior does not match the function spec. >> With this update the `GetCurrentContendedMonitor()` is returning the monitor only when the specified thread is waiting to enter or re-enter the monitor, and the monitor is not returned when the specified thread is waiting in the `java.lang.Object.wait` to be notified. >> >> The implementations of the JDWP `ThreadReference.CurrentContendedMonitor` command and JDI `ThreadReference.currentContendedMonitor()` method are based and depend on this JVMTI function. The JDWP command and the JDI method were specified incorrectly and had incorrect behavior. The fix slightly corrects both the JDWP and JDI specs. The JDWP and JDI implementations have been also fixed because they use this JVM TI update. Please, see and review the related CSR and Release-Note. >> >> CSR: [8326024](https://bugs.openjdk.org/browse/JDK-8326024): JVM TI GetCurrentContendedMonitor is implemented incorrectly >> RN: [8326038](https://bugs.openjdk.org/browse/JDK-8326038): Release Note: JVM TI GetCurrentContendedMonitor is implemented incorrectly >> >> Testing: >> - tested with the mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision: > > - Merge > - review: updated test desciption for two modified tests > - review: added new internal function JvmtiEnvBase::get_thread_or_vthread_state > - review: replaced timed with timed-out in JDWP and JDI spec > - Merge > - Merge > - fix specs: JDWP ThreadReference.CurrentContendedMonitor command, JDI ThreadReference.currentContendedMonitor method > - 8256314: JVM TI GetCurrentContendedMonitor is implemented incorrectly I looked at the commit and it appears that not all copyright years got fixed. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17944#issuecomment-1989283515 From matsaave at openjdk.org Mon Mar 11 19:43:23 2024 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Mon, 11 Mar 2024 19:43:23 GMT Subject: RFR: 8251330: Reorder CDS archived heap to speed up relocation [v5] In-Reply-To: References: Message-ID: > We should reorder the archived heap to segregate the objects that don't need marking. This will save space in the archive and improve start-up time > > This patch reorders the archived heap to segregate the objects that don't need marking. The leading zeros present in the bitmaps thanks to the reordering can be easily removed, and this the size of the archive is reduced. > > Given that the bitmaps are word aligned, some leading zeroes will remain. Verified with tier 1-5 tests. > > Metrics, courtesy of @iklam : > ```calculate_oopmap: objects = 15262 (507904 bytes, 332752 bytes don't need marking), embedded oops = 8408, nulls = 54 > Oopmap = 15872 bytes > > calculate_oopmap: objects = 4590 (335872 bytes, 178120 bytes don't need marking), embedded oops = 46487, nulls = 29019 > Oopmap = 10496 bytes > > (332752 + 178120) / (507904 + 335872.0) = 0.6054592688106796 > More than 60% of the space used by the archived heap doesn't need to be marked by the oopmap. > > $ java -Xshare:dump -Xlog:cds+map | grep lead > [3.777s][info][cds,map] - heap_oopmap_leading_zeros: 143286 > [3.777s][info][cds,map] - heap_ptrmap_leading_zeros: 50713 > > So we can reduce the "bm" region by (143286 + 50713) / 8 = 24249 bytes. > > Current output: > $ java -XX:+UseNewCode -Xshare:dump -Xlog:cds+map | grep lead > [5.339s][info][cds,map] - heap_oopmap_leading_zeros: 26 > [5.339s][info][cds,map] - heap_ptrmap_leading_zeros: 8 Matias Saavedra Silva has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 37 commits: - Merge branch 'master' into reorder_archive_heap_8251330 - Merge branch 'master' into reorder_archive_heap_8251330 - Moved slice method to new PR - Updated slice and truncate to be bit aligned - Removed unused methods and reordered code - Merge branch 'master' into reorder_archive_heap_8251330 - Slice method now copies and Ioi Comments - Renamed variables and moved pointer patching shift - Added helper function - Merge branch 'master' into reorder_archive_heap_8251330 - ... and 27 more: https://git.openjdk.org/jdk/compare/c65d3089...a09cd64a ------------- Changes: https://git.openjdk.org/jdk/pull/17350/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17350&range=04 Stats: 225 lines in 9 files changed: 194 ins; 8 del; 23 mod Patch: https://git.openjdk.org/jdk/pull/17350.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17350/head:pull/17350 PR: https://git.openjdk.org/jdk/pull/17350 From matsaave at openjdk.org Mon Mar 11 20:40:24 2024 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Mon, 11 Mar 2024 20:40:24 GMT Subject: RFR: 8251330: Reorder CDS archived heap to speed up relocation [v6] In-Reply-To: References: Message-ID: > We should reorder the archived heap to segregate the objects that don't need marking. This will save space in the archive and improve start-up time > > This patch reorders the archived heap to segregate the objects that don't need marking. The leading zeros present in the bitmaps thanks to the reordering can be easily removed, and this the size of the archive is reduced. > > Given that the bitmaps are word aligned, some leading zeroes will remain. Verified with tier 1-5 tests. 
> > Metrics, courtesy of @iklam : > ```calculate_oopmap: objects = 15262 (507904 bytes, 332752 bytes don't need marking), embedded oops = 8408, nulls = 54 > Oopmap = 15872 bytes > > calculate_oopmap: objects = 4590 (335872 bytes, 178120 bytes don't need marking), embedded oops = 46487, nulls = 29019 > Oopmap = 10496 bytes > > (332752 + 178120) / (507904 + 335872.0) = 0.6054592688106796 > More than 60% of the space used by the archived heap doesn't need to be marked by the oopmap. > > $ java -Xshare:dump -Xlog:cds+map | grep lead > [3.777s][info][cds,map] - heap_oopmap_leading_zeros: 143286 > [3.777s][info][cds,map] - heap_ptrmap_leading_zeros: 50713 > > So we can reduce the "bm" region by (143286 + 50713) / 8 = 24249 bytes. > > Current output: > $ java -XX:+UseNewCode -Xshare:dump -Xlog:cds+map | grep lead > [5.339s][info][cds,map] - heap_oopmap_leading_zeros: 26 > [5.339s][info][cds,map] - heap_ptrmap_leading_zeros: 8 Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: Removed metaspaceshared changes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17350/files - new: https://git.openjdk.org/jdk/pull/17350/files/a09cd64a..bdd74ec9 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17350&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17350&range=04-05 Stats: 26 lines in 5 files changed: 0 ins; 18 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/17350.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17350/head:pull/17350 PR: https://git.openjdk.org/jdk/pull/17350 From sspitsyn at openjdk.org Tue Mar 12 01:52:29 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 12 Mar 2024 01:52:29 GMT Subject: RFR: 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage [v24] In-Reply-To: References: Message-ID: > The implementation of the JVM TI `GetObjectMonitorUsage` does not match the spec. 
> The function returns the following structure: > > > typedef struct { > jthread owner; > jint entry_count; > jint waiter_count; > jthread* waiters; > jint notify_waiter_count; > jthread* notify_waiters; > } jvmtiMonitorUsage; > > > The following four fields are defined this way: > > waiter_count [jint] The number of threads waiting to own this monitor > waiters [jthread*] The waiter_count waiting threads > notify_waiter_count [jint] The number of threads waiting to be notified by this monitor > notify_waiters [jthread*] The notify_waiter_count threads waiting to be notified > > The `waiters` has to include all threads waiting to enter the monitor or to re-enter it in `Object.wait()`. > The implementation also includes the threads waiting to be notified in `Object.wait()` which is wrong. > The `notify_waiters` has to include all threads waiting to be notified in `Object.wait()`. > The implementation also includes the threads waiting to re-enter the monitor in `Object.wait()` which is wrong. > This update makes it right. > > The implementation of the JDWP command `ObjectReference.MonitorInfo (5)` is based on the JVM TI `GetObjectMonitorInfo()`. This update has a tweak to keep the existing behavior of this command. > > The following JVMTI vmTestbase tests are fixed to adapt to the `GetObjectMonitorUsage()` correct behavior: > > jvmti/GetObjectMonitorUsage/objmonusage001 > jvmti/GetObjectMonitorUsage/objmonusage003 > > > The following JVMTI JCK tests have to be fixed to adapt to correct behavior: > > vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00101/gomu00101.html > vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00101/gomu00101a.html > vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00102/gomu00102.html > vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00102/gomu00102a.html > > > > A JCK bug will be filed and the tests have to be added into the JCK problem list located in the closed repository. 
> > Also, please see and review the related CSR: > [8324677](https://bugs.openjdk.org/browse/JDK-8324677): incorrect implementation of JVM TI GetObjectMonitorUsage > > The Release-Note is: > [8325314](https://bugs.openjdk.org/browse/JDK-8325314): Release Note: incorrect implementation of JVM TI GetObjectMonitorUsage > > Testing: > - tested with mach5 tiers 1-6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: improved new test: added wakeup warning; polished test ouput ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17680/files - new: https://git.openjdk.org/jdk/pull/17680/files/b97b8205..94ebf725 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17680&range=23 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17680&range=22-23 Stats: 61 lines in 2 files changed: 45 ins; 4 del; 12 mod Patch: https://git.openjdk.org/jdk/pull/17680.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17680/head:pull/17680 PR: https://git.openjdk.org/jdk/pull/17680 From sspitsyn at openjdk.org Tue Mar 12 01:58:21 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 12 Mar 2024 01:58:21 GMT Subject: RFR: 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage [v23] In-Reply-To: References: Message-ID: On Mon, 11 Mar 2024 18:08:01 GMT, Daniel D. Daugherty wrote: >> Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 26 commits: >> >> - Merge >> - review: minor tweak in test description of ObjectMonitorUsage.java >> - review: addressed more comments on the fix and new test >> - rename after merge: jvmti_common.h to jvmti_common.hpp >> - Merge >> - review: update comment in threads.hpp >> - fix deadlock with carrier threads starvation in ObjectMonitorUsage test >> - resolve merge conflict for deleted file objmonusage003.cpp >> - fix a typo in libObjectMonitorUsage.cpp >> - fix potential sync gap in the test ObjectMonitorUsage >> - ... and 16 more: https://git.openjdk.org/jdk/compare/de428daf...b97b8205 > > src/hotspot/share/runtime/threads.cpp line 1186: > >> 1184: >> 1185: #if INCLUDE_JVMTI >> 1186: // Get count Java threads that are waiting to enter or re-enter the specified monitor. > > nit typo: s/count Java/count of Java/ Thanks, fixed now. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1520714780 From sspitsyn at openjdk.org Tue Mar 12 02:07:24 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 12 Mar 2024 02:07:24 GMT Subject: RFR: 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage [v23] In-Reply-To: References: Message-ID: On Mon, 11 Mar 2024 18:05:00 GMT, Daniel D. Daugherty wrote: >> Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 26 commits: >> >> - Merge >> - review: minor tweak in test description of ObjectMonitorUsage.java >> - review: addressed more comments on the fix and new test >> - rename after merge: jvmti_common.h to jvmti_common.hpp >> - Merge >> - review: update comment in threads.hpp >> - fix deadlock with carrier threads starvation in ObjectMonitorUsage test >> - resolve merge conflict for deleted file objmonusage003.cpp >> - fix a typo in libObjectMonitorUsage.cpp >> - fix potential sync gap in the test ObjectMonitorUsage >> - ... 
and 16 more: https://git.openjdk.org/jdk/compare/de428daf...b97b8205 > > src/hotspot/share/prims/jvmtiEnvBase.cpp line 1483: > >> 1481: markWord mark = hobj->mark(); >> 1482: ResourceMark rm(current_thread); >> 1483: GrowableArray* wantList = nullptr; > > Thanks for refactoring `JvmtiEnvBase::get_object_monitor_usage()` to > be much more clear. It was difficult to review in GitHub so I switched to > the webrev frames view so that I could scroll the two sides back and forth > and determine equivalence. Good! > src/hotspot/share/runtime/threads.hpp line 134: > >> 132: static unsigned print_threads_compiling(outputStream* st, char* buf, int buflen, bool short_form = false); >> 133: >> 134: // Get count Java threads that are waiting to enter or re-enter the specified monitor. > > nit typo: s/count Java/count of Java/ > > Also this file needs a copyright year update. Thanks, fixed now. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1520721501 PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1520722461 From sspitsyn at openjdk.org Tue Mar 12 02:07:24 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 12 Mar 2024 02:07:24 GMT Subject: RFR: 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage [v22] In-Reply-To: <_uXZSNczXm2ODdcowDzJZleDqNWUL6o97XAX10pWnBA=.9641d2c4-2738-422a-b882-0c3c9b408ad1@github.com> References: <4XAkV4CDJUTXFeKCCkAQlD_HntnLWwd_AqcTEMowS4k=.f5d2572f-3061-4a56-bf9d-e97bda57e3a0@github.com> <5lsZBFSbWmDQVJe9tGr2Rll00SWkbU7HBuRpEC_-Duo=.167e62b6-ca47-4a93-b8ad-7c22b379221d@github.com> <_uXZSNczXm2ODdcowDzJZleDqNWUL6o97XAX10pWnBA=.9641d2c4-2738-422a-b882-0c3c9b408ad1@github.com> Message-ID: On Mon, 11 Mar 2024 01:53:09 GMT, David Holmes wrote: >> Okay. Now I see why the old code did not have endless loops. > > Apologies I should have checked the code - I'd forgotten the list is circular to avoid needing an explicit tail pointer. Okay, thanks. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1520719906 From sspitsyn at openjdk.org Tue Mar 12 02:07:25 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 12 Mar 2024 02:07:25 GMT Subject: RFR: 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage [v24] In-Reply-To: References: Message-ID: On Mon, 11 Mar 2024 18:14:49 GMT, Daniel D. Daugherty wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> improved new test: added wakeup warning; polished test ouput > > test/hotspot/jtreg/TEST.quick-groups line 972: > >> 970: vmTestbase/nsk/jvmti/GetObjectMonitorUsage/objmonusage001/TestDescription.java \ >> 971: vmTestbase/nsk/jvmti/GetObjectMonitorUsage/objmonusage002/TestDescription.java \ >> 972: vmTestbase/nsk/jvmti/GetObjectMonitorUsage/objmonusage003/TestDescription.java \ > > This file needs a copyright year update. Thanks, fixed now. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1520723644 From sspitsyn at openjdk.org Tue Mar 12 02:07:26 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 12 Mar 2024 02:07:26 GMT Subject: RFR: 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage [v21] In-Reply-To: References: Message-ID: <77pfDJrhxZa6RsNkDXdcHvdfUGd7alHVMd1sZnEJYI8=.b837e747-d824-4949-afef-49447c913e2f@github.com> On Mon, 11 Mar 2024 18:38:52 GMT, Daniel D. Daugherty wrote: >> Okay, thanks! I'll add some diagnostic code to catch spurious wakeups. >> Let's see if we ever encounter any spurious wakeup in this test. > > I think some sort of comment needs to be added here to document > the possibility of this code path being affected by a spurious wakeup. Thanks. I've added both a comment and also some spurious wakeup detection with a warning. 
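For readers following the spurious wakeup discussion, here is a minimal sketch of one way such detection could work. It is written in C++ with `std::condition_variable` rather than Java's `Object.wait()`, and it is not the code added to the test: `NotifyTracker`, `notify_one`, `wait_for_notify` and the warning text are all hypothetical names chosen for illustration. The idea is to count notifications as permits, so that it does not matter which thread a given notify wakes, and to warn whenever a waiter wakes up and finds no permit.

```cpp
#include <condition_variable>
#include <cstdio>
#include <mutex>

// Hypothetical helper (an assumption, not taken from the test): every
// notification adds a permit, and every waiter consumes one permit.
class NotifyTracker {
 public:
  // Called by the notifying thread: record one permit, then wake one waiter.
  void notify_one() {
    {
      std::lock_guard<std::mutex> guard(lock_);
      permits_++;
    }
    cv_.notify_one();
  }

  // Called by a waiting thread: blocks until a permit is available.
  // Returns how many times this thread woke up without finding a permit.
  // That count covers true spurious wakeups and also wakeups whose permit
  // was taken by another thread first, so it is only a diagnostic.
  int wait_for_notify() {
    std::unique_lock<std::mutex> guard(lock_);
    int unmatched = 0;
    while (permits_ == 0) {
      cv_.wait(guard);
      if (permits_ == 0) {
        unmatched++;
        std::fprintf(stderr, "WARNING: wakeup without a pending notification\n");
      }
    }
    permits_--;
    return unmatched;
  }

 private:
  std::mutex lock_;
  std::condition_variable cv_;
  int permits_ = 0;
};
```

Because the waiter loops until a permit exists, an unmatched wakeup in this sketch only produces a warning and does not change how many threads proceed.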
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1520719450 From sspitsyn at openjdk.org Tue Mar 12 02:10:21 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 12 Mar 2024 02:10:21 GMT Subject: RFR: 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage [v23] In-Reply-To: References: Message-ID: On Mon, 11 Mar 2024 18:43:11 GMT, Daniel D. Daugherty wrote: >> Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 26 commits: >> >> - Merge >> - review: minor tweak in test description of ObjectMonitorUsage.java >> - review: addressed more comments on the fix and new test >> - rename after merge: jvmti_common.h to jvmti_common.hpp >> - Merge >> - review: update comment in threads.hpp >> - fix deadlock with carrier threads starvation in ObjectMonitorUsage test >> - resolve merge conflict for deleted file objmonusage003.cpp >> - fix a typo in libObjectMonitorUsage.cpp >> - fix potential sync gap in the test ObjectMonitorUsage >> - ... and 16 more: https://git.openjdk.org/jdk/compare/de428daf...b97b8205 > > test/hotspot/jtreg/serviceability/jvmti/ObjectMonitorUsage/libObjectMonitorUsage.cpp line 202: > >> 200: LOG("(%d) notify_waiter_count expected: %d, actually: %d\n", >> 201: check_idx, notifyWaiterCount, inf.notify_waiter_count); >> 202: result = STATUS_FAILED; > > Please consider prefixing these LOG output lines with "FAILED:" so that it > is easier to identify the log output where failure info is shown. Good suggestion. Fixed now. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1520726685 From sspitsyn at openjdk.org Tue Mar 12 02:16:23 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 12 Mar 2024 02:16:23 GMT Subject: RFR: 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage [v23] In-Reply-To: References: Message-ID: <24RgaSSfkMJk3SYQ6LZDCQToWvafNxhv44AgM_5Wlbk=.5b8b2051-8952-436f-a451-32e12b5b9b05@github.com> On Mon, 11 Mar 2024 18:44:34 GMT, Daniel D. Daugherty wrote: >> Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 26 commits: >> >> - Merge >> - review: minor tweak in test description of ObjectMonitorUsage.java >> - review: addressed more comments on the fix and new test >> - rename after merge: jvmti_common.h to jvmti_common.hpp >> - Merge >> - review: update comment in threads.hpp >> - fix deadlock with carrier threads starvation in ObjectMonitorUsage test >> - resolve merge conflict for deleted file objmonusage003.cpp >> - fix a typo in libObjectMonitorUsage.cpp >> - fix potential sync gap in the test ObjectMonitorUsage >> - ... and 16 more: https://git.openjdk.org/jdk/compare/de428daf...b97b8205 > > test/hotspot/jtreg/serviceability/jvmti/ObjectMonitorUsage/libObjectMonitorUsage.cpp line 215: > >> 213: if (tested_monitor != nullptr) { >> 214: jni->DeleteGlobalRef(tested_monitor); >> 215: } > > Since this is at the beginning of `setTestedMonitor` does it mean that we "leak" > the last global ref? Thank you for the check. There is no leak of the last global ref because there is always this call at the end of each sub-test: `setTestedMonitor(null)`. 
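The "release the previous reference on entry, clear with null at the end" pattern described here can be illustrated with a small self-contained sketch. It does not use the real JNI API: `FakeRefTable` stands in for the JVM's global reference table (real code would call `NewGlobalRef` and `DeleteGlobalRef` on a `JNIEnv`), and `MonitorHolder` and its members are hypothetical names. The step that creates a new reference is an assumption, since only the delete step is quoted in the review.

```cpp
#include <set>

// Hypothetical stand-in for the JNI global reference table.
struct FakeRefTable {
  std::set<int> live;   // ids of references that have not been deleted
  int next_id = 1;

  int new_global_ref() {
    int id = next_id++;
    live.insert(id);
    return id;
  }

  void delete_global_ref(int id) { live.erase(id); }
};

// Hypothetical holder mirroring the shape discussed in the review.
struct MonitorHolder {
  int tested_monitor = 0;  // 0 plays the role of nullptr

  // Release the previously held reference first, then keep a new one only
  // if a monitor object was passed in (has_monitor == false models null).
  void set_tested_monitor(FakeRefTable& refs, bool has_monitor) {
    if (tested_monitor != 0) {
      refs.delete_global_ref(tested_monitor);
      tested_monitor = 0;
    }
    if (has_monitor) {
      tested_monitor = refs.new_global_ref();
    }
  }
};
```

With this shape at most one reference is live at any time, and a final call with no monitor leaves the table empty, which matches the explanation that ending each sub-test with `setTestedMonitor(null)` avoids a leak.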
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1520733090 From sspitsyn at openjdk.org Tue Mar 12 02:19:22 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 12 Mar 2024 02:19:22 GMT Subject: RFR: 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage [v23] In-Reply-To: References: Message-ID: On Mon, 11 Mar 2024 18:23:25 GMT, Daniel D. Daugherty wrote: >> Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 26 commits: >> >> - Merge >> - review: minor tweak in test description of ObjectMonitorUsage.java >> - review: addressed more comments on the fix and new test >> - rename after merge: jvmti_common.h to jvmti_common.hpp >> - Merge >> - review: update comment in threads.hpp >> - fix deadlock with carrier threads starvation in ObjectMonitorUsage test >> - resolve merge conflict for deleted file objmonusage003.cpp >> - fix a typo in libObjectMonitorUsage.cpp >> - fix potential sync gap in the test ObjectMonitorUsage >> - ... and 16 more: https://git.openjdk.org/jdk/compare/de428daf...b97b8205 > > test/hotspot/jtreg/vmTestbase/nsk/jvmti/GetObjectMonitorUsage/objmonusage001/TestDescription.java line 46: > >> 44: * @library /vmTestbase >> 45: * /test/lib >> 46: * @run main/othervm/native -agentlib:objmonusage001=printdump nsk.jvmti.GetObjectMonitorUsage.objmonusage001 > > I only included this `=printdump` for debug when I was updating the test. > You don't have to keep it. Thanks, removed now. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1520736552 From sspitsyn at openjdk.org Tue Mar 12 02:31:43 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 12 Mar 2024 02:31:43 GMT Subject: RFR: 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage [v25] In-Reply-To: References: Message-ID: > The implementation of the JVM TI `GetObjectMonitorUsage` does not match the spec. 
> The function returns the following structure:
>
> typedef struct {
>     jthread owner;
>     jint entry_count;
>     jint waiter_count;
>     jthread* waiters;
>     jint notify_waiter_count;
>     jthread* notify_waiters;
> } jvmtiMonitorUsage;
>
> The following four fields are defined this way:
>
> waiter_count [jint] The number of threads waiting to own this monitor
> waiters [jthread*] The waiter_count waiting threads
> notify_waiter_count [jint] The number of threads waiting to be notified by this monitor
> notify_waiters [jthread*] The notify_waiter_count threads waiting to be notified
>
> The `waiters` has to include all threads waiting to enter the monitor or to re-enter it in `Object.wait()`.
> The implementation also includes the threads waiting to be notified in `Object.wait()` which is wrong.
> The `notify_waiters` has to include all threads waiting to be notified in `Object.wait()`.
> The implementation also includes the threads waiting to re-enter the monitor in `Object.wait()` which is wrong.
> This update makes it right.
>
> The implementation of the JDWP command `ObjectReference.MonitorInfo (5)` is based on the JVM TI `GetObjectMonitorUsage()`. This update has a tweak to keep the existing behavior of this command.
>
> The following JVMTI vmTestbase tests are fixed to adapt to the correct `GetObjectMonitorUsage()` behavior:
>
> jvmti/GetObjectMonitorUsage/objmonusage001
> jvmti/GetObjectMonitorUsage/objmonusage003
>
> The following JVMTI JCK tests have to be fixed to adapt to the correct behavior:
>
> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00101/gomu00101.html
> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00101/gomu00101a.html
> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00102/gomu00102.html
> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00102/gomu00102a.html
>
> A JCK bug will be filed and the tests have to be added into the JCK problem list located in the closed repository.
> > Also, please see and review the related CSR: > [8324677](https://bugs.openjdk.org/browse/JDK-8324677): incorrect implementation of JVM TI GetObjectMonitorUsage > > The Release-Note is: > [8325314](https://bugs.openjdk.org/browse/JDK-8325314): Release Note: incorrect implementation of JVM TI GetObjectMonitorUsage > > Testing: > - tested with mach5 tiers 1-6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: addressed minor comments, updated a couple of copyright headers ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17680/files - new: https://git.openjdk.org/jdk/pull/17680/files/94ebf725..94f30f16 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17680&range=24 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17680&range=23-24 Stats: 13 lines in 6 files changed: 2 ins; 1 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/17680.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17680/head:pull/17680 PR: https://git.openjdk.org/jdk/pull/17680 From sspitsyn at openjdk.org Tue Mar 12 03:40:19 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 12 Mar 2024 03:40:19 GMT Subject: RFR: 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage [v25] In-Reply-To: References: Message-ID: On Tue, 12 Mar 2024 02:31:43 GMT, Serguei Spitsyn wrote: >> The implementation of the JVM TI `GetObjectMonitorUsage` does not match the spec. 
>> The function returns the following structure: >> >> >> typedef struct { >> jthread owner; >> jint entry_count; >> jint waiter_count; >> jthread* waiters; >> jint notify_waiter_count; >> jthread* notify_waiters; >> } jvmtiMonitorUsage; >> >> >> The following four fields are defined this way: >> >> waiter_count [jint] The number of threads waiting to own this monitor >> waiters [jthread*] The waiter_count waiting threads >> notify_waiter_count [jint] The number of threads waiting to be notified by this monitor >> notify_waiters [jthread*] The notify_waiter_count threads waiting to be notified >> >> The `waiters` has to include all threads waiting to enter the monitor or to re-enter it in `Object.wait()`. >> The implementation also includes the threads waiting to be notified in `Object.wait()` which is wrong. >> The `notify_waiters` has to include all threads waiting to be notified in `Object.wait()`. >> The implementation also includes the threads waiting to re-enter the monitor in `Object.wait()` which is wrong. >> This update makes it right. >> >> The implementation of the JDWP command `ObjectReference.MonitorInfo (5)` is based on the JVM TI `GetObjectMonitorInfo()`. This update has a tweak to keep the existing behavior of this command. >> >> The follwoing JVMTI vmTestbase tests are fixed to adopt to the `GetObjectMonitorUsage()` correct behavior: >> >> jvmti/GetObjectMonitorUsage/objmonusage001 >> jvmti/GetObjectMonitorUsage/objmonusage003 >> >> >> The following JVMTI JCK tests have to be fixed to adopt to correct behavior: >> >> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00101/gomu00101.html >> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00101/gomu00101a.html >> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00102/gomu00102.html >> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00102/gomu00102a.html >> >> >> >> A JCK bug will be filed and the tests have to be added into the JCK problem list located in the closed repository. 
>> >> Also, please see and review the related CSR: >> [8324677](https://bugs.openjdk.org/browse/JDK-8324677): incorrect implementation of JVM TI GetObjectMonitorUsage >> >> The Release-Note is: >> [8325314](https://bugs.openjdk.org/browse/JDK-8325314): Release Note: incorrect implementation of JVM TI GetObjectMonitorUsage >> >> Testing: >> - tested with mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: addressed minor comments, updated a couple of copyright headers Thank you for the review, Dan! I believe all your comments have been addressed now. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17680#issuecomment-1990070280 From iklam at openjdk.org Tue Mar 12 05:23:18 2024 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 12 Mar 2024 05:23:18 GMT Subject: RFR: 8251330: Reorder CDS archived heap to speed up relocation [v6] In-Reply-To: References: Message-ID: On Mon, 11 Mar 2024 20:40:24 GMT, Matias Saavedra Silva wrote: >> We should reorder the archived heap to segregate the objects that don't need marking. This will save space in the archive and improve start-up time. >> >> This patch reorders the archived heap to segregate the objects that don't need marking. The leading zeros present in the bitmaps thanks to the reordering can be easily removed, and thus the size of the archive is reduced. >> >> Given that the bitmaps are word aligned, some leading zeroes will remain. Verified with tier 1-5 tests.
>> >> Metrics, courtesy of @iklam : >> ```calculate_oopmap: objects = 15262 (507904 bytes, 332752 bytes don't need marking), embedded oops = 8408, nulls = 54 >> Oopmap = 15872 bytes >> >> calculate_oopmap: objects = 4590 (335872 bytes, 178120 bytes don't need marking), embedded oops = 46487, nulls = 29019 >> Oopmap = 10496 bytes >> >> (332752 + 178120) / (507904 + 335872.0) = 0.6054592688106796 >> More than 60% of the space used by the archived heap doesn't need to be marked by the oopmap. >> >> $ java -Xshare:dump -Xlog:cds+map | grep lead >> [3.777s][info][cds,map] - heap_oopmap_leading_zeros: 143286 >> [3.777s][info][cds,map] - heap_ptrmap_leading_zeros: 50713 >> >> So we can reduce the "bm" region by (143286 + 50713) / 8 = 24249 bytes. >> >> Current output: >> $ java -XX:+UseNewCode -Xshare:dump -Xlog:cds+map | grep lead >> [5.339s][info][cds,map] - heap_oopmap_leading_zeros: 26 >> [5.339s][info][cds,map] - heap_ptrmap_leading_zeros: 8 > > Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: > > Removed metaspaceshared changes LGTM. Just some small nits. src/hotspot/share/cds/filemap.cpp line 1584: > 1582: > 1583: assert(new_zeros == 0, "Should have removed leading zeros"); > 1584: assert(map->size_in_bytes() <= old_size, "Map size should have decreased"); The comment doesn't match the code, since you are comparing with `<=`. I think it's safe to change to `<` as we should have removed more than 64 leading bits, due to the abundance of primitive arrays. src/hotspot/share/cds/heapShared.cpp line 1159: > 1157: WalkOopAndArchiveClosure* WalkOopAndArchiveClosure::_current = nullptr; > 1158: > 1159: class PointsToOopsChecker : public BasicOopIterateClosure { This class needs a short comment. How about: // Checks if an oop has any non-null oop fields ------------- Marked as reviewed by iklam (Reviewer). 
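For readers who want to recheck the savings arithmetic quoted in the metrics above, here is a minimal Java sketch. The class and method names are hypothetical and are not part of the patch; only the input numbers are taken from the quoted `-Xlog:cds+map` output.

```java
// Hypothetical helper, not part of the HotSpot patch: it only re-derives the
// figures quoted from the -Xlog:cds+map output in this review thread.
final class OopmapSavings {
    // Fraction of archived-heap bytes that need no oopmap marking.
    static double unmarkedFraction(long unmarkedBytes, long totalBytes) {
        return (double) unmarkedBytes / totalBytes;
    }

    // Leading zero bits of both bitmaps, converted to whole bytes
    // (integer division drops any partial byte).
    static long removableBytes(long oopmapLeadingZeroBits, long ptrmapLeadingZeroBits) {
        return (oopmapLeadingZeroBits + ptrmapLeadingZeroBits) / 8;
    }

    public static void main(String[] args) {
        long unmarked = 332752L + 178120L; // bytes that don't need marking (both heap regions)
        long total = 507904L + 335872L;    // total archived heap bytes (both heap regions)
        System.out.println("unmarked fraction: " + unmarkedFraction(unmarked, total));
        System.out.println("removable bytes: " + removableBytes(143286L, 50713L));
    }
}
```

This only reproduces the estimate from the thread (a bit over 60% of the heap unmarked, 24249 bytes removable from the "bm" region). It does not model the word alignment that leaves a few leading zero bits in place after the patch.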
PR Review: https://git.openjdk.org/jdk/pull/17350#pullrequestreview-1930118444 PR Review Comment: https://git.openjdk.org/jdk/pull/17350#discussion_r1520863009 PR Review Comment: https://git.openjdk.org/jdk/pull/17350#discussion_r1520866317 From dholmes at openjdk.org Tue Mar 12 05:23:21 2024 From: dholmes at openjdk.org (David Holmes) Date: Tue, 12 Mar 2024 05:23:21 GMT Subject: RFR: 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage [v25] In-Reply-To: References: Message-ID: On Tue, 12 Mar 2024 02:31:43 GMT, Serguei Spitsyn wrote: >> The implementation of the JVM TI `GetObjectMonitorUsage` does not match the spec. >> The function returns the following structure: >> >> >> typedef struct { >> jthread owner; >> jint entry_count; >> jint waiter_count; >> jthread* waiters; >> jint notify_waiter_count; >> jthread* notify_waiters; >> } jvmtiMonitorUsage; >> >> >> The following four fields are defined this way: >> >> waiter_count [jint] The number of threads waiting to own this monitor >> waiters [jthread*] The waiter_count waiting threads >> notify_waiter_count [jint] The number of threads waiting to be notified by this monitor >> notify_waiters [jthread*] The notify_waiter_count threads waiting to be notified >> >> The `waiters` has to include all threads waiting to enter the monitor or to re-enter it in `Object.wait()`. >> The implementation also includes the threads waiting to be notified in `Object.wait()` which is wrong. >> The `notify_waiters` has to include all threads waiting to be notified in `Object.wait()`. >> The implementation also includes the threads waiting to re-enter the monitor in `Object.wait()` which is wrong. >> This update makes it right. >> >> The implementation of the JDWP command `ObjectReference.MonitorInfo (5)` is based on the JVM TI `GetObjectMonitorInfo()`. This update has a tweak to keep the existing behavior of this command. 
>> >> The follwoing JVMTI vmTestbase tests are fixed to adopt to the `GetObjectMonitorUsage()` correct behavior: >> >> jvmti/GetObjectMonitorUsage/objmonusage001 >> jvmti/GetObjectMonitorUsage/objmonusage003 >> >> >> The following JVMTI JCK tests have to be fixed to adopt to correct behavior: >> >> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00101/gomu00101.html >> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00101/gomu00101a.html >> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00102/gomu00102.html >> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00102/gomu00102a.html >> >> >> >> A JCK bug will be filed and the tests have to be added into the JCK problem list located in the closed repository. >> >> Also, please see and review the related CSR: >> [8324677](https://bugs.openjdk.org/browse/JDK-8324677): incorrect implementation of JVM TI GetObjectMonitorUsage >> >> The Release-Note is: >> [8325314](https://bugs.openjdk.org/browse/JDK-8325314): Release Note: incorrect implementation of JVM TI GetObjectMonitorUsage >> >> Testing: >> - tested with mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: addressed minor comments, updated a couple of copyright headers Changes requested by dholmes (Reviewer). test/hotspot/jtreg/serviceability/jvmti/ObjectMonitorUsage/ObjectMonitorUsage.java line 352: > 350: wokeupCount++; > 351: if (!notifiedAll && notifiedCount < wokeupCount) { > 352: wasSpuriousWakeup = true; This doesn't work. First as you've realized you cannot detect a spurious wakeup when notifyAll is used. But second the `notifiedCount` is incremented on each `notify` whilst the monitor lock is still held - as it must be. So no `wait` can return until the monitor is unlocked. 
So unless the spurious wakeup occurs before the monitor is acquired at line 254, all notifications must be completed before even a spurious wakeup can cause `wait()` to return - at which point `notifiedCount` must equal `NUMBER_OF_WAITING_THREADS` which can never be < `wokeupCount`. Given you can only detect one very specific case of a spurious wakeup, all this extra logic is just adding to the complexity and giving a false impression about what is being checked. To detect actual spurious wakeups here you need to be able to track when each thread was chosen by a `notify()` and there is no means to do that. ------------- PR Review: https://git.openjdk.org/jdk/pull/17680#pullrequestreview-1930117927 PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1520862516 From jpai at openjdk.org Tue Mar 12 06:42:13 2024 From: jpai at openjdk.org (Jaikiran Pai) Date: Tue, 12 Mar 2024 06:42:13 GMT Subject: RFR: 8327729: Remove deprecated xxxObject methods from jdk.internal.misc.Unsafe [v3] In-Reply-To: <4Z6CqKTSgNaTYilIvNjnGudJOs8qebTs4OhP5q7xShE=.e71ecd8c-5563-4daf-bf9c-56850d69d7ec@github.com> References: <_R4_060_hbsKkeKvbuME6SLIfltbo5lmrQb0dHTpY1Q=.06c07eaa-5135-4dad-b4e1-65ef7a0000bc@github.com> <4Z6CqKTSgNaTYilIvNjnGudJOs8qebTs4OhP5q7xShE=.e71ecd8c-5563-4daf-bf9c-56850d69d7ec@github.com> Message-ID: On Sun, 10 Mar 2024 13:47:06 GMT, Eirik Bjørsnøs wrote: >> Please review this PR which removes the 19 deprecated `xxObject*` alias methods from `jdk.internal.misc.Unsafe`. >> >> These methods were added in JDK-8213043 (JDK 12), presumably to allow `jsr166.jar` to be used across JDK versions. This was a follow-up fix after JDK-8207146 had renamed these methods to `xxReference*`. >> >> Since OpenJDK is now the single source of truth for `java.util.concurrent`, time has come to remove these deprecated alias methods.
>> >> This change was initially discussed here: https://mail.openjdk.org/pipermail/core-libs-dev/2024-March/119993.html >> >> Testing: This is a pure deletion of deprecated methods, so the PR includes no test changes and the `noreg-cleanup` label is added in the JBS. I have verified that all `test/jdk/java/util/concurrent/*` tests pass. >> >> Tagging @DougLea and @Martin-Buchholz to verify that this removal is timely. > > Eirik Bjørsnøs has updated the pull request incrementally with two additional commits since the last revision: > > - Replace calls to Unsafe.putObject with Unsafe.putReference > - Update a code comment referencing Unsafe.compareAndSetObject to reference Unsafe.compareAndSetReference instead Hello Eirik, Mandy, tier1, tier2, tier3 testing of the latest approved state of this PR passed without failures. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18176#issuecomment-1990867492 From maxim.kartashev at jetbrains.com Tue Mar 12 07:12:00 2024 From: maxim.kartashev at jetbrains.com (Maxim Kartashev) Date: Tue, 12 Mar 2024 11:12:00 +0400 Subject: SafeFetch leads to hard crash on macOS 14.4 Message-ID: Hello! This has been recently filed as https://bugs.openjdk.org/browse/JDK-8327860 but I also wanted to check with the community if any further info on the issue is available. It looks like in some cases SafeFetch is directed to fetch something the OS really doesn't want it to and, instead of promoting the error to a signal and letting the application (JVM) deal with it, immediately terminates the application. All we've got is the OS crash report, not even the JVM's fatal error log. This looks like an application security precaution, which I am not at all familiar with. The relevant pieces of the crash log are below. Is anybody familiar with "Namespace GUARD" termination reason and maybe other related novelties of macOS 14.4? The error was not reported before upgrading to 14.4 Thanks in advance, Maxim.
0 libjvm.dylib 0x1062d6ec0 _SafeFetchN_fault + 0
1 libjvm.dylib 0x1062331a4 ObjectMonitor::TrySpin(JavaThread*) + 408
2 libjvm.dylib 0x106232b44 ObjectMonitor::enter(JavaThread*) + 228
3 libjvm.dylib 0x10637436c ObjectSynchronizer::enter(Handle, BasicLock*, JavaThread*) + 392
...

Exception Type:        EXC_BAD_ACCESS (SIGKILL)
Exception Codes:       KERN_PROTECTION_FAILURE at 0x00000001004f4000
Exception Codes:       0x0000000000000002, 0x00000001004f4000

Termination Reason:    Namespace GUARD, Code 5

VM Region Info: 0x1004f4000 is in 0x1004f4000-0x1004f8000;  bytes after start: 0  bytes before end: 16383
      REGION TYPE                    START - END         [ VSIZE] PRT/MAX SHRMOD  REGION DETAIL
      mapped file                 1004e4000-1004f4000    [   64K] r--/r-- SM=COW  Object_id=fa8d88e7
--->  VM_ALLOCATE                 1004f4000-1004f8000    [   16K] ---/rwx SM=NUL
      VM_ALLOCATE                 1004f8000-1004fc000    [   16K] r--/rwx SM=PRV

From stefan.karlsson at oracle.com Tue Mar 12 08:34:34 2024
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Tue, 12 Mar 2024 09:34:34 +0100
Subject: SafeFetch leads to hard crash on macOS 14.4
In-Reply-To: References: Message-ID: <8e8a8845-a81e-49c1-8b52-0e4b0cd6a0c0@oracle.com>

Hi Maxim,

On 2024-03-12 08:12, Maxim Kartashev wrote:
> Hello!
>
> This has been recently filed as
> https://bugs.openjdk.org/browse/JDK-8327860 but I also wanted to check
> with the community if any further info on the issue is available.
>
> It looks like in some cases SafeFetch is directed to fetch something
> the OS really doesn't want it to and, instead of promoting the error
> to a signal and letting the application (JVM) deal with it,
> immediately terminates the application. All we've got is the OS crash
> report, not even the JVM's fatal error log. This looks like an
> application security precaution, which I am not at all familiar with.
>
> The relevant pieces of the crash log are below.
Is anybody familiar
> with "Namespace GUARD" termination reason and maybe other related
> novelties of macOS 14.4? The error was not reported before upgrading
> to 14.4

I don't have an answer for this, but a note below:

>
> Thanks in advance,
>
> Maxim.
>
> 0 libjvm.dylib 0x1062d6ec0 _SafeFetchN_fault + 0
> 1 libjvm.dylib 0x1062331a4 ObjectMonitor::TrySpin(JavaThread*) + 408
> 2 libjvm.dylib 0x106232b44 ObjectMonitor::enter(JavaThread*) + 228
> 3 libjvm.dylib 0x10637436c ObjectSynchronizer::enter(Handle,
> BasicLock*, JavaThread*) + 392

FYI: I think that this specific call to SafeFetch was recently removed by:
https://github.com/openjdk/jdk/commit/29397d29baac3b29083b1b5d6b2cb06e456af0c3
https://bugs.openjdk.org/browse/JDK-8320317

Cheers,
StefanK

> ...
>
> Exception Type:        EXC_BAD_ACCESS (SIGKILL)
> Exception Codes:       KERN_PROTECTION_FAILURE at 0x00000001004f4000
> Exception Codes:       0x0000000000000002, 0x00000001004f4000
>
> Termination Reason:    Namespace GUARD, Code 5
>
> VM Region Info: 0x1004f4000 is in 0x1004f4000-0x1004f8000;  bytes
> after start: 0  bytes before end: 16383
>       REGION TYPE                    START - END         [ VSIZE]
> PRT/MAX SHRMOD  REGION DETAIL
>       mapped file                 1004e4000-1004f4000    [   64K]
> r--/r-- SM=COW  Object_id=fa8d88e7
> --->  VM_ALLOCATE                 1004f4000-1004f8000    [   16K]
> ---/rwx SM=NUL
>       VM_ALLOCATE                 1004f8000-1004fc000    [   16K]
> r--/rwx SM=PRV

From sspitsyn at openjdk.org Tue Mar 12 08:58:25 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 12 Mar 2024 08:58:25 GMT Subject: Integrated: 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage In-Reply-To: References: Message-ID: On Fri, 2 Feb 2024 00:44:00 GMT, Serguei Spitsyn wrote: > The implementation of the JVM TI `GetObjectMonitorUsage` does not match the spec.
> The function returns the following structure: > > > typedef struct { > jthread owner; > jint entry_count; > jint waiter_count; > jthread* waiters; > jint notify_waiter_count; > jthread* notify_waiters; > } jvmtiMonitorUsage; > > > The following four fields are defined this way: > > waiter_count [jint] The number of threads waiting to own this monitor > waiters [jthread*] The waiter_count waiting threads > notify_waiter_count [jint] The number of threads waiting to be notified by this monitor > notify_waiters [jthread*] The notify_waiter_count threads waiting to be notified > > The `waiters` has to include all threads waiting to enter the monitor or to re-enter it in `Object.wait()`. > The implementation also includes the threads waiting to be notified in `Object.wait()` which is wrong. > The `notify_waiters` has to include all threads waiting to be notified in `Object.wait()`. > The implementation also includes the threads waiting to re-enter the monitor in `Object.wait()` which is wrong. > This update makes it right. > > The implementation of the JDWP command `ObjectReference.MonitorInfo (5)` is based on the JVM TI `GetObjectMonitorInfo()`. This update has a tweak to keep the existing behavior of this command. > > The follwoing JVMTI vmTestbase tests are fixed to adopt to the `GetObjectMonitorUsage()` correct behavior: > > jvmti/GetObjectMonitorUsage/objmonusage001 > jvmti/GetObjectMonitorUsage/objmonusage003 > > > The following JVMTI JCK tests have to be fixed to adopt to correct behavior: > > vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00101/gomu00101.html > vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00101/gomu00101a.html > vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00102/gomu00102.html > vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00102/gomu00102a.html > > > > A JCK bug will be filed and the tests have to be added into the JCK problem list located in the closed repository. 
> > Also, please see and review the related CSR: > [8324677](https://bugs.openjdk.org/browse/JDK-8324677): incorrect implementation of JVM TI GetObjectMonitorUsage > > The Release-Note is: > [8325314](https://bugs.openjdk.org/browse/JDK-8325314): Release Note: incorrect implementation of JVM TI GetObjectMonitorUsage > > Testing: > - tested with mach5 tiers 1-6 This pull request has now been integrated. Changeset: b92440f9 Author: Serguei Spitsyn URL: https://git.openjdk.org/jdk/commit/b92440f9b1f41643bf9649ca192e405a9d6c026a Stats: 1180 lines in 15 files changed: 741 ins; 390 del; 49 mod 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage Reviewed-by: dcubed, lmesnik ------------- PR: https://git.openjdk.org/jdk/pull/17680 From sspitsyn at openjdk.org Tue Mar 12 08:54:45 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 12 Mar 2024 08:54:45 GMT Subject: RFR: 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage [v26] In-Reply-To: References: Message-ID: > The implementation of the JVM TI `GetObjectMonitorUsage` does not match the spec. > The function returns the following structure: > > > typedef struct { > jthread owner; > jint entry_count; > jint waiter_count; > jthread* waiters; > jint notify_waiter_count; > jthread* notify_waiters; > } jvmtiMonitorUsage; > > > The following four fields are defined this way: > > waiter_count [jint] The number of threads waiting to own this monitor > waiters [jthread*] The waiter_count waiting threads > notify_waiter_count [jint] The number of threads waiting to be notified by this monitor > notify_waiters [jthread*] The notify_waiter_count threads waiting to be notified > > The `waiters` has to include all threads waiting to enter the monitor or to re-enter it in `Object.wait()`. > The implementation also includes the threads waiting to be notified in `Object.wait()` which is wrong. > The `notify_waiters` has to include all threads waiting to be notified in `Object.wait()`. 
> The implementation also includes the threads waiting to re-enter the monitor in `Object.wait()` which is wrong. > This update makes it right. > > The implementation of the JDWP command `ObjectReference.MonitorInfo (5)` is based on the JVM TI `GetObjectMonitorInfo()`. This update has a tweak to keep the existing behavior of this command. > > The follwoing JVMTI vmTestbase tests are fixed to adopt to the `GetObjectMonitorUsage()` correct behavior: > > jvmti/GetObjectMonitorUsage/objmonusage001 > jvmti/GetObjectMonitorUsage/objmonusage003 > > > The following JVMTI JCK tests have to be fixed to adopt to correct behavior: > > vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00101/gomu00101.html > vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00101/gomu00101a.html > vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00102/gomu00102.html > vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00102/gomu00102a.html > > > > A JCK bug will be filed and the tests have to be added into the JCK problem list located in the closed repository. 
> > Also, please see and review the related CSR: > [8324677](https://bugs.openjdk.org/browse/JDK-8324677): incorrect implementation of JVM TI GetObjectMonitorUsage > > The Release-Note is: > [8325314](https://bugs.openjdk.org/browse/JDK-8325314): Release Note: incorrect implementation of JVM TI GetObjectMonitorUsage > > Testing: > - tested with mach5 tiers 1-6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: removed incorrect spurious wakeup detection ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17680/files - new: https://git.openjdk.org/jdk/pull/17680/files/94f30f16..effd0c17 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17680&range=25 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17680&range=24-25 Stats: 25 lines in 1 file changed: 1 ins; 24 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/17680.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17680/head:pull/17680 PR: https://git.openjdk.org/jdk/pull/17680 From sspitsyn at openjdk.org Tue Mar 12 08:54:45 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 12 Mar 2024 08:54:45 GMT Subject: RFR: 8247972: incorrect implementation of JVM TI GetObjectMonitorUsage [v25] In-Reply-To: References: Message-ID: On Tue, 12 Mar 2024 05:13:54 GMT, David Holmes wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> review: addressed minor comments, updated a couple of copyright headers > > test/hotspot/jtreg/serviceability/jvmti/ObjectMonitorUsage/ObjectMonitorUsage.java line 352: > >> 350: wokeupCount++; >> 351: if (!notifiedAll && notifiedCount < wokeupCount) { >> 352: wasSpuriousWakeup = true; > > This doesn't work. First as you've realized you cannot detect a spurious wakeup when notifyAll is used. But second the `notifiedCount` is incremented on each `notify` whilst the monitor lock is still held - as it must be. 
So no `wait` can return until the monitor is unlocked. So unless the spurious wakeup occurs before the monitor is acquired at line 254, all notifications must be completed before even a spurious wakeup can cause `wait()` to return - at which point `notifiedCount` must equal `NUMBER_OF_WAITING_THREADS` which can never be < `wokeupCount`. > Given you can only detect one very specific case of a spurious wakeup, all this extra logic is just adding to the complexity and giving a false impression about what is being checked. To detect actual spurious wakeups here you need to be able to track when each thread was chosen by a `notify()` and there is no means to do that. You are right, thanks! Removed the incorrect spurious wakeup detection code. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1521073727 From syan at openjdk.org Tue Mar 12 09:11:22 2024 From: syan at openjdk.org (SendaoYan) Date: Tue, 12 Mar 2024 09:11:22 GMT Subject: RFR: 8327946: containers/docker/TestJFREvents.java fails when host kernel config vm.swappiness=0 after 8325139 Message-ID: <5Q0X-rxAg9WKCnK-Qluu5hvyffsGwVgGJGRoA8XlBGs=.923c1bf8-e008-4af9-9929-6e5c1f2d5271@github.com> According to the [Docker documentation](https://docs.docker.com/config/containers/resource_constraints/#--memory-swappiness-details), the default value of `--memory-swappiness` is inherited from the host machine. So when the kernel config `vm.swappiness=0` is set on the host machine, this test case fails: the default value of `--memory-swappiness` is then 0, so the Docker container cannot use swap memory.
------------- Commit messages: - 8327946: containers/docker/TestJFREvents.java fails when host kernel config vm.swappiness=0 after 8325139 Changes: https://git.openjdk.org/jdk/pull/18225/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18225&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8327946 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18225.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18225/head:pull/18225 PR: https://git.openjdk.org/jdk/pull/18225 From stefan.karlsson at oracle.com Tue Mar 12 10:19:13 2024 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 12 Mar 2024 11:19:13 +0100 Subject: SafeFetch leads to hard crash on macOS 14.4 In-Reply-To: <8e8a8845-a81e-49c1-8b52-0e4b0cd6a0c0@oracle.com> References: <8e8a8845-a81e-49c1-8b52-0e4b0cd6a0c0@oracle.com> Message-ID: <6252751b-a3aa-4910-8914-1d554bac6f79@oracle.com> Hi again, I just want to clarify that the change above just removes *one* usage of SafeFetch. My guess is that all other usages of SafeFetch is still affected by the issue you are seeing. It would be interesting to see if you could verify this problem by applying this patch: diff --git a/src/hotspot/share/memory/universe.cpp b/src/hotspot/share/memory/universe.cpp index 2a3d532725f..b58c52e415b 100644 --- a/src/hotspot/share/memory/universe.cpp +++ b/src/hotspot/share/memory/universe.cpp @@ -842,7 +842,14 @@ jint universe_init() { ?? return JNI_OK; ?} +#include "runtime/safefetch.hpp" + ?jint Universe::initialize_heap() { +? char* mem = os::reserve_memory(16 * K, false, mtGC); +? intptr_t value = SafeFetchN((intptr_t*)mem, 1); + +? log_info(gc)("Reserved memory read: " PTR_FORMAT " " PTR_FORMAT, p2i(mem), value); + ?? assert(_collectedHeap == nullptr, "Heap already created"); ?? 
_collectedHeap = GCConfig::arguments()->create_heap(); and running: build//jdk/bin/java -version This works on 14.3.1, but it would be interesting to see what happens on 14.4, and if that shuts down the VM before it is printing the version string. StefanK On 2024-03-12 09:34, Stefan Karlsson wrote: > Hi Maxim, > > On 2024-03-12 08:12, Maxim Kartashev wrote: >> Hello! >> >> This has been recently filed as >> https://bugs.openjdk.org/browse/JDK-8327860 but I also wanted to >> check with the community if any further info on the issue is available. >> >> It looks like in some cases SafeFetch is directed to fetch something >> the OS really doesn't want it to and, instead of promoting the error >> to a signal and letting the application (JVM) deal with it, >> immediately terminates the application. All we've got is the OS crash >> report, not even the JVM's fatal error log. This looks like an >> application security precaution, which I am not at all familiar with. >> >> The relevant pieces of the crash log are below. Is anybody familiar >> with "Namespace GUARD" termination reason and maybe other related >> novelties of macOS 14.4? The error was not reported before upgrading >> to 14.4 > > I don't have an answer for this, but a note below: > >> >> Thanks in advance, >> >> Maxim. >> >> 0 libjvm.dylib 0x1062d6ec0 _SafeFetchN_fault + 0 >> 1 libjvm.dylib 0x1062331a4 ObjectMonitor::TrySpin(JavaThread*) + 408 >> 2 libjvm.dylib 0x106232b44 ObjectMonitor::enter(JavaThread*) + 228 >> 3 libjvm.dylib 0x10637436c ObjectSynchronizer::enter(Handle, >> BasicLock*, JavaThread*) + 392 > > FYI: I think that this specific call to SafeFetch was recently removed > by: > https://github.com/openjdk/jdk/commit/29397d29baac3b29083b1b5d6b2cb06e456af0c3 > > https://bugs.openjdk.org/browse/JDK-8320317 > > Cheers, > StefanK > > >> ... >> >> Exception Type: EXC_BAD_ACCESS (SIGKILL) >> Exception Codes: KERN_PROTECTION_FAILURE at 0x00000001004f4000 >> Exception Codes:
0x0000000000000002, 0x00000001004f4000 >> >> Termination Reason: Namespace GUARD, Code 5 >> >> VM Region Info: 0x1004f4000 is in 0x1004f4000-0x1004f8000; bytes >> after start: 0 bytes before end: 16383 >> REGION TYPE START - END [ VSIZE] >> PRT/MAX SHRMOD REGION DETAIL >> mapped file 1004e4000-1004f4000 [ 64K] >> r--/r-- SM=COW Object_id=fa8d88e7 >> ---> VM_ALLOCATE 1004f4000-1004f8000 [ 16K] >> ---/rwx SM=NUL >> VM_ALLOCATE 1004f8000-1004fc000 [ 16K] >> r--/rwx SM=PRV > From maxim.kartashev at jetbrains.com Tue Mar 12 11:38:55 2024 From: maxim.kartashev at jetbrains.com (Maxim Kartashev) Date: Tue, 12 Mar 2024 15:38:55 +0400 Subject: SafeFetch leads to hard crash on macOS 14.4 In-Reply-To: <6252751b-a3aa-4910-8914-1d554bac6f79@oracle.com> References: <8e8a8845-a81e-49c1-8b52-0e4b0cd6a0c0@oracle.com> <6252751b-a3aa-4910-8914-1d554bac6f79@oracle.com> Message-ID: > I just want to clarify that the change above just removes *one* usage of > SafeFetch. My guess is that all other usages of SafeFetch is still > affected by the issue you are seeing. For whatever reason, it only crashed in ObjectMonitor::TrySpin(). Perhaps it's just the first one that is reached. Anyway, I'll check the patch and get back to you. On Tue, Mar 12, 2024 at 2:18 PM Stefan Karlsson wrote: > Hi again, > > I just want to clarify that the change above just removes *one* usage of > SafeFetch. My guess is that all other usages of SafeFetch is still > affected by the issue you are seeing.
> > It would be interesting to see if you could verify this problem by > applying this patch: > > diff --git a/src/hotspot/share/memory/universe.cpp > b/src/hotspot/share/memory/universe.cpp > index 2a3d532725f..b58c52e415b 100644 > --- a/src/hotspot/share/memory/universe.cpp > +++ b/src/hotspot/share/memory/universe.cpp > @@ -842,7 +842,14 @@ jint universe_init() { > return JNI_OK; > } > > +#include "runtime/safefetch.hpp" > + > jint Universe::initialize_heap() { > + char* mem = os::reserve_memory(16 * K, false, mtGC); > + intptr_t value = SafeFetchN((intptr_t*)mem, 1); > + > + log_info(gc)("Reserved memory read: " PTR_FORMAT " " PTR_FORMAT, > p2i(mem), value); > + > assert(_collectedHeap == nullptr, "Heap already created"); > _collectedHeap = GCConfig::arguments()->create_heap(); > > > and running: > build//jdk/bin/java -version > > This works on 14.3.1, but it would be interesting to see what happens on > 14.4, and if that shuts down the VM before it is printing the version > string. > > StefanK > > On 2024-03-12 09:34, Stefan Karlsson wrote: > > Hi Maxim, > > > > On 2024-03-12 08:12, Maxim Kartashev wrote: > >> Hello! > >> > >> This has been recently filed as > >> https://bugs.openjdk.org/browse/JDK-8327860 but I also wanted to > >> check with the community if any further info on the issue is available. > >> > >> It looks like in some cases SafeFetch is directed to fetch something > >> the OS really doesn't want it to and, instead of promoting the error > >> to a signal and letting the application (JVM) deal with it, > >> immediately terminates the application. All we've got is the OS crash > >> report, not even the JVM's fatal error log. This looks like an > >> application security precaution, which I am not at all familiar with. > >> > >> The relevant pieces of the crash log are below. Is anybody familiar > >> with "Namespace GUARD" termination reason and maybe other related > >> novelties of macOS 14.4? 
The error was not reported before upgrading > >> to 14.4 > > > > I don't have an answer for this, but a note below: > > > >> > >> Thanks in advance, > >> > >> Maxim. > >> > >> 0 libjvm.dylib 0x1062d6ec0 _SafeFetchN_fault + 0 > >> 1 libjvm.dylib 0x1062331a4 ObjectMonitor::TrySpin(JavaThread*) + 408 > >> 2 libjvm.dylib 0x106232b44 ObjectMonitor::enter(JavaThread*) + 228 > >> 3 libjvm.dylib 0x10637436c ObjectSynchronizer::enter(Handle, > >> BasicLock*, JavaThread*) + 392 > > > > FYI: I think that this specific call to SafeFetch was recently removed > > by: > > > https://github.com/openjdk/jdk/commit/29397d29baac3b29083b1b5d6b2cb06e456af0c3 > > > > https://bugs.openjdk.org/browse/JDK-8320317 > > > > Cheers, > > StefanK > > > > > >> ... > >> > >> Exception Type: EXC_BAD_ACCESS (SIGKILL) > >> Exception Codes: KERN_PROTECTION_FAILURE at 0x00000001004f4000 > >> Exception Codes: 0x0000000000000002, 0x00000001004f4000 > >> > >> Termination Reason: Namespace GUARD, Code 5 > >> > >> VM Region Info: 0x1004f4000 is in 0x1004f4000-0x1004f8000; bytes > >> after start: 0 bytes before end: 16383 > >> REGION TYPE START - END [ VSIZE] > >> PRT/MAX SHRMOD REGION DETAIL > >> mapped file 1004e4000-1004f4000 [ 64K] > >> r--/r-- SM=COW Object_id=fa8d88e7 > >> ---> VM_ALLOCATE 1004f4000-1004f8000 [ 16K] > >> ---/rwx SM=NUL > >> VM_ALLOCATE 1004f8000-1004fc000 [ 16K] > >> r--/rwx SM=PRV > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mbaesken at openjdk.org Tue Mar 12 13:25:14 2024 From: mbaesken at openjdk.org (Matthias Baesken) Date: Tue, 12 Mar 2024 13:25:14 GMT Subject: RFR: JDK-8324682 Remove redefinition of NULL for XLC compiler [v6] In-Reply-To: References: Message-ID: On Fri, 8 Mar 2024 09:07:23 GMT, Suchismith Roy wrote: >> Remove redefinition of NULL for XLC compiler >> In globalDefinitions_xlc.hpp there is a redefinition of NULL to work around an issue with one of the AIX headers (). 
>> Once all uses of NULL in HotSpot have been replaced with nullptr, there is no longer any need for this, and it can be removed. >> >> >> JBS ISSUE : [JDK-8324682](https://bugs.openjdk.org/browse/JDK-8324682) > > Suchismith Roy has updated the pull request incrementally with one additional commit since the last revision: > > update copyright Marked as reviewed by mbaesken (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18064#pullrequestreview-1931052163 From ddong at openjdk.org Tue Mar 12 13:35:13 2024 From: ddong at openjdk.org (Denghui Dong) Date: Tue, 12 Mar 2024 13:35:13 GMT Subject: RFR: 8326012: JFR: Event for time to safepoint [v9] In-Reply-To: References: <68hS0kQgtDIk4ioAJj_r0_GLT6h0lcif6Daj6WRwxlI=.40c2a6e7-70a8-4954-bcde-9318ee311028@github.com> Message-ID: On Tue, 27 Feb 2024 09:53:26 GMT, Erik Gahlin wrote: >> Any thoughts? > >> Any thoughts? > > I haven't look at the implementation, mostly interested in the design for now. Should thread-local handshake be included in this event, or is it better done separately, or not at all? Hi @egahlin, Do you have time to review this patch? ------------- PR Comment: https://git.openjdk.org/jdk/pull/17888#issuecomment-1991662378 From ihse at openjdk.org Tue Mar 12 13:54:16 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Tue, 12 Mar 2024 13:54:16 GMT Subject: RFR: 8327460: Compile tests with the same visibility rules as product code [v2] In-Reply-To: References: Message-ID: On Fri, 8 Mar 2024 18:12:40 GMT, Jorn Vernee wrote: >> Magnus Ihse Bursie has updated the pull request incrementally with one additional commit since the last revision: >> >> Update line number for dereference_null in TestDwarf > > test/jdk/java/foreign/CallGeneratorHelper.java line 216: > >> 214: if (header) { >> 215: System.out.println( >> 216: "#include \"export.h\"\n" > > We don't generate these header files any more, so the changes to this file are not really needed. 
I still wouldn't like to keep the bad hard-coded defines. Are you okay with me pushing these changes, and then you can remove the parts of the test that are not actually used anymore? If the code is not used it should not matter much to you either way. (I mean I could back out these changes, but then we'd have the bad code in place while waiting for you to remove it, putting pressure on you to actually remove it.) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18135#discussion_r1521503942 From ihse at openjdk.org Tue Mar 12 13:59:15 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Tue, 12 Mar 2024 13:59:15 GMT Subject: RFR: 8327460: Compile tests with the same visibility rules as product code In-Reply-To: <_YFd8mgBis8mgRwfpafttLS2y1_-Aiy5aIdLr2eCXv8=.4890fb5d-cc8a-4cd8-a53d-3f18aa1dfbe6@github.com> References: <_YFd8mgBis8mgRwfpafttLS2y1_-Aiy5aIdLr2eCXv8=.4890fb5d-cc8a-4cd8-a53d-3f18aa1dfbe6@github.com> Message-ID: On Mon, 11 Mar 2024 06:57:42 GMT, Alan Bateman wrote: > > test/jdk/java/lang/Thread/jni/AttachCurrentThread/libImplicitAttach.c was missing an export. This had not been discovered before since that file was not compiled on Windows. > It's a Linux/macOS specific test so it wasn't needed. Yeah. I understand. I just wanted to give an explanation for why this particular test needed changes that were not present elsewhere (since other exported methods all were tested on Windows). 
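The missing export mentioned above can be illustrated with a small C++ sketch. It assumes the library is built with hidden default symbol visibility (for example `-fvisibility=hidden` with gcc or clang), so that an entry point is only reachable from outside the library when it is explicitly marked. The macro name `EXAMPLE_EXPORT` and both functions are hypothetical; the `export.h` header used by the tests may be defined differently.

```cpp
// Hypothetical export macro, for illustration only.
#if defined(_WIN32)
  // On Windows, DLL symbols are not exported unless explicitly requested.
  #define EXAMPLE_EXPORT __declspec(dllexport)
#elif defined(__GNUC__)
  // gcc and clang: override a hidden default visibility for this symbol.
  #define EXAMPLE_EXPORT __attribute__((visibility("default")))
#else
  #define EXAMPLE_EXPORT
#endif

// Marked for export: can be looked up from outside the shared library.
extern "C" EXAMPLE_EXPORT int example_native_entry(int x) {
  return x + 1;
}

// Not marked: with hidden default visibility this stays internal to the
// library, which is the kind of omission a Windows build would have exposed.
extern "C" int example_internal_helper(int x) {
  return x * 2;
}
```

With default (non-hidden) visibility on Linux or macOS, both functions would be visible, which is consistent with the observation that the omission went unnoticed for a test that was never compiled on Windows.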
------------- PR Comment: https://git.openjdk.org/jdk/pull/18135#issuecomment-1991712352 From ihse at openjdk.org Tue Mar 12 13:59:15 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Tue, 12 Mar 2024 13:59:15 GMT Subject: RFR: 8327460: Compile tests with the same visibility rules as product code [v2] In-Reply-To: <-9V3vdX_4NWhrgoEYdqy9W-PyT8HBwjKyMwF-IzM9Uo=.c5f3e91a-017e-42c1-abc3-ab90c94f3b75@github.com> References: <-9V3vdX_4NWhrgoEYdqy9W-PyT8HBwjKyMwF-IzM9Uo=.c5f3e91a-017e-42c1-abc3-ab90c94f3b75@github.com> Message-ID: On Mon, 11 Mar 2024 06:57:58 GMT, Alan Bateman wrote: > Good cleanup. Thanks! ------------- PR Comment: https://git.openjdk.org/jdk/pull/18135#issuecomment-1991712859 From ihse at openjdk.org Tue Mar 12 13:59:16 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Tue, 12 Mar 2024 13:59:16 GMT Subject: RFR: 8327460: Compile tests with the same visibility rules as product code [v2] In-Reply-To: References: Message-ID: On Mon, 11 Mar 2024 02:31:02 GMT, David Holmes wrote: >> Magnus Ihse Bursie has updated the pull request incrementally with one additional commit since the last revision: >> >> Update line number for dereference_null in TestDwarf > > test/hotspot/jtreg/runtime/ErrorHandling/libTestDwarfHelper.h line 24: > >> 22: */ >> 23: >> 24: #include > > Seems unneeded. So what I did here was change: #include "jni.h" #include into #include #include "export.h" #include "jni.h" The reordering was not strictly needed, but putting user includes in front of system ones looked like bad coding to me, and put me in a difficult spot as to where to add the new `#include "export.h"` -- next to the current user include even if I thought that was wrong, or have the system includes "sandwiched" between two user includes. Or do you mean that it seems unneeded to include `jni.h` at all? Yes, I agree, but it was there before, and I don't want to make any other changes to the tests. This change is scary enough as it is :-).
If you want, I can file a follow-up to remove this include instead. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18135#discussion_r1521510510 From pchilanomate at openjdk.org Tue Mar 12 14:12:15 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 12 Mar 2024 14:12:15 GMT Subject: RFR: 8314508: Improve how relativized pointers are printed by frame::describe [v2] In-Reply-To: References: Message-ID: On Tue, 5 Mar 2024 13:07:02 GMT, Fredrik Bredberg wrote: >> The output from frame::describe has been improved so that it now prints the actual derelativized pointer value, regardless if it's on the stack or on the heap. It also clearly shows which members of the fixed stackframe are relativized. See sample output of a riscv64 stackframe below: >> >> >> [6.693s][trace][continuations] 0x000000409e14cc88: 0x00000000eeceaf58 locals for #10 >> [6.693s][trace][continuations] local 0 >> [6.693s][trace][continuations] oop for #10 >> [6.693s][trace][continuations] 0x000000409e14cc80: 0x00000000eec34548 local 1 >> [6.693s][trace][continuations] oop for #10 >> [6.693s][trace][continuations] 0x000000409e14cc78: 0x00000000eeceac58 local 2 >> [6.693s][trace][continuations] oop for #10 >> [6.693s][trace][continuations] 0x000000409e14cc70: 0x00000000eee03eb0 #10 method java.lang.invoke.LambdaForm$DMH/0x0000000050001000.invokeVirtual(Ljava/lang/Object;Ljava/lang/Object;)V @ 10 >> [6.693s][trace][continuations] - 3 locals 3 max stack >> [6.693s][trace][continuations] - codelet: return entry points >> [6.693s][trace][continuations] saved fp >> [6.693s][trace][continuations] 0x000000409e14cc68: 0x00000040137dc790 return address >> [6.693s][trace][continuations] 0x000000409e14cc60: 0x000000409e14ccf0 >> [6.693s][trace][continuations] 0x000000409e14cc58: 0x000000409e14cc60 interpreter_frame_sender_sp >> [6.693s][trace][continuations] 0x000000409e14cc50: 0x000000409e14cc00 interpreter_frame_last_sp (relativized: fp-14) >> 
[6.693s][trace][continuations] 0x000000409e14cc48: 0x0000004050401a88 interpreter_frame_method >> [6.693s][trace][continuations] 0x000000409e14cc40: 0x0000000000000000 interpreter_frame_mdp >> [6.693s][trace][continuations] 0x000000409e14cc38: 0x000000409e14cbe0 interpreter_frame_extended_sp (relativized: fp-18) >> [6.693s][trace][continuations] 0x000000409e14cc30: 0x00000000ef093110 interpreter_frame_mirror >> [6.693s][trace][continuations] oop for #10 >> [6.693s][trace][continuations] 0x000000409e14cc28: 0x0000004050401c68 interpreter_frame_cache >> [6.693s][trace][continuations] 0x000000409e14cc20: 0x000000409e14cc88 interpreter_frame_locals (relativized: fp+3) >> [6.693s][trace][continuatio... > > Fredrik Bredberg has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Merge branch 'master' into 8314508_relativized_ptr_frame_describe > - Small white space fix > - 8314508: Improve how relativized pointers are printed by frame::describe Not a fan of printing a value that differs from the real one, but I like that we still get it directly from the new description you added. I see that we cannot do this derelativization for the fp of a heap frame since to know if it was actually relativized we need the caller frame info to check if it's interpreted. ------------- Marked as reviewed by pchilanomate (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/18102#pullrequestreview-1931184705 From ihse at openjdk.org Tue Mar 12 15:16:30 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Tue, 12 Mar 2024 15:16:30 GMT Subject: RFR: 8327701: Remove the xlc toolchain [v3] In-Reply-To: References: Message-ID: <0TOkCbNH7MaB96StWPTbQZUzo2mT-9jzCSpNlqg-PJM=.4f227516-fe08-4e6b-a5db-63d39e29a4fe@github.com> > As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. Thus the old xlc toolchain is no longer supported, and should be removed. Magnus Ihse Bursie has updated the pull request incrementally with two additional commits since the last revision: - Set JVM_PICFLAG="-fpic -mcmodel=large" for AIX - Open XL 17.1.1.4 is clang 15. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18172/files - new: https://git.openjdk.org/jdk/pull/18172/files/53a05019..e7949499 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18172&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18172&range=01-02 Stats: 7 lines in 3 files changed: 4 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/18172.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18172/head:pull/18172 PR: https://git.openjdk.org/jdk/pull/18172 From ihse at openjdk.org Tue Mar 12 15:16:30 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Tue, 12 Mar 2024 15:16:30 GMT Subject: RFR: 8327701: Remove the xlc toolchain [v2] In-Reply-To: References: Message-ID: On Mon, 11 Mar 2024 09:30:20 GMT, Julian Waters wrote: > Is Open XL C/C++ considered a compiler or more akin to a development environment like Xcode is for macOS? Depending on which, we could just say clang is the compiler for AIX without needing to say that Open XL is treated like clang, etc You are raising a good point. 
Our current concept of "toolchain" is not working as smoothly as it should be. We should probably split it into "toolchain environment" (or whatever), like Xcode or Open XL, and "toolchain compiler" (like clang), and/or possibly also some concept of "toolchain sdk". The work you are doing of untangling Windows from Visual Studio would probably influence such a redesign, seeing what part of the Windows adaptions which arise from cl, and which are constant given that you compile for windows (like the Windows SDK). But this is a more architectural issue that will need to be resolved in another PR, and probably discussed quite a bit beforehand. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18172#discussion_r1521637464 From ihse at openjdk.org Tue Mar 12 15:16:30 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Tue, 12 Mar 2024 15:16:30 GMT Subject: RFR: 8327701: Remove the xlc toolchain [v2] In-Reply-To: References: Message-ID: On Tue, 12 Mar 2024 15:04:56 GMT, Magnus Ihse Bursie wrote: >> Is Open XL C/C++ considered a compiler or more akin to a development environment like Xcode is for macOS? Depending on which, we could just say clang is the compiler for AIX without needing to say that Open XL is treated like clang, etc >> >> Also, why did this remove the link to the Supported Build Platforms page? > >> Is Open XL C/C++ considered a compiler or more akin to a development environment like Xcode is for macOS? Depending on which, we could just say clang is the compiler for AIX without needing to say that Open XL is treated like clang, etc > > You are raising a good point. Our current concept of "toolchain" is not working as smoothly as it should be. We should probably split it into "toolchain environment" (or whatever), like Xcode or Open XL, and "toolchain compiler" (like clang), and/or possibly also some concept of "toolchain sdk". 
> > The work you are doing of untangling Windows from Visual Studio would probably influence such a redesign, seeing what part of the Windows adaptions which arise from cl, and which are constant given that you compile for windows (like the Windows SDK). > > But this is a more architectural issue that will need to be resolved in another PR, and probably discussed quite a bit beforehand. > why did this remove the link to the Supported Build Platforms page? You make it sound like it is a bad thing, but in fact I promoted the AIX build to be of the same status as all other platforms. I cowardly did not put anything specific about the AIX build in the readme before, but just referred to the wiki. Now I decided we should actually treat this as a proper platform and describe the compiler version in the readme, since it is checked in configure, just like all other platforms. There is still a link to the Supported Build Platforms wiki page in the general part of the readme. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18172#discussion_r1521642995 From ihse at openjdk.org Tue Mar 12 15:16:30 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Tue, 12 Mar 2024 15:16:30 GMT Subject: RFR: 8327701: Remove the xlc toolchain [v2] In-Reply-To: References: Message-ID: On Tue, 12 Mar 2024 15:07:15 GMT, Magnus Ihse Bursie wrote: >>> Is Open XL C/C++ considered a compiler or more akin to a development environment like Xcode is for macOS? Depending on which, we could just say clang is the compiler for AIX without needing to say that Open XL is treated like clang, etc >> >> You are raising a good point. Our current concept of "toolchain" is not working as smoothly as it should be. We should probably split it into "toolchain environment" (or whatever), like Xcode or Open XL, and "toolchain compiler" (like clang), and/or possibly also some concept of "toolchain sdk". 
>> >> The work you are doing of untangling Windows from Visual Studio would probably influence such a redesign, seeing what part of the Windows adaptions which arise from cl, and which are constant given that you compile for windows (like the Windows SDK). >> >> But this is a more architectural issue that will need to be resolved in another PR, and probably discussed quite a bit beforehand. > >> why did this remove the link to the Supported Build Platforms page? > > You make it sound like it is a bad thing, but in fact I promoted the AIX build to be of the same status as all other platforms. I cowardly did not put anything specific about the AIX build in the readme before, but just referred to the wiki. Now I decided we should actually treat this as a proper platform and describe the compiler version in the readme, since it is checked in configure, just like all other platforms. > > There is still a link to the Supported Build Platforms wiki page in the general part of the readme. > It is clang 15 not 13. Clang 13 was in 17.1.0 Thanks! Fixed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18172#discussion_r1521646622 From ihse at openjdk.org Tue Mar 12 15:16:30 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Tue, 12 Mar 2024 15:16:30 GMT Subject: RFR: 8327701: Remove the xlc toolchain [v3] In-Reply-To: References: Message-ID: On Mon, 11 Mar 2024 13:45:46 GMT, Joachim Kern wrote: >> OK this was a flaw in my introduction of clang toolchain for AIX. The intention was to keep the xlc options in form of their clang counterparts. I will try with a corrected version for clang on AIX and will come back to you. > > OK, the `-Wl,-bbigtoc` is not a compiler option but a linker option and it is already set in the linker options. > But the `-fpic -mcmodel=large` should be set to avoid creating a jump to out-of-order code. 
> > So we can replace the > > JVM_PICFLAG="$PICFLAG" > JDK_PICFLAG="$PICFLAG" > > > code some lines below by > > > if test "x$TOOLCHAIN_TYPE" = xclang && test "x$OPENJDK_TARGET_OS" = xaix; then > JVM_PICFLAG="-fpic -mcmodel=large" > else > JVM_PICFLAG="$PICFLAG" > fi > JDK_PICFLAG="$PICFLAG" I'm not entirely happy to do this here, as it is a bug fix, and it will change behavior, in contrast to the rest of the patch which is just about removing unused code. But it's a small fix, and I'm messing with the part of the code anyway, so if you are sure that this is the right thing to do I'll sneak it in here. Please make sure it works. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18172#discussion_r1521652510 From ihse at openjdk.org Tue Mar 12 15:19:18 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Tue, 12 Mar 2024 15:19:18 GMT Subject: RFR: 8327701: Remove the xlc toolchain [v2] In-Reply-To: References: Message-ID: On Mon, 11 Mar 2024 09:14:27 GMT, Joachim Kern wrote: >> Magnus Ihse Bursie has updated the pull request incrementally with one additional commit since the last revision: >> >> Revert SEARCH_PATH changes > > make/autoconf/toolchain.m4 line 409: > >> 407: # Target: powerpc-ibm-aix7.2.0.0 >> 408: # Thread model: posix >> 409: # InstalledDir: /opt/IBM/openxlC/17.1.0/bin > > Please correct: > IBM Open XL C/C++ for AIX 17.1.**1** (xxxxxxxxxxxxxxx), clang version 1**5**.0.0 > # Target: **powerpc-ibm-aix7.2.**5**.7** > # Thread model: posix > # InstalledDir: /opt/IBM/openxlC/17.1.**1**/bin That was just intended as an example code. At one point, this has been printed by the compiler output (at least according to the source I found on the net). I tried to find a bunch of different examples from different platforms and versions. But sure, if it makes you happier, I can replace it with a more up to date example. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18172#discussion_r1521656819 From ihse at openjdk.org Tue Mar 12 15:19:19 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Tue, 12 Mar 2024 15:19:19 GMT Subject: RFR: 8327701: Remove the xlc toolchain [v2] In-Reply-To: References: Message-ID: On Tue, 12 Mar 2024 15:15:27 GMT, Magnus Ihse Bursie wrote: >> make/autoconf/toolchain.m4 line 409: >> >>> 407: # Target: powerpc-ibm-aix7.2.0.0 >>> 408: # Thread model: posix >>> 409: # InstalledDir: /opt/IBM/openxlC/17.1.0/bin >> >> Please correct: >> IBM Open XL C/C++ for AIX 17.1.**1** (xxxxxxxxxxxxxxx), clang version 1**5**.0.0 >> # Target: **powerpc-ibm-aix7.2.**5**.7** >> # Thread model: posix >> # InstalledDir: /opt/IBM/openxlC/17.1.**1**/bin > > That was just intended as an example code. At one point, this has been printed by the compiler output (at least according to the source I found on the net). I tried to find a bunch of different examples from different platforms and versions. > > But sure, if it makes you happier, I can replace it with a more up to date example. Actually, your example seems doctored (the `xxxxx`) string. I think I'll keep the real world example. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18172#discussion_r1521659238 From ihse at openjdk.org Tue Mar 12 15:27:29 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Tue, 12 Mar 2024 15:27:29 GMT Subject: RFR: 8327701: Remove the xlc toolchain [v4] In-Reply-To: References: Message-ID: > As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. Thus the old xlc toolchain is no longer supported, and should be removed. 
Magnus Ihse Bursie has updated the pull request incrementally with two additional commits since the last revision: - Replace CC_VERSION_OUTPUT with CC_VERSION_STRING - We need CC_VERSION_OUTPUT before we can check it ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18172/files - new: https://git.openjdk.org/jdk/pull/18172/files/e7949499..577715b3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18172&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18172&range=02-03 Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18172.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18172/head:pull/18172 PR: https://git.openjdk.org/jdk/pull/18172 From ihse at openjdk.org Tue Mar 12 15:27:29 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Tue, 12 Mar 2024 15:27:29 GMT Subject: RFR: 8327701: Remove the xlc toolchain [v2] In-Reply-To: References: Message-ID: <3S9pyv9jOnJihfgqtK_CXhDDV62pzek2TrduKffVTnI=.4e8d8016-a299-4596-92f4-1bcf6bd206f4@github.com> On Mon, 11 Mar 2024 12:44:55 GMT, Matthias Baesken wrote: >> Magnus Ihse Bursie has updated the pull request incrementally with one additional commit since the last revision: >> >> Revert SEARCH_PATH changes > > With this change added, currently configure fails > > > checking for ibm-llvm-cxxfilt... /opt/IBM/openxlC/17.1.1/tools/ibm-llvm-cxxfilt > configure: error: ibm-clang_r version output check failed, output: > configure exiting with result code > > maybe related to what Joachim pointed out ? @MBaesken Yes, it was the bug Joachim found. It should be fixed now, together with all the other comments (except the example version string in the comments). Please re-test.
------------- PR Comment: https://git.openjdk.org/jdk/pull/18172#issuecomment-1991913029 From ihse at openjdk.org Tue Mar 12 15:27:29 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Tue, 12 Mar 2024 15:27:29 GMT Subject: RFR: 8327701: Remove the xlc toolchain [v2] In-Reply-To: <0yRLC1SM9um8MIZYJP6sf8OOMeOCbzCc1LF_JQLxZHg=.e3391f4e-ab9e-423c-880c-e84aba29e87b@github.com> References: <0yRLC1SM9um8MIZYJP6sf8OOMeOCbzCc1LF_JQLxZHg=.e3391f4e-ab9e-423c-880c-e84aba29e87b@github.com> Message-ID: <7IuMlu40vZGHuJYHQZvYgMVi72Hd72bn1RfkvD0gygg=.27838cfe-7ae7-42e7-ac46-b08b5610d979@github.com> On Mon, 11 Mar 2024 12:27:08 GMT, Joachim Kern wrote: >> Magnus Ihse Bursie has updated the pull request incrementally with one additional commit since the last revision: >> >> Revert SEARCH_PATH changes > > make/autoconf/toolchain.m4 line 940: > >> 938: if test "x$OPENJDK_TARGET_OS" = xaix; then >> 939: # Make sure we have the Open XL version of clang on AIX >> 940: $ECHO "$CC_VERSION_OUTPUT" | $GREP "IBM Open XL C/C++ for AIX" > /dev/null > > This does not work since $CC_VERSION_OUTPUT is unset. We need > CC_VERSION_OUTPUT=`${XLC_TEST_PATH}ibm-clang++_r --version 2>&1 | $HEAD -n 1` > before, as in the previous code some lines above which you removed. Yeah, you are right. I confused it with CC_VERSION_STRING. ... duh. Actually, that is what I should be using instead. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18172#discussion_r1521667916 From jvernee at openjdk.org Tue Mar 12 15:49:13 2024 From: jvernee at openjdk.org (Jorn Vernee) Date: Tue, 12 Mar 2024 15:49:13 GMT Subject: RFR: 8327460: Compile tests with the same visibility rules as product code [v2] In-Reply-To: References: Message-ID: On Tue, 12 Mar 2024 13:51:28 GMT, Magnus Ihse Bursie wrote: >> test/jdk/java/foreign/CallGeneratorHelper.java line 216: >> >>> 214: if (header) { >>> 215: System.out.println( >>> 216: "#include \"export.h\"\n" >> >> We don't generate these header files any more, so the changes to this file are not really needed. > > I still wouldn't like to keep the bad hard-coded defines. Are you okay with me pushing these changes, and then you can remove the parts of the test that are not actually used anymore? If the code is not used it should not matter much to you either way. > > (I mean I could back out these changes, but then we'd have the bad code in place while waiting for you to remove it, putting pressure on you to actually remove it.) Yes, that's fine. Sorry, I meant to file a JBS issue and come back with another comment last time, but I got distracted. Filed: https://bugs.openjdk.org/browse/JDK-8327994 now ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18135#discussion_r1521715617 From dchuyko at openjdk.org Tue Mar 12 15:53:39 2024 From: dchuyko at openjdk.org (Dmitry Chuyko) Date: Tue, 12 Mar 2024 15:53:39 GMT Subject: RFR: 8309271: A way to align already compiled methods with compiler directives [v29] In-Reply-To: References: Message-ID: > Compiler Control (https://openjdk.org/jeps/165) provides method-context dependent control of the JVM compilers (C1 and C2). The active directive stack is built from the directive files passed with the `-XX:CompilerDirectivesFile` diagnostic command-line option and the Compiler.add_directives diagnostic command. 
It is also possible to clear all directives or remove the top from the stack. > > A matching directive will be applied at method compilation time when such compilation is started. If directives are added or changed, but compilation does not start, then the state of compiled methods doesn't correspond to the rules. This is not an error, and it happens in long-running applications when directives are added or removed after compilation of methods that could be matched. For example, the user decides that C2 compilation needs to be disabled for some method due to a compiler bug, issues such a directive but this does not affect the application behavior. In such a case, the target application needs to be restarted, and such an operation can have high costs and risks. Another goal is testing/debugging compilers. > > It would be convenient to optionally reconcile at least existing matching nmethods to the current stack of compiler directives (so bypass inlined methods). > > A natural way to eliminate the discrepancy between the result of compilation and the broken rule is to discard the compilation result, i.e. deoptimization. Prior to that we can try to re-compile the method, letting the compile broker perform it while taking the new directives stack into account. Re-compilation helps to prevent hot methods from execution in the interpreter. > > A new flag `-r` has been introduced for some directives related to compile commands: `Compiler.add_directives`, `Compiler.remove_directives`, `Compiler.clear_directives`. The default behavior has not changed (no flag). If the new flag is present, the command scans already compiled methods and puts methods that have any active non-default matching compiler directives to re-compilation if possible, otherwise marks them for deoptimization. There is currently no distinction which directives are found. In particular, this means that if there are rules for inlining into some method, it will be refreshed. 
On the other hand, if there are rules for a method and it was inlined, top-level methods won't be refreshed, but this can be achieved by having rules for them. > > In addition, a new diagnostic command `Compiler.replace_directives`, has been added for ... Dmitry Chuyko has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 47 commits: - Resolved master conflicts - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - ... 
and 37 more: https://git.openjdk.org/jdk/compare/782206bc...ff39ac12 ------------- Changes: https://git.openjdk.org/jdk/pull/14111/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14111&range=28 Stats: 381 lines in 15 files changed: 348 ins; 3 del; 30 mod Patch: https://git.openjdk.org/jdk/pull/14111.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14111/head:pull/14111 PR: https://git.openjdk.org/jdk/pull/14111 From fbredberg at openjdk.org Tue Mar 12 15:54:39 2024 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Tue, 12 Mar 2024 15:54:39 GMT Subject: RFR: 8314508: Improve how relativized pointers are printed by frame::describe [v3] In-Reply-To: References: Message-ID: <_RbkJNBhsAAY1ksaQkd8WYfO5EFlIr1SSFCu7JfraKU=.1083a20d-14f8-43bb-80ef-d180881a9295@github.com> > The output from frame::describe has been improved so that it now prints the actual derelativized pointer value, regardless if it's on the stack or on the heap. It also clearly shows which members of the fixed stackframe are relativized. 
See sample output of a riscv64 stackframe below: > > > [6.693s][trace][continuations] 0x000000409e14cc88: 0x00000000eeceaf58 locals for #10 > [6.693s][trace][continuations] local 0 > [6.693s][trace][continuations] oop for #10 > [6.693s][trace][continuations] 0x000000409e14cc80: 0x00000000eec34548 local 1 > [6.693s][trace][continuations] oop for #10 > [6.693s][trace][continuations] 0x000000409e14cc78: 0x00000000eeceac58 local 2 > [6.693s][trace][continuations] oop for #10 > [6.693s][trace][continuations] 0x000000409e14cc70: 0x00000000eee03eb0 #10 method java.lang.invoke.LambdaForm$DMH/0x0000000050001000.invokeVirtual(Ljava/lang/Object;Ljava/lang/Object;)V @ 10 > [6.693s][trace][continuations] - 3 locals 3 max stack > [6.693s][trace][continuations] - codelet: return entry points > [6.693s][trace][continuations] saved fp > [6.693s][trace][continuations] 0x000000409e14cc68: 0x00000040137dc790 return address > [6.693s][trace][continuations] 0x000000409e14cc60: 0x000000409e14ccf0 > [6.693s][trace][continuations] 0x000000409e14cc58: 0x000000409e14cc60 interpreter_frame_sender_sp > [6.693s][trace][continuations] 0x000000409e14cc50: 0x000000409e14cc00 interpreter_frame_last_sp (relativized: fp-14) > [6.693s][trace][continuations] 0x000000409e14cc48: 0x0000004050401a88 interpreter_frame_method > [6.693s][trace][continuations] 0x000000409e14cc40: 0x0000000000000000 interpreter_frame_mdp > [6.693s][trace][continuations] 0x000000409e14cc38: 0x000000409e14cbe0 interpreter_frame_extended_sp (relativized: fp-18) > [6.693s][trace][continuations] 0x000000409e14cc30: 0x00000000ef093110 interpreter_frame_mirror > [6.693s][trace][continuations] oop for #10 > [6.693s][trace][continuations] 0x000000409e14cc28: 0x0000004050401c68 interpreter_frame_cache > [6.693s][trace][continuations] 0x000000409e14cc20: 0x000000409e14cc88 interpreter_frame_locals (relativized: fp+3) > [6.693s][trace][continuations] 0x000000409e14cc18: 0x0000004050401a7a interpre... 
Fredrik Bredberg has updated the pull request incrementally with one additional commit since the last revision: Update after review ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18102/files - new: https://git.openjdk.org/jdk/pull/18102/files/4ade221c..e48f8b1a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18102&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18102&range=01-02 Stats: 9 lines in 1 file changed: 9 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18102.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18102/head:pull/18102 PR: https://git.openjdk.org/jdk/pull/18102 From duke at openjdk.org Tue Mar 12 16:02:23 2024 From: duke at openjdk.org (Tom Shull) Date: Tue, 12 Mar 2024 16:02:23 GMT Subject: RFR: 8325991: Accelerate Poly1305 on x86_64 using AVX2 instructions [v13] In-Reply-To: References: Message-ID: On Fri, 8 Mar 2024 18:56:07 GMT, Srinivas Vamsi Parasa wrote: >> The goal of this PR is to accelerate the Poly1305 algorithm using AVX2 instructions (including IFMA) for x86_64 CPUs. >> >> This implementation is directly based on the AVX2 Poly1305 hash computation as implemented in Intel(R) Multi-Buffer Crypto for IPsec Library (url: https://github.com/intel/intel-ipsec-mb/blob/main/lib/avx2_t3/poly_fma_avx2.asm) >> >> This PR shows up to 19x speedup on buffer sizes of 1MB. > > Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: > > make vpmadd52l/hq generic src/hotspot/cpu/x86/vm_version_x86.cpp line 312: > 310: __ lea(rsi, Address(rbp, in_bytes(VM_Version::sef_cpuid7_ecx1_offset()))); > 311: __ movl(Address(rsi, 0), rax); > 312: __ movl(Address(rsi, 4), rbx); Hi @vamsi-parasa. I believe this code has a bug in it. Here you are copying back all four registers; however, within https://github.com/openjdk/jdk/blob/782206bc97dc6ae953b0c3ce01f8b6edab4ad30b/src/hotspot/cpu/x86/vm_version_x86.hpp#L468 you only created one field. 
Can you please open up a JBS issue to fix this? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17881#discussion_r1521736242 From mbaesken at openjdk.org Tue Mar 12 16:05:18 2024 From: mbaesken at openjdk.org (Matthias Baesken) Date: Tue, 12 Mar 2024 16:05:18 GMT Subject: RFR: 8327701: Remove the xlc toolchain [v2] In-Reply-To: <3S9pyv9jOnJihfgqtK_CXhDDV62pzek2TrduKffVTnI=.4e8d8016-a299-4596-92f4-1bcf6bd206f4@github.com> References: <3S9pyv9jOnJihfgqtK_CXhDDV62pzek2TrduKffVTnI=.4e8d8016-a299-4596-92f4-1bcf6bd206f4@github.com> Message-ID: On Tue, 12 Mar 2024 15:24:21 GMT, Magnus Ihse Bursie wrote: > Please re-test. Hi Magnus, thanks for the adjustments. I reenabled your patch in our build/test queue . ------------- PR Comment: https://git.openjdk.org/jdk/pull/18172#issuecomment-1992009593 From ihse at openjdk.org Tue Mar 12 16:20:36 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Tue, 12 Mar 2024 16:20:36 GMT Subject: RFR: 8327995: Remove unused Unused_Variable Message-ID: This code in `globalDefinitions.hpp` is unused and should be removed: //---------------------------------------------------------------------------------------------------- // Utility macros for compilers // used to silence compiler warnings #define Unused_Variable(var) var ------------- Commit messages: - 8327995: Remove unused Unused_Variable Changes: https://git.openjdk.org/jdk/pull/18242/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18242&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8327995 Stats: 7 lines in 1 file changed: 0 ins; 7 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18242.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18242/head:pull/18242 PR: https://git.openjdk.org/jdk/pull/18242 From kbarrett at openjdk.org Tue Mar 12 16:20:36 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 12 Mar 2024 16:20:36 GMT Subject: RFR: 8327995: Remove unused Unused_Variable In-Reply-To: References: Message-ID: On Tue, 12 Mar 2024 
16:13:02 GMT, Magnus Ihse Bursie wrote: > This code in `globalDefinitions.hpp` is unused and should be removed: > > > //---------------------------------------------------------------------------------------------------- > // Utility macros for compilers > // used to silence compiler warnings > > #define Unused_Variable(var) var Looks good, and trivial. ------------- Marked as reviewed by kbarrett (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18242#pullrequestreview-1931546798 From matsaave at openjdk.org Tue Mar 12 16:26:28 2024 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Tue, 12 Mar 2024 16:26:28 GMT Subject: RFR: 8251330: Reorder CDS archived heap to speed up relocation [v7] In-Reply-To: References: Message-ID: <1r_kw1PW5eKQi3r1ZcdP7ygK9YSCFNmt-ikUcg3f_kA=.45137343-d8ee-41cc-bd98-e96bf7e9ddfd@github.com> > We should reorder the archived heap to segregate the objects that don't need marking. This will save space in the archive and improve start-up time > > This patch reorders the archived heap to segregate the objects that don't need marking. The leading zeros present in the bitmaps thanks to the reordering can be easily removed, and thus the size of the archive is reduced. > > Given that the bitmaps are word aligned, some leading zeroes will remain. Verified with tier 1-5 tests. > > Metrics, courtesy of @iklam : > ```calculate_oopmap: objects = 15262 (507904 bytes, 332752 bytes don't need marking), embedded oops = 8408, nulls = 54 > Oopmap = 15872 bytes > > calculate_oopmap: objects = 4590 (335872 bytes, 178120 bytes don't need marking), embedded oops = 46487, nulls = 29019 > Oopmap = 10496 bytes > > (332752 + 178120) / (507904 + 335872.0) = 0.6054592688106796 > More than 60% of the space used by the archived heap doesn't need to be marked by the oopmap. 
> > $ java -Xshare:dump -Xlog:cds+map | grep lead > [3.777s][info][cds,map] - heap_oopmap_leading_zeros: 143286 > [3.777s][info][cds,map] - heap_ptrmap_leading_zeros: 50713 > > So we can reduce the "bm" region by (143286 + 50713) / 8 = 24249 bytes. > > Current output: > $ java -XX:+UseNewCode -Xshare:dump -Xlog:cds+map | grep lead > [5.339s][info][cds,map] - heap_oopmap_leading_zeros: 26 > [5.339s][info][cds,map] - heap_ptrmap_leading_zeros: 8 Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: Nits ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17350/files - new: https://git.openjdk.org/jdk/pull/17350/files/bdd74ec9..c06c4d40 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17350&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17350&range=05-06 Stats: 2 lines in 2 files changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17350.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17350/head:pull/17350 PR: https://git.openjdk.org/jdk/pull/17350 From ihse at openjdk.org Tue Mar 12 17:15:22 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Tue, 12 Mar 2024 17:15:22 GMT Subject: Integrated: 8327995: Remove unused Unused_Variable In-Reply-To: References: Message-ID: <-UVneQHETHLGR7Gvzh0AY8MucyhyFpqZISAVasG3cHc=.db36a458-09ab-4824-8bb5-41ed4a4c6c73@github.com> On Tue, 12 Mar 2024 16:13:02 GMT, Magnus Ihse Bursie wrote: > This code in `globalDefinitions.hpp` is unused and should be removed: > > > //---------------------------------------------------------------------------------------------------- > // Utility macros for compilers > // used to silence compiler warnings > > #define Unused_Variable(var) var This pull request has now been integrated. 
Changeset: 8a3bdd5c Author: Magnus Ihse Bursie URL: https://git.openjdk.org/jdk/commit/8a3bdd5c4dc1849f5cfbdf65cc35823ff551c0b5 Stats: 7 lines in 1 file changed: 0 ins; 7 del; 0 mod 8327995: Remove unused Unused_Variable Reviewed-by: kbarrett ------------- PR: https://git.openjdk.org/jdk/pull/18242 From kbarrett at openjdk.org Tue Mar 12 17:15:20 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 12 Mar 2024 17:15:20 GMT Subject: RFR: JDK-8324682 Remove redefinition of NULL for XLC compiler [v6] In-Reply-To: References: Message-ID: On Fri, 8 Mar 2024 09:07:23 GMT, Suchismith Roy wrote: >> Remove redefinition of NULL for XLC compiler >> In globalDefinitions_xlc.hpp there is a redefinition of NULL to work around an issue with one of the AIX headers (). >> Once all uses of NULL in HotSpot have been replaced with nullptr, there is no longer any need for this, and it can be removed. >> >> >> JBS ISSUE : [JDK-8324682](https://bugs.openjdk.org/browse/JDK-8324682) > > Suchismith Roy has updated the pull request incrementally with one additional commit since the last revision: > > update copyright Looks good. ------------- Marked as reviewed by kbarrett (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18064#pullrequestreview-1931726118 From kbarrett at openjdk.org Tue Mar 12 17:22:20 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 12 Mar 2024 17:22:20 GMT Subject: RFR: JDK-8324682 Remove redefinition of NULL for XLC compiler [v6] In-Reply-To: References: Message-ID: <4bSVdXStKe9W3tjvmZylrXDqVCxPMRltlCL0t3x46t0=.c740ec2e-d65b-43de-a32e-0f9cadb05ddb@github.com> On Fri, 8 Mar 2024 09:07:23 GMT, Suchismith Roy wrote: >> Remove redefinition of NULL for XLC compiler >> In globalDefinitions_xlc.hpp there is a redefinition of NULL to work around an issue with one of the AIX headers (). >> Once all uses of NULL in HotSpot have been replaced with nullptr, there is no longer any need for this, and it can be removed. 
>> >> >> JBS ISSUE : [JDK-8324682](https://bugs.openjdk.org/browse/JDK-8324682) > > Suchismith Roy has updated the pull request incrementally with one additional commit since the last revision: > > update copyright > /integrate For HotSpot changes, at least 2 reviewers (at least one a Reviewer) are required. The Skara tooling doesn't know about that policy for HotSpot (and some other areas that have similar policies), so will mark a PR ready even though that requirement hasn't been satisfied. Sometimes someone will use the /reviewers command to enforce the policy, but since it's a "well known" policy that often isn't bothered with. That said, since I've now also approved it, I will also sponsor. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18064#issuecomment-1992172210 From sroy at openjdk.org Tue Mar 12 17:22:20 2024 From: sroy at openjdk.org (Suchismith Roy) Date: Tue, 12 Mar 2024 17:22:20 GMT Subject: Integrated: JDK-8324682 Remove redefinition of NULL for XLC compiler In-Reply-To: References: Message-ID: On Thu, 29 Feb 2024 14:05:53 GMT, Suchismith Roy wrote: > Remove redefinition of NULL for XLC compiler > In globalDefinitions_xlc.hpp there is a redefinition of NULL to work around an issue with one of the AIX headers (). > Once all uses of NULL in HotSpot have been replaced with nullptr, there is no longer any need for this, and it can be removed. > > > JBS ISSUE : [JDK-8324682](https://bugs.openjdk.org/browse/JDK-8324682) This pull request has now been integrated. 
Changeset: 313e814b Author: Suchismith Roy Committer: Kim Barrett URL: https://git.openjdk.org/jdk/commit/313e814bc924c53f03052dde2ac33e74f28a82ca Stats: 18 lines in 2 files changed: 0 ins; 15 del; 3 mod 8324682: Remove redefinition of NULL for XLC compiler Reviewed-by: kbarrett, mbaesken ------------- PR: https://git.openjdk.org/jdk/pull/18064 From duke at openjdk.org Tue Mar 12 17:34:21 2024 From: duke at openjdk.org (Srinivas Vamsi Parasa) Date: Tue, 12 Mar 2024 17:34:21 GMT Subject: RFR: 8325991: Accelerate Poly1305 on x86_64 using AVX2 instructions [v13] In-Reply-To: References: Message-ID: <8Awj08UkB3CpLNbWQbQOgOHUPh0PSARYgOK83JDEt0I=.0f634cfa-ec4b-4d49-a849-726fbfb64703@github.com> On Tue, 12 Mar 2024 15:59:59 GMT, Tom Shull wrote: >> Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: >> >> make vpmadd52l/hq generic > > src/hotspot/cpu/x86/vm_version_x86.cpp line 312: > >> 310: __ lea(rsi, Address(rbp, in_bytes(VM_Version::sef_cpuid7_ecx1_offset()))); >> 311: __ movl(Address(rsi, 0), rax); >> 312: __ movl(Address(rsi, 4), rbx); > > Hi @vamsi-parasa. I believe this code has a bug in it. Here you are copying back all four registers; however, within https://github.com/openjdk/jdk/blob/782206bc97dc6ae953b0c3ce01f8b6edab4ad30b/src/hotspot/cpu/x86/vm_version_x86.hpp#L468 you only created one field. > > Can you please open up a JBS issue to fix this? Hi Tom (@teshull), Thank you for identifying the issue. Please see the JBS issue filed at https://bugs.openjdk.org/browse/JDK-8327999. Will float a new PR to fix this issue soon. 
Thanks, Vamsi ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17881#discussion_r1521869268 From eirbjo at openjdk.org Tue Mar 12 17:40:18 2024 From: eirbjo at openjdk.org (Eirik Bjørsnøs) Date: Tue, 12 Mar 2024 17:40:18 GMT Subject: RFR: 8327729: Remove deprecated xxxObject methods from jdk.internal.misc.Unsafe [v3] In-Reply-To: <4Z6CqKTSgNaTYilIvNjnGudJOs8qebTs4OhP5q7xShE=.e71ecd8c-5563-4daf-bf9c-56850d69d7ec@github.com> References: <_R4_060_hbsKkeKvbuME6SLIfltbo5lmrQb0dHTpY1Q=.06c07eaa-5135-4dad-b4e1-65ef7a0000bc@github.com> <4Z6CqKTSgNaTYilIvNjnGudJOs8qebTs4OhP5q7xShE=.e71ecd8c-5563-4daf-bf9c-56850d69d7ec@github.com> Message-ID: On Sun, 10 Mar 2024 13:47:06 GMT, Eirik Bjørsnøs wrote: >> Please review this PR which removes the 19 deprecated `xxObject*` alias methods from `jdk.internal.misc.Unsafe`. >> >> These methods were added in JDK-8213043 (JDK 12), presumably to allow `jsr166.jar` to be used across JDK versions. This was a follow-up fix after JDK-8207146 had renamed these methods to `xxReference*`. >> >> Since OpenJDK is now the single source of truth for `java.util.concurrent`, time has come to remove these deprecated alias methods. >> >> This change was initially discussed here: https://mail.openjdk.org/pipermail/core-libs-dev/2024-March/119993.html >> >> Testing: This is a pure deletion of deprecated methods, so the PR includes no test changes and the `noreg-cleanup` label is added in the JBS. I have verified that all `test/jdk/java/util/concurrent/*` tests pass. >> >> Tagging @DougLea and @Martin-Buchholz to verify that this removal is timely. 
> > Eirik Bjørsnøs has updated the pull request incrementally with two additional commits since the last revision: > > - Replace calls to Unsafe.putObject with Unsafe.putReference > - Update a code comment referencing Unsafe.compareAndSetObject to reference Unsafe.compareAndSetReference instead Thanks for helping out with this cleanup effort, Jaikiran, Alan, Mandy, Doug and Martin! ------------- PR Comment: https://git.openjdk.org/jdk/pull/18176#issuecomment-1992202476 From eirbjo at openjdk.org Tue Mar 12 17:40:19 2024 From: eirbjo at openjdk.org (Eirik Bjørsnøs) Date: Tue, 12 Mar 2024 17:40:19 GMT Subject: Integrated: 8327729: Remove deprecated xxxObject methods from jdk.internal.misc.Unsafe In-Reply-To: <_R4_060_hbsKkeKvbuME6SLIfltbo5lmrQb0dHTpY1Q=.06c07eaa-5135-4dad-b4e1-65ef7a0000bc@github.com> References: <_R4_060_hbsKkeKvbuME6SLIfltbo5lmrQb0dHTpY1Q=.06c07eaa-5135-4dad-b4e1-65ef7a0000bc@github.com> Message-ID: On Sun, 10 Mar 2024 05:50:33 GMT, Eirik Bjørsnøs wrote: > Please review this PR which removes the 19 deprecated `xxObject*` alias methods from `jdk.internal.misc.Unsafe`. > > These methods were added in JDK-8213043 (JDK 12), presumably to allow `jsr166.jar` to be used across JDK versions. This was a follow-up fix after JDK-8207146 had renamed these methods to `xxReference*`. > > Since OpenJDK is now the single source of truth for `java.util.concurrent`, time has come to remove these deprecated alias methods. > > This change was initially discussed here: https://mail.openjdk.org/pipermail/core-libs-dev/2024-March/119993.html > > Testing: This is a pure deletion of deprecated methods, so the PR includes no test changes and the `noreg-cleanup` label is added in the JBS. I have verified that all `test/jdk/java/util/concurrent/*` tests pass. > > Tagging @DougLea and @Martin-Buchholz to verify that this removal is timely. This pull request has now been integrated. 
Changeset: 5b414662 Author: Eirik Bjørsnøs URL: https://git.openjdk.org/jdk/commit/5b4146627580834bcd3ad0962d07d0d374fe3cce Stats: 94 lines in 4 files changed: 0 ins; 87 del; 7 mod 8327729: Remove deprecated xxxObject methods from jdk.internal.misc.Unsafe Reviewed-by: martin, alanb, mchung ------------- PR: https://git.openjdk.org/jdk/pull/18176 From ccheung at openjdk.org Tue Mar 12 17:49:15 2024 From: ccheung at openjdk.org (Calvin Cheung) Date: Tue, 12 Mar 2024 17:49:15 GMT Subject: RFR: 8251330: Reorder CDS archived heap to speed up relocation [v7] In-Reply-To: <1r_kw1PW5eKQi3r1ZcdP7ygK9YSCFNmt-ikUcg3f_kA=.45137343-d8ee-41cc-bd98-e96bf7e9ddfd@github.com> References: <1r_kw1PW5eKQi3r1ZcdP7ygK9YSCFNmt-ikUcg3f_kA=.45137343-d8ee-41cc-bd98-e96bf7e9ddfd@github.com> Message-ID: On Tue, 12 Mar 2024 16:26:28 GMT, Matias Saavedra Silva wrote: >> We should reorder the archived heap to segregate the objects that don't need marking. This will save space in the archive and improve start-up time >> >> This patch reorders the archived heap to segregate the objects that don't need marking. The leading zeros present in the bitmaps thanks to the reordering can be easily removed, and thus the size of the archive is reduced. >> >> Given that the bitmaps are word aligned, some leading zeroes will remain. Verified with tier 1-5 tests. >> >> Metrics, courtesy of @iklam : >> ```calculate_oopmap: objects = 15262 (507904 bytes, 332752 bytes don't need marking), embedded oops = 8408, nulls = 54 >> Oopmap = 15872 bytes >> >> calculate_oopmap: objects = 4590 (335872 bytes, 178120 bytes don't need marking), embedded oops = 46487, nulls = 29019 >> Oopmap = 10496 bytes >> >> (332752 + 178120) / (507904 + 335872.0) = 0.6054592688106796 >> More than 60% of the space used by the archived heap doesn't need to be marked by the oopmap. 
>> >> $ java -Xshare:dump -Xlog:cds+map | grep lead >> [3.777s][info][cds,map] - heap_oopmap_leading_zeros: 143286 >> [3.777s][info][cds,map] - heap_ptrmap_leading_zeros: 50713 >> >> So we can reduce the "bm" region by (143286 + 50713) / 8 = 24249 bytes. >> >> Current output: >> $ java -XX:+UseNewCode -Xshare:dump -Xlog:cds+map | grep lead >> [5.339s][info][cds,map] - heap_oopmap_leading_zeros: 26 >> [5.339s][info][cds,map] - heap_ptrmap_leading_zeros: 8 > > Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: > > Nits Looks good. I just have a minor comment below. Also, the copyright year of some files needs to be updated. src/hotspot/share/cds/filemap.cpp line 1581: > 1579: > 1580: // We want to keep track of how many zeros were removed > 1581: size_t new_zeros = map->find_first_set_bit(0); I think this statement could be enclosed with `DEBUG_ONLY` since `new_zeros` is only used in the `assert` statement following it. ------------- Marked as reviewed by ccheung (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17350#pullrequestreview-1931842261 PR Review Comment: https://git.openjdk.org/jdk/pull/17350#discussion_r1521889703 From duke at openjdk.org Tue Mar 12 19:15:21 2024 From: duke at openjdk.org (Srinivas Vamsi Parasa) Date: Tue, 12 Mar 2024 19:15:21 GMT Subject: RFR: 8325991: Accelerate Poly1305 on x86_64 using AVX2 instructions [v13] In-Reply-To: References: Message-ID: On Tue, 12 Mar 2024 15:59:59 GMT, Tom Shull wrote: >> Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision: >> >> make vpmadd52l/hq generic > > src/hotspot/cpu/x86/vm_version_x86.cpp line 312: > >> 310: __ lea(rsi, Address(rbp, in_bytes(VM_Version::sef_cpuid7_ecx1_offset()))); >> 311: __ movl(Address(rsi, 0), rax); >> 312: __ movl(Address(rsi, 4), rbx); > > Hi @vamsi-parasa. I believe this code has a bug in it. 
Here you are copying back all four registers; however, within https://github.com/openjdk/jdk/blob/782206bc97dc6ae953b0c3ce01f8b6edab4ad30b/src/hotspot/cpu/x86/vm_version_x86.hpp#L468 you only created one field. > > Can you please open up a JBS issue to fix this? Hi Tom (@teshull), pls see the PR to fix this issue: https://github.com/openjdk/jdk/pull/18248 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17881#discussion_r1521996254 From duke at openjdk.org Tue Mar 12 19:16:22 2024 From: duke at openjdk.org (Srinivas Vamsi Parasa) Date: Tue, 12 Mar 2024 19:16:22 GMT Subject: RFR: 8327999: Remove copy of unused registers for cpu features check in x86_64 AVX2 Poly1305 implementation Message-ID: The goal of this PR is to fix the bug reported in #17881 where data from unused registers (rbx, rcx, rdx) is being copied. ------------- Commit messages: - 8327999: Remove copy of unused registers for cpu features check in x86_64 AVX2 Poly1305 implementation Changes: https://git.openjdk.org/jdk/pull/18248/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18248&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8327999 Stats: 3 lines in 1 file changed: 0 ins; 3 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18248.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18248/head:pull/18248 PR: https://git.openjdk.org/jdk/pull/18248 From matsaave at openjdk.org Tue Mar 12 19:29:24 2024 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Tue, 12 Mar 2024 19:29:24 GMT Subject: RFR: 8251330: Reorder CDS archived heap to speed up relocation [v8] In-Reply-To: References: Message-ID: > We should reorder the archived heap to segregate the objects that don't need marking. This will save space in the archive and improve start-up time > > This patch reorders the archived heap to segregate the objects that don't need marking. 
The leading zeros present in the bitmaps thanks to the reordering can be easily removed, and thus the size of the archive is reduced. > > Given that the bitmaps are word aligned, some leading zeroes will remain. Verified with tier 1-5 tests. > > Metrics, courtesy of @iklam : > ```calculate_oopmap: objects = 15262 (507904 bytes, 332752 bytes don't need marking), embedded oops = 8408, nulls = 54 > Oopmap = 15872 bytes > > calculate_oopmap: objects = 4590 (335872 bytes, 178120 bytes don't need marking), embedded oops = 46487, nulls = 29019 > Oopmap = 10496 bytes > > (332752 + 178120) / (507904 + 335872.0) = 0.6054592688106796 > More than 60% of the space used by the archived heap doesn't need to be marked by the oopmap. > > $ java -Xshare:dump -Xlog:cds+map | grep lead > [3.777s][info][cds,map] - heap_oopmap_leading_zeros: 143286 > [3.777s][info][cds,map] - heap_ptrmap_leading_zeros: 50713 > > So we can reduce the "bm" region by (143286 + 50713) / 8 = 24249 bytes. > > Current output: > $ java -XX:+UseNewCode -Xshare:dump -Xlog:cds+map | grep lead > [5.339s][info][cds,map] - heap_oopmap_leading_zeros: 26 > [5.339s][info][cds,map] - heap_ptrmap_leading_zeros: 8 Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: Fixed copyright headers and made assert debug only ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17350/files - new: https://git.openjdk.org/jdk/pull/17350/files/c06c4d40..c45d58ce Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17350&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17350&range=06-07 Stats: 8 lines in 4 files changed: 2 ins; 1 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/17350.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17350/head:pull/17350 PR: https://git.openjdk.org/jdk/pull/17350 From vitaly.provodin at jetbrains.com Tue Mar 12 22:44:06 2024 From: vitaly.provodin at jetbrains.com (Vitaly Provodin) Date: Wed, 13 Mar 2024 
05:44:06 +0700 Subject: SafeFetch leads to hard crash on macOS 14.4 In-Reply-To: <6252751b-a3aa-4910-8914-1d554bac6f79@oracle.com> References: <8e8a8845-a81e-49c1-8b52-0e4b0cd6a0c0@oracle.com> <6252751b-a3aa-4910-8914-1d554bac6f79@oracle.com> Message-ID: Hi Stefan, Maxim, With the patch the version string is successfully printed - VM was not killed. Thanks, Vitaly > On 12. Mar 2024, at 17:19, Stefan Karlsson wrote: > > Hi again, > > I just want to clarify that the change above just removes *one* usage of SafeFetch. My guess is that all other usages of SafeFetch is still affected by the issue you are seeing. > > It would be interesting to see if you could verify this problem by applying this patch: > > diff --git a/src/hotspot/share/memory/universe.cpp b/src/hotspot/share/memory/universe.cpp > index 2a3d532725f..b58c52e415b 100644 > --- a/src/hotspot/share/memory/universe.cpp > +++ b/src/hotspot/share/memory/universe.cpp > @@ -842,7 +842,14 @@ jint universe_init() { > return JNI_OK; > } > > +#include "runtime/safefetch.hpp" > + > jint Universe::initialize_heap() { > + char* mem = os::reserve_memory(16 * K, false, mtGC); > + intptr_t value = SafeFetchN((intptr_t*)mem, 1); > + > + log_info(gc)("Reserved memory read: " PTR_FORMAT " " PTR_FORMAT, p2i(mem), value); > + > assert(_collectedHeap == nullptr, "Heap already created"); > _collectedHeap = GCConfig::arguments()->create_heap(); > > > and running: > build//jdk/bin/java -version > > This works on 14.3.1, but it would be interesting to see what happens on 14.4, and if that shuts down the VM before it is printing the version string. > > StefanK > > On 2024-03-12 09:34, Stefan Karlsson wrote: >> Hi Maxim, >> >> On 2024-03-12 08:12, Maxim Kartashev wrote: >>> Hello! >>> >>> This has been recently filed as https://bugs.openjdk.org/browse/JDK-8327860 but I also wanted to check with the community if any further info on the issue is available. 
>>> >>> It looks like in some cases SafeFetch is directed to fetch something the OS really doesn't want it to and, instead of promoting the error to a signal and letting the application (JVM) deal with it, immediately terminates the application. All we've got is the OS crash report, not even the JVM's fatal error log. This looks like an application security precaution, which I am not at all familiar with. >>> >>> The relevant pieces of the crash log are below. Is anybody familiar with "Namespace GUARD" termination reason and maybe other related novelties of macOS 14.4? The error was not reported before upgrading to 14.4 >> >> I don't have an answer for this, but a note below: >> >>> >>> Thanks in advance, >>> >>> Maxim. >>> >>> 0 libjvm.dylib 0x1062d6ec0 _SafeFetchN_fault + 0 >>> 1 libjvm.dylib 0x1062331a4 ObjectMonitor::TrySpin(JavaThread*) + 408 >>> 2 libjvm.dylib 0x106232b44 ObjectMonitor::enter(JavaThread*) + 228 >>> 3 libjvm.dylib 0x10637436c ObjectSynchronizer::enter(Handle, BasicLock*, JavaThread*) + 392 >> >> FYI: I think that this specific call to SafeFetch was recently removed by: >> https://github.com/openjdk/jdk/commit/29397d29baac3b29083b1b5d6b2cb06e456af0c3 >> https://bugs.openjdk.org/browse/JDK-8320317 >> >> Cheers, >> StefanK >> >> >>> ... 
>>> >>> Exception Type: EXC_BAD_ACCESS (SIGKILL) >>> Exception Codes: KERN_PROTECTION_FAILURE at 0x00000001004f4000 >>> Exception Codes: 0x0000000000000002, 0x00000001004f4000 >>> >>> Termination Reason: Namespace GUARD, Code 5 >>> >>> VM Region Info: 0x1004f4000 is in 0x1004f4000-0x1004f8000; bytes after start: 0 bytes before end: 16383 >>> REGION TYPE START - END [ VSIZE] PRT/MAX SHRMOD REGION DETAIL >>> mapped file 1004e4000-1004f4000 [ 64K] r--/r-- SM=COW Object_id=fa8d88e7 >>> ---> VM_ALLOCATE 1004f4000-1004f8000 [ 16K] ---/rwx SM=NUL >>> VM_ALLOCATE 1004f8000-1004fc000 [ 16K] r--/rwx SM=PRV >> > From maxim.kartashev at jetbrains.com Wed Mar 13 05:39:01 2024 From: maxim.kartashev at jetbrains.com (Maxim Kartashev) Date: Wed, 13 Mar 2024 09:39:01 +0400 Subject: SafeFetch leads to hard crash on macOS 14.4 In-Reply-To: References: <8e8a8845-a81e-49c1-8b52-0e4b0cd6a0c0@oracle.com> <6252751b-a3aa-4910-8914-1d554bac6f79@oracle.com> Message-ID: Thank you! Then it seems that only some regions are protected this way. Perhaps those that macOS itself adds to guard against buffer overruns and the like? On Wed, Mar 13, 2024 at 2:44 AM Vitaly Provodin <vitaly.provodin at jetbrains.com> wrote: > Hi Stefan, Maxim, > > With the patch the version string is successfully printed - VM was not > killed. > > Thanks, > Vitaly > > > > On 12. Mar 2024, at 17:19, Stefan Karlsson > wrote: > > > > Hi again, > > > > I just want to clarify that the change above just removes *one* usage of > SafeFetch. My guess is that all other usages of SafeFetch is still affected > by the issue you are seeing. 
> > > > It would be interesting to see if you could verify this problem by > applying this patch: > > > > diff --git a/src/hotspot/share/memory/universe.cpp > b/src/hotspot/share/memory/universe.cpp > > index 2a3d532725f..b58c52e415b 100644 > > --- a/src/hotspot/share/memory/universe.cpp > > +++ b/src/hotspot/share/memory/universe.cpp > > @@ -842,7 +842,14 @@ jint universe_init() { > > return JNI_OK; > > } > > > > +#include "runtime/safefetch.hpp" > > + > > jint Universe::initialize_heap() { > > + char* mem = os::reserve_memory(16 * K, false, mtGC); > > + intptr_t value = SafeFetchN((intptr_t*)mem, 1); > > + > > + log_info(gc)("Reserved memory read: " PTR_FORMAT " " PTR_FORMAT, > p2i(mem), value); > > + > > assert(_collectedHeap == nullptr, "Heap already created"); > > _collectedHeap = GCConfig::arguments()->create_heap(); > > > > > > and running: > > build//jdk/bin/java -version > > > > This works on 14.3.1, but it would be interesting to see what happens on > 14.4, and if that shuts down the VM before it is printing the version > string. > > > > StefanK > > > > On 2024-03-12 09:34, Stefan Karlsson wrote: > >> Hi Maxim, > >> > >> On 2024-03-12 08:12, Maxim Kartashev wrote: > >>> Hello! > >>> > >>> This has been recently filed as > https://bugs.openjdk.org/browse/JDK-8327860 but I also wanted to check > with the community if any further info on the issue is available. > >>> > >>> It looks like in some cases SafeFetch is directed to fetch something > the OS really doesn't want it to and, instead of promoting the error to a > signal and letting the application (JVM) deal with it, immediately > terminates the application. All we've got is the OS crash report, not even > the JVM's fatal error log. This looks like an application security > precaution, which I am not at all familiar with. > >>> > >>> The relevant pieces of the crash log are below. Is anybody familiar > with "Namespace GUARD" termination reason and maybe other related novelties > of macOS 14.4? 
The error was not reported before upgrading to 14.4 > >> > >> I don't have an answer for this, but a note below: > >> > >>> > >>> Thanks in advance, > >>> > >>> Maxim. > >>> > >>> 0 libjvm.dylib 0x1062d6ec0 _SafeFetchN_fault + 0 > >>> 1 libjvm.dylib 0x1062331a4 ObjectMonitor::TrySpin(JavaThread*) + 408 > >>> 2 libjvm.dylib 0x106232b44 ObjectMonitor::enter(JavaThread*) + 228 > >>> 3 libjvm.dylib 0x10637436c ObjectSynchronizer::enter(Handle, > BasicLock*, JavaThread*) + 392 > >> > >> FYI: I think that this specific call to SafeFetch was recently removed > by: > >> > https://github.com/openjdk/jdk/commit/29397d29baac3b29083b1b5d6b2cb06e456af0c3 > >> https://bugs.openjdk.org/browse/JDK-8320317 > >> > >> Cheers, > >> StefanK > >> > >> > >>> ... > >>> > >>> Exception Type: EXC_BAD_ACCESS (SIGKILL) > >>> Exception Codes: KERN_PROTECTION_FAILURE at 0x00000001004f4000 > >>> Exception Codes: 0x0000000000000002, 0x00000001004f4000 > >>> > >>> Termination Reason: Namespace GUARD, Code 5 > >>> > >>> VM Region Info: 0x1004f4000 is in 0x1004f4000-0x1004f8000; bytes > after start: 0 bytes before end: 16383 > >>> REGION TYPE START - END [ VSIZE] > PRT/MAX SHRMOD REGION DETAIL > >>> mapped file 1004e4000-1004f4000 [ 64K] > r--/r-- SM=COW Object_id=fa8d88e7 > >>> ---> VM_ALLOCATE 1004f4000-1004f8000 [ 16K] > ---/rwx SM=NUL > >>> VM_ALLOCATE 1004f8000-1004fc000 [ 16K] > r--/rwx SM=PRV > >> > > > >
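For readers unfamiliar with the contract under discussion: a SafeFetch-style read is expected to hand back a caller-supplied fallback value when the address turns out to be unreadable, rather than crash. Below is a minimal stand-alone sketch of that contract, assuming a POSIX system where the fault is delivered as SIGSEGV or SIGBUS. It swaps in `sigsetjmp`/`siglongjmp` for brevity; it is not how HotSpot implements `SafeFetchN`, it is not the reproducer attached to the bug report, and it is not claimed to reproduce the macOS 14.4 termination. The names `safe_fetch_n`, `reserve_inaccessible` and `on_fault` are hypothetical, and the sketch is not thread-safe.

```cpp
#include <cassert>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

// Jump target used by the fault handler (single-threaded sketch only).
static sigjmp_buf g_fault_jmp;

// Hypothetical handler: abandon the faulting load and resume in safe_fetch_n.
static void on_fault(int /*sig*/) {
  siglongjmp(g_fault_jmp, 1);
}

// Read *addr if possible; return error_value if the load faults.
intptr_t safe_fetch_n(const intptr_t* addr, intptr_t error_value) {
  struct sigaction sa;
  struct sigaction old_segv;
  struct sigaction old_bus;
  memset(&sa, 0, sizeof(sa));
  sa.sa_handler = on_fault;
  sigemptyset(&sa.sa_mask);
  sigaction(SIGSEGV, &sa, &old_segv);  // Linux typically reports SIGSEGV
  sigaction(SIGBUS, &sa, &old_bus);    // macOS typically reports SIGBUS

  // volatile: the value must survive the siglongjmp back into this frame.
  volatile intptr_t result = error_value;
  if (sigsetjmp(g_fault_jmp, 1) == 0) {
    const volatile intptr_t* p = addr;  // force a real load
    result = *p;
  }

  // Restore whatever handlers were installed before the call.
  sigaction(SIGSEGV, &old_segv, nullptr);
  sigaction(SIGBUS, &old_bus, nullptr);
  return result;
}

// Reserve address space with no access rights, loosely analogous to the
// os::reserve_memory call in the test patch quoted in this thread.
void* reserve_inaccessible(size_t bytes) {
  void* p = mmap(nullptr, bytes, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  return p == MAP_FAILED ? nullptr : p;
}
```

When the fault arrives as an ordinary signal, calling `safe_fetch_n` on a page from `reserve_inaccessible` returns the fallback value. The crash report quoted in this thread suggests that on macOS 14.4 the process was instead killed before any handler got a chance to run.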
From stefan.karlsson at oracle.com Wed Mar 13 06:27:14 2024 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 13 Mar 2024 07:27:14 +0100 Subject: SafeFetch leads to hard crash on macOS 14.4 In-Reply-To: References: <8e8a8845-a81e-49c1-8b52-0e4b0cd6a0c0@oracle.com> <6252751b-a3aa-4910-8914-1d554bac6f79@oracle.com> Message-ID: <5e4764c3-6675-4490-aa8f-9a5141f10277@oracle.com> Hi Vitaly, On 2024-03-12 23:44, Vitaly Provodin wrote: > Hi Stefan, Maxim, > > With the patch the version string is successfully printed - VM was not killed. Did you run this on 14.4? Did you also run this on an AArch64 machine? FWIW, I updated a machine to 14.4 and could reproduce the reported problem with a patch similar to the one below. I also managed to write a stand-alone C reproducer, which I have added to the bug report. Thanks, StefanK > > Thanks, > Vitaly > > >> On 12. Mar 2024, at 17:19, Stefan Karlsson wrote: >> >> Hi again, >> >> I just want to clarify that the change above just removes *one* usage of SafeFetch. My guess is that all other usages of SafeFetch is still affected by the issue you are seeing. 
>> >> It would be interesting to see if you could verify this problem by applying this patch: >> >> diff --git a/src/hotspot/share/memory/universe.cpp b/src/hotspot/share/memory/universe.cpp >> index 2a3d532725f..b58c52e415b 100644 >> --- a/src/hotspot/share/memory/universe.cpp >> +++ b/src/hotspot/share/memory/universe.cpp >> @@ -842,7 +842,14 @@ jint universe_init() { >> return JNI_OK; >> } >> >> +#include "runtime/safefetch.hpp" >> + >> jint Universe::initialize_heap() { >> + char* mem = os::reserve_memory(16 * K, false, mtGC); >> + intptr_t value = SafeFetchN((intptr_t*)mem, 1); >> + >> + log_info(gc)("Reserved memory read: " PTR_FORMAT " " PTR_FORMAT, p2i(mem), value); >> + >> assert(_collectedHeap == nullptr, "Heap already created"); >> _collectedHeap = GCConfig::arguments()->create_heap(); >> >> >> and running: >> build//jdk/bin/java -version >> >> This works on 14.3.1, but it would be interesting to see what happens on 14.4, and if that shuts down the VM before it is printing the version string. >> >> StefanK >> >> On 2024-03-12 09:34, Stefan Karlsson wrote: >>> Hi Maxim, >>> >>> On 2024-03-12 08:12, Maxim Kartashev wrote: >>>> Hello! >>>> >>>> This has been recently filed as https://bugs.openjdk.org/browse/JDK-8327860 but I also wanted to check with the community if any further info on the issue is available. >>>> >>>> It looks like in some cases SafeFetch is directed to fetch something the OS really doesn't want it to and, instead of promoting the error to a signal and letting the application (JVM) deal with it, immediately terminates the application. All we've got is the OS crash report, not even the JVM's fatal error log. This looks like an application security precaution, which I am not at all familiar with. >>>> >>>> The relevant pieces of the crash log are below. Is anybody familiar with "Namespace GUARD" termination reason and maybe other related novelties of macOS 14.4? 
The error was not reported before upgrading to 14.4 >>> I don't have an answer for this, but a note below: >>> >>>> Thanks in advance, >>>> >>>> Maxim. >>>> >>>> 0 libjvm.dylib 0x1062d6ec0 _SafeFetchN_fault + 0 >>>> 1 libjvm.dylib 0x1062331a4 ObjectMonitor::TrySpin(JavaThread*) + 408 >>>> 2 libjvm.dylib 0x106232b44 ObjectMonitor::enter(JavaThread*) + 228 >>>> 3 libjvm.dylib 0x10637436c ObjectSynchronizer::enter(Handle, BasicLock*, JavaThread*) + 392 >>> FYI: I think that this specific call to SafeFetch was recently removed by: >>> https://github.com/openjdk/jdk/commit/29397d29baac3b29083b1b5d6b2cb06e456af0c3 >>> https://bugs.openjdk.org/browse/JDK-8320317 >>> >>> Cheers, >>> StefanK >>> >>> >>>> ... >>>> >>>> Exception Type: EXC_BAD_ACCESS (SIGKILL) >>>> Exception Codes: KERN_PROTECTION_FAILURE at 0x00000001004f4000 >>>> Exception Codes: 0x0000000000000002, 0x00000001004f4000 >>>> >>>> Termination Reason: Namespace GUARD, Code 5 >>>> >>>> VM Region Info: 0x1004f4000 is in 0x1004f4000-0x1004f8000; bytes after start: 0 bytes before end: 16383 >>>> REGION TYPE START - END [ VSIZE] PRT/MAX SHRMOD REGION DETAIL >>>> mapped file 1004e4000-1004f4000 [ 64K] r--/r-- SM=COW Object_id=fa8d88e7 >>>> ---> VM_ALLOCATE 1004f4000-1004f8000 [ 16K] ---/rwx SM=NUL >>>> VM_ALLOCATE 1004f8000-1004fc000 [ 16K] r--/rwx SM=PRV From dholmes at openjdk.org Wed Mar 13 07:28:17 2024 From: dholmes at openjdk.org (David Holmes) Date: Wed, 13 Mar 2024 07:28:17 GMT Subject: RFR: 8327460: Compile tests with the same visibility rules as product code [v2] In-Reply-To: References: Message-ID: On Tue, 12 Mar 2024 13:55:11 GMT, Magnus Ihse Bursie wrote: >> test/hotspot/jtreg/runtime/ErrorHandling/libTestDwarfHelper.h line 24: >> >>> 22: */ >>> 23: >>> 24: #include >> >> Seems unneeded.
> > So what I did here was change: > > #include "jni.h" > #include > > > into > > #include > > #include "export.h" > #include "jni.h" > > > The reordering was not strictly needed, but putting user includes in front of system ones looked like bad coding to me, and put me in a difficult spot as to where to add the new `#include "export.h"` -- next to the current user include even if I thought that was wrong, or have the system includes "sandwiched" between two user includes. > > Or do you mean that it seems unneeded to include `jni.h` at all? Yes, I agree, but it was there before, and I don't want to make any other changes to the tests. This change is scary enough as it is :-). If you want, I can file a follow-up to remove this include instead. Yes I meant the include is actually unnecessary, but fine to not make that change here. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18135#discussion_r1522659393 From dcherepanov at openjdk.org Wed Mar 13 07:38:18 2024 From: dcherepanov at openjdk.org (Dmitry Cherepanov) Date: Wed, 13 Mar 2024 07:38:18 GMT Subject: RFR: 8327885: runtime/Unsafe/InternalErrorTest.java enters endless loop on Alpine aarch64 Message-ID: [JDK-8322163](https://bugs.openjdk.org/browse/JDK-8322163) replaced memset with a for loop on Alpine. This fixed the test on Alpine x86_64, but it enters an endless loop on Alpine aarch64. The loop causes SIGBUS to be generated, and the signal handler continues to the next instruction. Because gcc generates strb with auto-increment on aarch64, skipping the faulting instruction also skips the increment, so the loop never advances. The patch makes the counter volatile to prevent compilers from generating strb with auto-increment. With the patch, the test passes on Alpine aarch64.
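The shape of the change described above can be sketched as follows. This is only an illustration under assumptions, not the actual patch: the function name `fill_bytes` and its signature are hypothetical, and the claim about generated code is the one made in the PR description, not something this snippet verifies.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the test's fill loop.
// Because the counter is volatile, every read and write of `i` has to go
// through memory, so the increment is emitted as its own load/add/store
// sequence and cannot be folded into the byte store's addressing mode.
// If a signal handler skips only the faulting store, the increment that
// follows it still executes and the loop makes progress.
void fill_bytes(unsigned char* base, std::size_t size, unsigned char value) {
  for (volatile std::size_t i = 0; i < size; i = i + 1) {
    base[i] = value;
  }
}
```

The increment is written as `i = i + 1` instead of `i++` only because increment of a volatile object is deprecated in C++20; in C either spelling would do.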
------------- Commit messages: - 8327885: runtime/Unsafe/InternalErrorTest.java enters endless loop on Alpine aarch64 Changes: https://git.openjdk.org/jdk/pull/18262/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18262&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8327885 Stats: 3 lines in 1 file changed: 2 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18262.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18262/head:pull/18262 PR: https://git.openjdk.org/jdk/pull/18262 From sspitsyn at openjdk.org Wed Mar 13 07:46:18 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 13 Mar 2024 07:46:18 GMT Subject: RFR: 8309271: A way to align already compiled methods with compiler directives [v29] In-Reply-To: References: Message-ID: On Tue, 12 Mar 2024 15:53:39 GMT, Dmitry Chuyko wrote: >> Compiler Control (https://openjdk.org/jeps/165) provides method-context dependent control of the JVM compilers (C1 and C2). The active directive stack is built from the directive files passed with the `-XX:CompilerDirectivesFile` diagnostic command-line option and the Compiler.add_directives diagnostic command. It is also possible to clear all directives or remove the top from the stack. >> >> A matching directive will be applied at method compilation time when such compilation is started. If directives are added or changed, but compilation does not start, then the state of compiled methods doesn't correspond to the rules. This is not an error, and it happens in long running applications when directives are added or removed after compilation of methods that could be matched. For example, the user decides that C2 compilation needs to be disabled for some method due to a compiler bug, issues such a directive but this does not affect the application behavior. In such case, the target application needs to be restarted, and such an operation can have high costs and risks. Another goal is testing/debugging compilers. 
>> >> It would be convenient to optionally reconcile at least existing matching nmethods to the current stack of compiler directives (so bypass inlined methods). >> >> A natural way to eliminate the discrepancy between the result of compilation and the broken rule is to discard the compilation result, i.e. deoptimization. Prior to that we can try to re-compile the method, letting the compile broker perform it while taking the new directives stack into account. Re-compilation helps to prevent hot methods from executing in the interpreter. >> >> A new flag `-r` has been introduced for some directives related to compile commands: `Compiler.add_directives`, `Compiler.remove_directives`, `Compiler.clear_directives`. The default behavior has not changed (no flag). If the new flag is present, the command scans already compiled methods and puts methods that have any active non-default matching compiler directives to re-compilation if possible, otherwise marks them for deoptimization. There is currently no distinction which directives are found. In particular, this means that if there are rules for inlining into some method, it will be refreshed. On the other hand, if there are rules for a method and it was inlined, top-level methods won't be refreshed, but this can be achieved by having rules for them.
The pull request now contains 47 commits: > > - Resolved master conflicts > - Merge branch 'openjdk:master' into compiler-directives-force-update > - Merge branch 'openjdk:master' into compiler-directives-force-update > - Merge branch 'openjdk:master' into compiler-directives-force-update > - Merge branch 'openjdk:master' into compiler-directives-force-update > - Merge branch 'openjdk:master' into compiler-directives-force-update > - Merge branch 'openjdk:master' into compiler-directives-force-update > - Merge branch 'openjdk:master' into compiler-directives-force-update > - Merge branch 'openjdk:master' into compiler-directives-force-update > - Merge branch 'openjdk:master' into compiler-directives-force-update > - ... and 37 more: https://git.openjdk.org/jdk/compare/782206bc...ff39ac12 src/hotspot/share/ci/ciEnv.cpp line 1144: > 1142: > 1143: if (entry_bci == InvocationEntryBci) { > 1144: if (TieredCompilation) { Just a naive question. Why this check has been removed? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14111#discussion_r1522682325 From sspitsyn at openjdk.org Wed Mar 13 07:51:18 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 13 Mar 2024 07:51:18 GMT Subject: RFR: 8309271: A way to align already compiled methods with compiler directives [v29] In-Reply-To: References: Message-ID: On Tue, 12 Mar 2024 15:53:39 GMT, Dmitry Chuyko wrote: >> Compiler Control (https://openjdk.org/jeps/165) provides method-context dependent control of the JVM compilers (C1 and C2). The active directive stack is built from the directive files passed with the `-XX:CompilerDirectivesFile` diagnostic command-line option and the Compiler.add_directives diagnostic command. It is also possible to clear all directives or remove the top from the stack. >> >> A matching directive will be applied at method compilation time when such compilation is started. 
If directives are added or changed, but compilation does not start, then the state of compiled methods doesn't correspond to the rules. This is not an error, and it happens in long running applications when directives are added or removed after compilation of methods that could be matched. For example, the user decides that C2 compilation needs to be disabled for some method due to a compiler bug, issues such a directive but this does not affect the application behavior. In such case, the target application needs to be restarted, and such an operation can have high costs and risks. Another goal is testing/debugging compilers. >> >> It would be convenient to optionally reconcile at least existing matching nmethods to the current stack of compiler directives (so bypass inlined methods). >> >> A natural way to eliminate the discrepancy between the result of compilation and the broken rule is to discard the compilation result, i.e. deoptimization. Prior to that we can try to re-compile the method, letting the compile broker perform it while taking the new directives stack into account. Re-compilation helps to prevent hot methods from executing in the interpreter. >> >> A new flag `-r` has been introduced for some directives related to compile commands: `Compiler.add_directives`, `Compiler.remove_directives`, `Compiler.clear_directives`. The default behavior has not changed (no flag). If the new flag is present, the command scans already compiled methods and puts methods that have any active non-default matching compiler directives to re-compilation if possible, otherwise marks them for deoptimization. There is currently no distinction which directives are found. In particular, this means that if there are rules for inlining into some method, it will be refreshed. On the other hand, if there are rules for a method and it was inlined, top-level methods won't be refreshed, but this can be achieved by having rules for them.
>> >> In addition, a new diagnostic command `Compiler.replace_directives... > > Dmitry Chuyko has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 47 commits: > > - Resolved master conflicts > - Merge branch 'openjdk:master' into compiler-directives-force-update > - Merge branch 'openjdk:master' into compiler-directives-force-update > - Merge branch 'openjdk:master' into compiler-directives-force-update > - Merge branch 'openjdk:master' into compiler-directives-force-update > - Merge branch 'openjdk:master' into compiler-directives-force-update > - Merge branch 'openjdk:master' into compiler-directives-force-update > - Merge branch 'openjdk:master' into compiler-directives-force-update > - Merge branch 'openjdk:master' into compiler-directives-force-update > - Merge branch 'openjdk:master' into compiler-directives-force-update > - ... and 37 more: https://git.openjdk.org/jdk/compare/782206bc...ff39ac12 src/hotspot/share/services/diagnosticCommand.cpp line 928: > 926: DCmdWithParser(output, heap), > 927: _filename("filename", "Name of the directives file", "STRING", true), > 928: _refresh("-r", "Refresh affected methods.", "BOOLEAN", false, "false") { Nit: The dot is not needed at the end, I think. The same applies to lines: 945, 970 and 987. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14111#discussion_r1522688369 From ihse at openjdk.org Wed Mar 13 08:12:19 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Wed, 13 Mar 2024 08:12:19 GMT Subject: Integrated: 8327460: Compile tests with the same visibility rules as product code In-Reply-To: References: Message-ID: On Wed, 6 Mar 2024 11:33:30 GMT, Magnus Ihse Bursie wrote: > Currently, our symbol visibility handling for tests are sloppy; we only handle it properly on Windows. We need to bring it up to the same levels as product code. 
This is a prerequisite for [JDK-8327045](https://bugs.openjdk.org/browse/JDK-8327045), which in turn is a building block for Hermetic Java. This pull request has now been integrated. Changeset: cc9a8aba Author: Magnus Ihse Bursie URL: https://git.openjdk.org/jdk/commit/cc9a8aba67f4e240c8de2d1ae15d1b80bfa446a0 Stats: 304 lines in 39 files changed: 69 ins; 159 del; 76 mod 8327460: Compile tests with the same visibility rules as product code Reviewed-by: erikj, jvernee, dholmes, alanb ------------- PR: https://git.openjdk.org/jdk/pull/18135 From mbaesken at openjdk.org Wed Mar 13 08:30:16 2024 From: mbaesken at openjdk.org (Matthias Baesken) Date: Wed, 13 Mar 2024 08:30:16 GMT Subject: RFR: 8327701: Remove the xlc toolchain [v2] In-Reply-To: References: <3S9pyv9jOnJihfgqtK_CXhDDV62pzek2TrduKffVTnI=.4e8d8016-a299-4596-92f4-1bcf6bd206f4@github.com> Message-ID: On Tue, 12 Mar 2024 16:02:41 GMT, Matthias Baesken wrote: > > Please re-test. > > Hi Magnus, thanks for the adjustments. I reenabled your patch in our build/test queue . Builds and test results on AIX (product and fastdebug) are fine with the latest version of the PR . ------------- PR Comment: https://git.openjdk.org/jdk/pull/18172#issuecomment-1993808324 From ihse at openjdk.org Wed Mar 13 08:45:20 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Wed, 13 Mar 2024 08:45:20 GMT Subject: Integrated: 8327701: Remove the xlc toolchain In-Reply-To: References: Message-ID: On Fri, 8 Mar 2024 15:29:58 GMT, Magnus Ihse Bursie wrote: > As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. Thus the old xlc toolchain is no longer supported, and should be removed. This pull request has now been integrated. 
Changeset: 107cb536 Author: Magnus Ihse Bursie URL: https://git.openjdk.org/jdk/commit/107cb536e75509ad63b245d20772eb2c3f73d595 Stats: 327 lines in 19 files changed: 24 ins; 266 del; 37 mod 8327701: Remove the xlc toolchain Reviewed-by: jwaters, erikj ------------- PR: https://git.openjdk.org/jdk/pull/18172 From mbaesken at openjdk.org Wed Mar 13 08:53:23 2024 From: mbaesken at openjdk.org (Matthias Baesken) Date: Wed, 13 Mar 2024 08:53:23 GMT Subject: RFR: 8327701: Remove the xlc toolchain [v4] In-Reply-To: References: Message-ID: <-uA04de5qrhwkzJkst47cUY1ZdLaHDPn3KIclLd4b84=.e15c4d6c-2f79-4ab0-a1a0-8cc0dee581c8@github.com> On Tue, 12 Mar 2024 15:27:29 GMT, Magnus Ihse Bursie wrote: >> As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. Thus the old xlc toolchain is no longer supported, and should be removed. > > Magnus Ihse Bursie has updated the pull request incrementally with two additional commits since the last revision: > > - Replace CC_VERSION_OUTPUT with CC_VERSION_STRING > - We need CC_VERSION_OUTPUT before we can check it Seems `HOTSPOT_TOOLCHAIN_TYPE=xlc ` and `TARGET_COMPILER_xlc` macros stay, is this intended ? (not saying this is a wrong thing to do, but I believed first that all the xlc toolchain references are replaced by clang?) 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18172#issuecomment-1993844670 From ihse at openjdk.org Wed Mar 13 09:03:23 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Wed, 13 Mar 2024 09:03:23 GMT Subject: RFR: 8327701: Remove the xlc toolchain [v4] In-Reply-To: <-uA04de5qrhwkzJkst47cUY1ZdLaHDPn3KIclLd4b84=.e15c4d6c-2f79-4ab0-a1a0-8cc0dee581c8@github.com> References: <-uA04de5qrhwkzJkst47cUY1ZdLaHDPn3KIclLd4b84=.e15c4d6c-2f79-4ab0-a1a0-8cc0dee581c8@github.com> Message-ID: On Wed, 13 Mar 2024 08:50:10 GMT, Matthias Baesken wrote: >> Magnus Ihse Bursie has updated the pull request incrementally with two additional commits since the last revision: >> >> - Replace CC_VERSION_OUTPUT with CC_VERSION_STRING >> - We need CC_VERSION_OUTPUT before we can check it > > Seems `HOTSPOT_TOOLCHAIN_TYPE=xlc ` and `TARGET_COMPILER_xlc` macros stay, is this intended ? > (not saying this is a wrong thing to do, but I believed first that all the xlc toolchain references are replaced by clang?) @MBaesken Yes, I discussed this in length above: https://github.com/openjdk/jdk/pull/18172#issuecomment-1985939418. In short, changing that would mean changing behavior, and that is not something I want to do in a removal refactoring. Also, it is up to you guys to make it work correctly. But I really believe you should address this. Let me know if you need help with the build aspects. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18172#issuecomment-1993862922 From duke at openjdk.org Wed Mar 13 09:21:16 2024 From: duke at openjdk.org (Tom Shull) Date: Wed, 13 Mar 2024 09:21:16 GMT Subject: RFR: 8327999: Remove copy of unused registers for cpu features check in x86_64 AVX2 Poly1305 implementation In-Reply-To: References: Message-ID: On Tue, 12 Mar 2024 19:11:28 GMT, Srinivas Vamsi Parasa wrote: > The goal of this PR is to fix the bug reported in #17881 where data from unused registers (rbx, rcx, rdx) is being copied. 
Marked as reviewed by teshull at github.com (no known OpenJDK username). ------------- PR Review: https://git.openjdk.org/jdk/pull/18248#pullrequestreview-1933541513 From duke at openjdk.org Wed Mar 13 09:32:20 2024 From: duke at openjdk.org (Yude Lin) Date: Wed, 13 Mar 2024 09:32:20 GMT Subject: RFR: 8328064: Remove obsolete comments in constantPool and metadataFactory Message-ID: Clean up some obsolete comments: After 8243287, there is no longer the concept of "pseudo-string"; After 8072061, the "read_only" argument that the comment is referring to is removed. Cheers ------------- Commit messages: - 8328064: Remove obsolete comments in constantPool and metadataFactory Changes: https://git.openjdk.org/jdk/pull/18266/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18266&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8328064 Stats: 3 lines in 2 files changed: 0 ins; 2 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18266.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18266/head:pull/18266 PR: https://git.openjdk.org/jdk/pull/18266 From rrich at openjdk.org Wed Mar 13 09:47:20 2024 From: rrich at openjdk.org (Richard Reingruber) Date: Wed, 13 Mar 2024 09:47:20 GMT Subject: RFR: 8327990: [macosx-aarch64] JFR enters VM without WXWrite Message-ID: This pr changes `JfrJvmtiAgent::retransform_classes()` and `jfr_set_enabled` to switch to `WXWrite` before transitioning to the vm. Testing: make test TEST=jdk/jfr/event/runtime/TestClassLoadEvent.java TEST_VM_OPTS=-XX:+AssertWXAtThreadSync make test TEST=compiler/intrinsics/klass/CastNullCheckDroppingsTest.java TEST_VM_OPTS=-XX:+AssertWXAtThreadSync More tests are pending. 
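The idea of switching to `WXWrite` for the duration of a scope, as described above, follows the usual RAII pattern. Here is a toy model of that pattern, offered as a sketch only: `WXMode`, `g_wx_mode`, `ScopedWXMode` and `may_enter_vm` are hypothetical names invented for illustration, the per-thread state is a plain variable standing in for the platform's real protection toggle, and the choice of `Exec` as the starting mode is an assumption of the model, not a statement about HotSpot.

```cpp
#include <cassert>

// Hypothetical model of a per-thread write-xor-execute mode.
enum class WXMode { Exec, Write };

// Stand-in for the real per-thread state (assumed to start in Exec here).
thread_local WXMode g_wx_mode = WXMode::Exec;

// RAII guard: switch to the wanted mode on entry and restore the previous
// mode on scope exit, so nested guards unwind in the right order.
class ScopedWXMode {
 public:
  explicit ScopedWXMode(WXMode wanted) : _saved(g_wx_mode) {
    g_wx_mode = wanted;
  }
  ~ScopedWXMode() { g_wx_mode = _saved; }

  ScopedWXMode(const ScopedWXMode&) = delete;
  ScopedWXMode& operator=(const ScopedWXMode&) = delete;

 private:
  WXMode _saved;
};

// Stand-in for the kind of check an assertion at a thread-state
// transition could make in this model.
bool may_enter_vm() {
  return g_wx_mode == WXMode::Write;
}
```

Placing the guard at the top of a function covers every transition reached below it. The trade-off is that all code between those transitions also runs in the guard's mode.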
------------- Commit messages: - Switch to WXWrite before entering the vm Changes: https://git.openjdk.org/jdk/pull/18238/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18238&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8327990 Stats: 5 lines in 2 files changed: 4 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18238.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18238/head:pull/18238 PR: https://git.openjdk.org/jdk/pull/18238 From dholmes at openjdk.org Wed Mar 13 09:52:15 2024 From: dholmes at openjdk.org (David Holmes) Date: Wed, 13 Mar 2024 09:52:15 GMT Subject: RFR: 8327990: [macosx-aarch64] JFR enters VM without WXWrite In-Reply-To: References: Message-ID: On Tue, 12 Mar 2024 15:05:00 GMT, Richard Reingruber wrote: > This pr changes `JfrJvmtiAgent::retransform_classes()` and `jfr_set_enabled` to switch to `WXWrite` before transitioning to the vm. > > Testing: > make test TEST=jdk/jfr/event/runtime/TestClassLoadEvent.java TEST_VM_OPTS=-XX:+AssertWXAtThreadSync > make test TEST=compiler/intrinsics/klass/CastNullCheckDroppingsTest.java TEST_VM_OPTS=-XX:+AssertWXAtThreadSync > > More tests are pending. As I wrote in JBS, shouldn't this be handled by `ThreadInVMfromNative`? 
------------- PR Review: https://git.openjdk.org/jdk/pull/18238#pullrequestreview-1933634023 From fbredberg at openjdk.org Wed Mar 13 09:52:17 2024 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Wed, 13 Mar 2024 09:52:17 GMT Subject: RFR: 8314508: Improve how relativized pointers are printed by frame::describe [v3] In-Reply-To: <_RbkJNBhsAAY1ksaQkd8WYfO5EFlIr1SSFCu7JfraKU=.1083a20d-14f8-43bb-80ef-d180881a9295@github.com> References: <_RbkJNBhsAAY1ksaQkd8WYfO5EFlIr1SSFCu7JfraKU=.1083a20d-14f8-43bb-80ef-d180881a9295@github.com> Message-ID: On Tue, 12 Mar 2024 15:54:39 GMT, Fredrik Bredberg wrote: >> The output from frame::describe has been improved so that it now prints the actual derelativized pointer value, regardless if it's on the stack or on the heap. It also clearly shows which members of the fixed stackframe are relativized. See sample output of a riscv64 stackframe below: >> >> >> [6.693s][trace][continuations] 0x000000409e14cc88: 0x00000000eeceaf58 locals for #10 >> [6.693s][trace][continuations] local 0 >> [6.693s][trace][continuations] oop for #10 >> [6.693s][trace][continuations] 0x000000409e14cc80: 0x00000000eec34548 local 1 >> [6.693s][trace][continuations] oop for #10 >> [6.693s][trace][continuations] 0x000000409e14cc78: 0x00000000eeceac58 local 2 >> [6.693s][trace][continuations] oop for #10 >> [6.693s][trace][continuations] 0x000000409e14cc70: 0x00000000eee03eb0 #10 method java.lang.invoke.LambdaForm$DMH/0x0000000050001000.invokeVirtual(Ljava/lang/Object;Ljava/lang/Object;)V @ 10 >> [6.693s][trace][continuations] - 3 locals 3 max stack >> [6.693s][trace][continuations] - codelet: return entry points >> [6.693s][trace][continuations] saved fp >> [6.693s][trace][continuations] 0x000000409e14cc68: 0x00000040137dc790 return address >> [6.693s][trace][continuations] 0x000000409e14cc60: 0x000000409e14ccf0 >> [6.693s][trace][continuations] 0x000000409e14cc58: 0x000000409e14cc60 interpreter_frame_sender_sp >> [6.693s][trace][continuations] 
0x000000409e14cc50: 0x000000409e14cc00 interpreter_frame_last_sp (relativized: fp-14) >> [6.693s][trace][continuations] 0x000000409e14cc48: 0x0000004050401a88 interpreter_frame_method >> [6.693s][trace][continuations] 0x000000409e14cc40: 0x0000000000000000 interpreter_frame_mdp >> [6.693s][trace][continuations] 0x000000409e14cc38: 0x000000409e14cbe0 interpreter_frame_extended_sp (relativized: fp-18) >> [6.693s][trace][continuations] 0x000000409e14cc30: 0x00000000ef093110 interpreter_frame_mirror >> [6.693s][trace][continuations] oop for #10 >> [6.693s][trace][continuations] 0x000000409e14cc28: 0x0000004050401c68 interpreter_frame_cache >> [6.693s][trace][continuations] 0x000000409e14cc20: 0x000000409e14cc88 interpreter_frame_locals (relativized: fp+3) >> [6.693s][trace][continuatio... > > Fredrik Bredberg has updated the pull request incrementally with one additional commit since the last revision: > > Update after review Thanks for the review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18102#issuecomment-1993979180 From jkern at openjdk.org Wed Mar 13 09:53:22 2024 From: jkern at openjdk.org (Joachim Kern) Date: Wed, 13 Mar 2024 09:53:22 GMT Subject: RFR: 8327701: Remove the xlc toolchain [v4] In-Reply-To: References: Message-ID: On Tue, 12 Mar 2024 15:27:29 GMT, Magnus Ihse Bursie wrote: >> As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. Thus the old xlc toolchain is no longer supported, and should be removed. > > Magnus Ihse Bursie has updated the pull request incrementally with two additional commits since the last revision: > > - Replace CC_VERSION_OUTPUT with CC_VERSION_STRING > - We need CC_VERSION_OUTPUT before we can check it e.g. We should change the HOTSPOT_TOOLCHAIN_TYPE=xlc to aix, because it is not toolchain, but OS related. 
As a consequence the globalDefinitions_xlc.hpp will become globalDefinitions_aix.hpp ------------- PR Comment: https://git.openjdk.org/jdk/pull/18172#issuecomment-1993979991 From rrich at openjdk.org Wed Mar 13 10:02:13 2024 From: rrich at openjdk.org (Richard Reingruber) Date: Wed, 13 Mar 2024 10:02:13 GMT Subject: RFR: 8327990: [macosx-aarch64] JFR enters VM without WXWrite In-Reply-To: References: Message-ID: On Wed, 13 Mar 2024 09:49:19 GMT, David Holmes wrote: > As I wrote in JBS, shouldn't this be handled by `ThreadInVMfromNative`? (I wanted to publish the PR before answering your comment) This would be reasonable in my opinion. I've hoisted setting `WXWrite` mode in `JfrJvmtiAgent::retransform_classes()` because multiple instances of `ThreadInVMfromNative` are reached. This is likely not even necessary. Still exceptions could be made if there are usages of `ThreadInVMfromNative` where it is needed. While I agree I'd prefer to do it as a separate enhancement. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18238#issuecomment-1994000304 From ihse at openjdk.org Wed Mar 13 10:28:25 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Wed, 13 Mar 2024 10:28:25 GMT Subject: RFR: 8327701: Remove the xlc toolchain [v4] In-Reply-To: References: Message-ID: On Wed, 13 Mar 2024 09:50:20 GMT, Joachim Kern wrote: > e.g. We should change the HOTSPOT_TOOLCHAIN_TYPE=xlc to aix, because it is not toolchain, but OS related. As a consequence the globalDefinitions_xlc.hpp will become globalDefinitions_aix.hpp No, it's not that simple. First, the `globalDefinitions_` files are included on a per-toolchain basis, not OS basis. Secondly, I think you will want to have the stuff in globalDefinitions_gcc.h. In fact, if you compare globalDefinitions_gcc.h and globalDefinitions_xlc.h side-by-side, you see that they are much more similar than you'd perhaps naively expect. 
The remaining differences will probably be better expressed as #ifdef AIX inside globalDefinitions_gcc.h, I assume. (But of course, in the end this is up to the hotspot team.) I recommend doing such a side-by-side check yourself to get an understanding of the actual differences. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18172#issuecomment-1994049434 From dholmes at openjdk.org Wed Mar 13 12:22:15 2024 From: dholmes at openjdk.org (David Holmes) Date: Wed, 13 Mar 2024 12:22:15 GMT Subject: RFR: 8327990: [macosx-aarch64] JFR enters VM without WXWrite In-Reply-To: References: Message-ID: On Tue, 12 Mar 2024 15:05:00 GMT, Richard Reingruber wrote: > This pr changes `JfrJvmtiAgent::retransform_classes()` and `jfr_set_enabled` to switch to `WXWrite` before transitioning to the vm. > > Testing: > make test TEST=jdk/jfr/event/runtime/TestClassLoadEvent.java TEST_VM_OPTS=-XX:+AssertWXAtThreadSync > make test TEST=compiler/intrinsics/klass/CastNullCheckDroppingsTest.java TEST_VM_OPTS=-XX:+AssertWXAtThreadSync > > More tests are pending. src/hotspot/share/jfr/instrumentation/jfrJvmtiAgent.cpp line 160: > 158: ResourceMark rm(THREAD); > 159: // WXWrite is needed before entering the vm below and in callee methods. > 160: MACOS_AARCH64_ONLY(ThreadWXEnable __wx(WXWrite, THREAD)); I understand you placed this here to cover the transition inside `create_classes_array` and the immediate one at line 170, but doesn't this risk having the wrong WX state for code in between? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18238#discussion_r1523127288 From gli at openjdk.org Wed Mar 13 12:53:21 2024 From: gli at openjdk.org (Guoxiong Li) Date: Wed, 13 Mar 2024 12:53:21 GMT Subject: RFR: 8328081: Remove unnecessary return statement in method `markWord::displaced_mark_helper` Message-ID: Hi all, This trivial patch removes unnecessary return statement in method `markWord::displaced_mark_helper`. Thanks for taking the time to review. 
Best Regards, -- Guoxiong ------------- Commit messages: - Remove unnecessary code. Changes: https://git.openjdk.org/jdk/pull/18272/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18272&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8328081 Stats: 2 lines in 1 file changed: 0 ins; 1 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18272.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18272/head:pull/18272 PR: https://git.openjdk.org/jdk/pull/18272 From gli at openjdk.org Wed Mar 13 13:39:46 2024 From: gli at openjdk.org (Guoxiong Li) Date: Wed, 13 Mar 2024 13:39:46 GMT Subject: RFR: 8328081: Remove unnecessary return statement in method `markWord::displaced_mark_helper` [v2] In-Reply-To: References: Message-ID: > Hi all, > > This trivial patch removes unnecessary return statement in method `markWord::displaced_mark_helper`. > > Thanks for taking the time to review. > > Best Regards, > -- Guoxiong Guoxiong Li has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: - Merge branch 'master' into REMOVE_UNNECESSARY_CODE - Remove unnecessary code. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/18272/files - new: https://git.openjdk.org/jdk/pull/18272/files/1fa89aa4..01ca4b01 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18272&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18272&range=00-01 Stats: 9049 lines in 201 files changed: 5387 ins; 2872 del; 790 mod Patch: https://git.openjdk.org/jdk/pull/18272.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18272/head:pull/18272 PR: https://git.openjdk.org/jdk/pull/18272 From rrich at openjdk.org Wed Mar 13 13:56:14 2024 From: rrich at openjdk.org (Richard Reingruber) Date: Wed, 13 Mar 2024 13:56:14 GMT Subject: RFR: 8327990: [macosx-aarch64] JFR enters VM without WXWrite In-Reply-To: References: Message-ID: On Wed, 13 Mar 2024 12:19:34 GMT, David Holmes wrote: >> This pr changes `JfrJvmtiAgent::retransform_classes()` and `jfr_set_enabled` to switch to `WXWrite` before transitioning to the vm. >> >> Testing: >> make test TEST=jdk/jfr/event/runtime/TestClassLoadEvent.java TEST_VM_OPTS=-XX:+AssertWXAtThreadSync >> make test TEST=compiler/intrinsics/klass/CastNullCheckDroppingsTest.java TEST_VM_OPTS=-XX:+AssertWXAtThreadSync >> >> More tests are pending. > > src/hotspot/share/jfr/instrumentation/jfrJvmtiAgent.cpp line 160: > >> 158: ResourceMark rm(THREAD); >> 159: // WXWrite is needed before entering the vm below and in callee methods. >> 160: MACOS_AARCH64_ONLY(ThreadWXEnable __wx(WXWrite, THREAD)); > > I understand you placed this here to cover the transition inside `create_classes_array` and the immediate one at line 170, but doesn't this risk having the wrong WX state for code in between? I've asked this myself (after making the change). Being in `WXWrite` mode would be wrong if the thread would execute dynamically generated code. There's not too much happening outside the scope of the `ThreadInVMfromNative` instances. 
I see JNI calls (`GetObjectArrayElement`, `ExceptionOccurred`) and a JVMTI call (`RetransformClasses`) but these are safe because the callees enter the vm right away. We even avoid switching to `WXWrite` and back there. So I thought it would be ok to coarsen the WXMode switching. But maybe it's still better to avoid any risk especially since there's likely no performance effect. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18238#discussion_r1523305040 From matsaave at openjdk.org Wed Mar 13 14:03:19 2024 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Wed, 13 Mar 2024 14:03:19 GMT Subject: RFR: 8251330: Reorder CDS archived heap to speed up relocation [v7] In-Reply-To: References: <1r_kw1PW5eKQi3r1ZcdP7ygK9YSCFNmt-ikUcg3f_kA=.45137343-d8ee-41cc-bd98-e96bf7e9ddfd@github.com> Message-ID: On Tue, 12 Mar 2024 17:46:21 GMT, Calvin Cheung wrote: >> Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: >> >> Nits > > Looks good. I just have a minor comment below. > Also, the copyright year of some files needs to be updated. Thanks for the reviews @calvinccheung and @iklam! ------------- PR Comment: https://git.openjdk.org/jdk/pull/17350#issuecomment-1994475836 From matsaave at openjdk.org Wed Mar 13 14:03:20 2024 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Wed, 13 Mar 2024 14:03:20 GMT Subject: Integrated: 8251330: Reorder CDS archived heap to speed up relocation In-Reply-To: References: Message-ID: On Wed, 10 Jan 2024 17:19:39 GMT, Matias Saavedra Silva wrote: > We should reorder the archived heap to segregate the objects that don't need marking. This will save space in the archive and improve start-up time. > > This patch reorders the archived heap to segregate the objects that don't need marking. The leading zeros present in the bitmaps thanks to the reordering can be easily removed, and thus the size of the archive is reduced.
> > Given that the bitmaps are word aligned, some leading zeroes will remain. Verified with tier 1-5 tests. > > Metrics, courtesy of @iklam : > ```calculate_oopmap: objects = 15262 (507904 bytes, 332752 bytes don't need marking), embedded oops = 8408, nulls = 54 > Oopmap = 15872 bytes > > calculate_oopmap: objects = 4590 (335872 bytes, 178120 bytes don't need marking), embedded oops = 46487, nulls = 29019 > Oopmap = 10496 bytes > > (332752 + 178120) / (507904 + 335872.0) = 0.6054592688106796 > More than 60% of the space used by the archived heap doesn't need to be marked by the oopmap. > > $ java -Xshare:dump -Xlog:cds+map | grep lead > [3.777s][info][cds,map] - heap_oopmap_leading_zeros: 143286 > [3.777s][info][cds,map] - heap_ptrmap_leading_zeros: 50713 > > So we can reduce the "bm" region by (143286 + 50713) / 8 = 24249 bytes. > > Current output: > $ java -XX:+UseNewCode -Xshare:dump -Xlog:cds+map | grep lead > [5.339s][info][cds,map] - heap_oopmap_leading_zeros: 26 > [5.339s][info][cds,map] - heap_ptrmap_leading_zeros: 8 This pull request has now been integrated. Changeset: 7e05a703 Author: Matias Saavedra Silva URL: https://git.openjdk.org/jdk/commit/7e05a70301796288cb3bcc6be8fb619b6ce600bc Stats: 212 lines in 9 files changed: 178 ins; 8 del; 26 mod 8251330: Reorder CDS archived heap to speed up relocation Reviewed-by: iklam, ccheung ------------- PR: https://git.openjdk.org/jdk/pull/17350 From fbredberg at openjdk.org Wed Mar 13 15:18:19 2024 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Wed, 13 Mar 2024 15:18:19 GMT Subject: Integrated: 8314508: Improve how relativized pointers are printed by frame::describe In-Reply-To: References: Message-ID: On Mon, 4 Mar 2024 09:38:26 GMT, Fredrik Bredberg wrote: > The output from frame::describe has been improved so that it now prints the actual derelativized pointer value, regardless if it's on the stack or on the heap. It also clearly shows which members of the fixed stackframe are relativized. 
See sample output of a riscv64 stackframe below: > > > [6.693s][trace][continuations] 0x000000409e14cc88: 0x00000000eeceaf58 locals for #10 > [6.693s][trace][continuations] local 0 > [6.693s][trace][continuations] oop for #10 > [6.693s][trace][continuations] 0x000000409e14cc80: 0x00000000eec34548 local 1 > [6.693s][trace][continuations] oop for #10 > [6.693s][trace][continuations] 0x000000409e14cc78: 0x00000000eeceac58 local 2 > [6.693s][trace][continuations] oop for #10 > [6.693s][trace][continuations] 0x000000409e14cc70: 0x00000000eee03eb0 #10 method java.lang.invoke.LambdaForm$DMH/0x0000000050001000.invokeVirtual(Ljava/lang/Object;Ljava/lang/Object;)V @ 10 > [6.693s][trace][continuations] - 3 locals 3 max stack > [6.693s][trace][continuations] - codelet: return entry points > [6.693s][trace][continuations] saved fp > [6.693s][trace][continuations] 0x000000409e14cc68: 0x00000040137dc790 return address > [6.693s][trace][continuations] 0x000000409e14cc60: 0x000000409e14ccf0 > [6.693s][trace][continuations] 0x000000409e14cc58: 0x000000409e14cc60 interpreter_frame_sender_sp > [6.693s][trace][continuations] 0x000000409e14cc50: 0x000000409e14cc00 interpreter_frame_last_sp (relativized: fp-14) > [6.693s][trace][continuations] 0x000000409e14cc48: 0x0000004050401a88 interpreter_frame_method > [6.693s][trace][continuations] 0x000000409e14cc40: 0x0000000000000000 interpreter_frame_mdp > [6.693s][trace][continuations] 0x000000409e14cc38: 0x000000409e14cbe0 interpreter_frame_extended_sp (relativized: fp-18) > [6.693s][trace][continuations] 0x000000409e14cc30: 0x00000000ef093110 interpreter_frame_mirror > [6.693s][trace][continuations] oop for #10 > [6.693s][trace][continuations] 0x000000409e14cc28: 0x0000004050401c68 interpreter_frame_cache > [6.693s][trace][continuations] 0x000000409e14cc20: 0x000000409e14cc88 interpreter_frame_locals (relativized: fp+3) > [6.693s][trace][continuations] 0x000000409e14cc18: 0x0000004050401a7a interpre... 
This pull request has now been integrated. Changeset: 0db62311 Author: Fredrik Bredberg Committer: Coleen Phillimore URL: https://git.openjdk.org/jdk/commit/0db62311980cd045e5a9e2c030b653aacf104825 Stats: 17 lines in 1 file changed: 15 ins; 0 del; 2 mod 8314508: Improve how relativized pointers are printed by frame::describe Reviewed-by: coleenp, pchilanomate ------------- PR: https://git.openjdk.org/jdk/pull/18102 From ayang at openjdk.org Wed Mar 13 15:26:20 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 13 Mar 2024 15:26:20 GMT Subject: RFR: 8328105: Parallel: Obsolete ParallelOldDeadWoodLimiterMean and ParallelOldDeadWoodLimiterStdDev Message-ID: Simple refactoring Parallel full-gc dead word calculation and removing two jvm flags. Test: tier1-3 ------------- Commit messages: - pgc-dead-ratio Changes: https://git.openjdk.org/jdk/pull/18278/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18278&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8328105 Stats: 244 lines in 7 files changed: 8 ins; 219 del; 17 mod Patch: https://git.openjdk.org/jdk/pull/18278.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18278/head:pull/18278 PR: https://git.openjdk.org/jdk/pull/18278 From shade at openjdk.org Wed Mar 13 15:32:15 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 13 Mar 2024 15:32:15 GMT Subject: RFR: 8328081: Remove unnecessary return statement in method `markWord::displaced_mark_helper` [v2] In-Reply-To: References: Message-ID: <6MZWOlzI7RAK7wCkH71SDVVSphECvy947ymzmGB4u8I=.26d98137-3948-4cd2-b4d2-4043ada6a077@github.com> On Wed, 13 Mar 2024 13:39:46 GMT, Guoxiong Li wrote: >> Hi all, >> >> This trivial patch removes unnecessary return statement in method `markWord::displaced_mark_helper`. >> >> Thanks for taking the time to review. >> >> Best Regards, >> -- Guoxiong > > Guoxiong Li has updated the pull request with a new target base due to a merge or a rebase. 
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: > > - Merge branch 'master' into REMOVE_UNNECESSARY_CODE > - Remove unnecessary code. I think we do this as a matter of course, in case `fatal` is not handled as non-returning by compilers. Does this fix any existing compiler warning/failure? ------------- PR Comment: https://git.openjdk.org/jdk/pull/18272#issuecomment-1994668646 From gli at openjdk.org Wed Mar 13 15:46:15 2024 From: gli at openjdk.org (Guoxiong Li) Date: Wed, 13 Mar 2024 15:46:15 GMT Subject: RFR: 8328081: Remove unnecessary return statement in method `markWord::displaced_mark_helper` [v2] In-Reply-To: <6MZWOlzI7RAK7wCkH71SDVVSphECvy947ymzmGB4u8I=.26d98137-3948-4cd2-b4d2-4043ada6a077@github.com> References: <6MZWOlzI7RAK7wCkH71SDVVSphECvy947ymzmGB4u8I=.26d98137-3948-4cd2-b4d2-4043ada6a077@github.com> Message-ID: On Wed, 13 Mar 2024 15:29:23 GMT, Aleksey Shipilev wrote: > I think we do this as a matter of course, in case `fatal` is not handled as non-returning by compilers. The method `markWord::set_displaced_mark_helper` has no return statement at the end. So I want `markWord::displaced_mark_helper` to be consistent with `markWord::set_displaced_mark_helper`. > Does this fix any existing compiler warning/failure? I didn't notice any compiler warning, but the IDE `CLion` warns that the return statement is `unreachable code`.
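For illustration only (a minimal sketch, not code from the patch): the question above is whether the compiler knows that the error reporter never returns. The names `report_fatal` and `displaced_value` below are hypothetical stand-ins, not HotSpot's `fatal` or the `markWord` methods, and the sketch assumes a C++11 compiler that honors `[[noreturn]]`.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical stand-in for a fatal-error reporter (NOT HotSpot's `fatal`).
// [[noreturn]] tells the compiler that control never comes back to the caller.
[[noreturn]] void report_fatal(const char* msg) {
  std::fprintf(stderr, "fatal: %s\n", msg);
  std::abort();
}

// Hypothetical helper shaped like the method under discussion: every
// supported case returns early, and the fall-through case is a fatal error.
int displaced_value(int kind) {
  if (kind == 0) {
    return 10;
  }
  if (kind == 1) {
    return 20;
  }
  report_fatal("bad kind");
  // No trailing return here. Because report_fatal is [[noreturn]], a
  // statement placed after the call would be unreachable, which is the kind
  // of code an IDE flags. Without the attribute, compilers may instead warn
  // that control reaches the end of a non-void function.
}
```

Whether the trailing return can be dropped safely therefore seems to depend on every supported toolchain treating the reporter as non-returning, which is the concern raised in the quoted comment.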
------------- PR Comment: https://git.openjdk.org/jdk/pull/18272#issuecomment-1994715843 From duke at openjdk.org Wed Mar 13 15:52:24 2024 From: duke at openjdk.org (duke) Date: Wed, 13 Mar 2024 15:52:24 GMT Subject: Withdrawn: JDK-8314890: Reduce number of loads for Klass decoding in static code In-Reply-To: References: Message-ID: On Tue, 22 Aug 2023 15:48:40 GMT, Thomas Stuefe wrote: > Small change that reduces the number of loads generated by the C++ compiler for a narrow Klass decoding operation (`CompressedKlassPointers::decode_xxx()`. > > Stock: three loads (with two probably sharing a cache line) - UseCompressedClassPointers, encoding base and shift. > > > 8b7b62: 48 8d 05 7f 1b c3 00 lea 0xc31b7f(%rip),%rax # 14e96e8 > 8b7b69: 0f b6 00 movzbl (%rax),%eax > 8b7b6c: 84 c0 test %al,%al > 8b7b6e: 0f 84 9c 00 00 00 je 8b7c10 <_ZN10HeapRegion14object_iterateEP13ObjectClosure+0x260> > 8b7b74: 48 8d 15 05 62 c6 00 lea 0xc66205(%rip),%rdx # 151dd80 <_ZN23CompressedKlassPointers6_shiftE> > 8b7b7b: 8b 7b 08 mov 0x8(%rbx),%edi > 8b7b7e: 8b 0a mov (%rdx),%ecx > 8b7b80: 48 8d 15 01 62 c6 00 lea 0xc66201(%rip),%rdx # 151dd88 <_ZN23CompressedKlassPointers5_baseE> > 8b7b87: 48 d3 e7 shl %cl,%rdi > 8b7b8a: 48 03 3a add (%rdx),%rdi > > > Patched: one load loads all three. Since shift occupies the lowest 8 bits, compiled code uses 8bit register; ditto the UseCompressedOops flag. > > > 8ba302: 48 8d 05 97 9c c2 00 lea 0xc29c97(%rip),%rax # 14e3fa0 <_ZN23CompressedKlassPointers6_comboE> > 8ba309: 48 8b 08 mov (%rax),%rcx > 8ba30c: f6 c5 01 test $0x1,%ch # use compressed klass pointers? 
> 8ba30f: 0f 84 9b 00 00 00 je 8ba3b0 <_ZN10HeapRegion14object_iterateEP13ObjectClosure+0x260> > 8ba315: 8b 7b 08 mov 0x8(%rbx),%edi > 8ba318: 48 d3 e7 shl %cl,%rdi # shift > 8ba31b: 66 31 c9 xor %cx,%cx # zero out lower 16 bits of base > 8ba31e: 48 01 cf add %rcx,%rdi # add base > 8ba321: 8b 4f 08 mov 0x8(%rdi),%ecx > > --- > > Performance measurements: > > G1, doing a full GC over a heap filled with 256 million live j.l.Object instances. > > I see a reduction of Full Pause times between 1.2% and 5%. I am unsure how reliable these numbers are since, despite my efforts (running tests on isolated CPUs etc.), the standard deviation was quite high at ±4%. Still, in general, numbers seemed to go down rather than up. > > --- > > Future extensions: > > This patch uses the fact that the encoding base is aligned to metaspace reserve alignment (16 MB). We only use 16 of those 24 bits of alignment shadow and could us... This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/15389 From coleenp at openjdk.org Wed Mar 13 16:11:15 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 13 Mar 2024 16:11:15 GMT Subject: RFR: 8328064: Remove obsolete comments in constantPool and metadataFactory In-Reply-To: References: Message-ID: On Wed, 13 Mar 2024 09:27:41 GMT, Yude Lin wrote: > Clean up some obsolete comments: > After 8243287, there is no longer the concept of "pseudo-string"; > After 8072061, the "read_only" argument that the comment is referring to is removed. > > Cheers Looks good and also trivial. Thanks for the cleanup. ------------- Marked as reviewed by coleenp (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/18266#pullrequestreview-1934618914 From rrich at openjdk.org Wed Mar 13 16:43:12 2024 From: rrich at openjdk.org (Richard Reingruber) Date: Wed, 13 Mar 2024 16:43:12 GMT Subject: RFR: 8327990: [macosx-aarch64] JFR enters VM without WXWrite In-Reply-To: References: Message-ID: On Tue, 12 Mar 2024 15:05:00 GMT, Richard Reingruber wrote: > This pr changes `JfrJvmtiAgent::retransform_classes()` and `jfr_set_enabled` to switch to `WXWrite` before transitioning to the vm. > > Testing: > make test TEST=jdk/jfr/event/runtime/TestClassLoadEvent.java TEST_VM_OPTS=-XX:+AssertWXAtThreadSync > make test TEST=compiler/intrinsics/klass/CastNullCheckDroppingsTest.java TEST_VM_OPTS=-XX:+AssertWXAtThreadSync > > More tests are pending. @MBaesken found 2 more locations in jvmti that need switching to `WXWrite` JvmtiExport::get_jvmti_interface GetCarrierThread Both use `ThreadInVMfromNative`. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18238#issuecomment-1994930681 From dchuyko at openjdk.org Wed Mar 13 16:58:23 2024 From: dchuyko at openjdk.org (Dmitry Chuyko) Date: Wed, 13 Mar 2024 16:58:23 GMT Subject: RFR: 8309271: A way to align already compiled methods with compiler directives [v29] In-Reply-To: References: Message-ID: On Wed, 13 Mar 2024 07:43:35 GMT, Serguei Spitsyn wrote: >> Dmitry Chuyko has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 47 commits: >> >> - Resolved master conflicts >> - Merge branch 'openjdk:master' into compiler-directives-force-update >> - Merge branch 'openjdk:master' into compiler-directives-force-update >> - Merge branch 'openjdk:master' into compiler-directives-force-update >> - Merge branch 'openjdk:master' into compiler-directives-force-update >> - Merge branch 'openjdk:master' into compiler-directives-force-update >> - Merge branch 'openjdk:master' into compiler-directives-force-update >> - Merge branch 'openjdk:master' into compiler-directives-force-update >> - Merge branch 'openjdk:master' into compiler-directives-force-update >> - Merge branch 'openjdk:master' into compiler-directives-force-update >> - ... and 37 more: https://git.openjdk.org/jdk/compare/782206bc...ff39ac12 > > src/hotspot/share/ci/ciEnv.cpp line 1144: > >> 1142: >> 1143: if (entry_bci == InvocationEntryBci) { >> 1144: if (TieredCompilation) { > > Just a naive question. Why has this check been removed? We want to allow a C2 version of a method to be replaced by another C2 version of the same method in both tiered and non-tiered mode, which was not allowed before. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14111#discussion_r1523618332 From dchuyko at openjdk.org Wed Mar 13 17:14:28 2024 From: dchuyko at openjdk.org (Dmitry Chuyko) Date: Wed, 13 Mar 2024 17:14:28 GMT Subject: RFR: 8309271: A way to align already compiled methods with compiler directives [v30] In-Reply-To: References: Message-ID: > Compiler Control (https://openjdk.org/jeps/165) provides method-context dependent control of the JVM compilers (C1 and C2). The active directive stack is built from the directive files passed with the `-XX:CompilerDirectivesFile` diagnostic command-line option and the Compiler.add_directives diagnostic command. It is also possible to clear all directives or remove the top from the stack.
> > A matching directive will be applied at method compilation time when such compilation is started. If directives are added or changed, but compilation does not start, then the state of compiled methods doesn't correspond to the rules. This is not an error, and it happens in long-running applications when directives are added or removed after compilation of methods that could be matched. For example, the user decides that C2 compilation needs to be disabled for some method due to a compiler bug, issues such a directive, but this does not affect the application behavior. In such a case, the target application needs to be restarted, and such an operation can have high costs and risks. Another goal is testing/debugging compilers. > > It would be convenient to optionally reconcile at least existing matching nmethods to the current stack of compiler directives (so bypass inlined methods). > > A natural way to eliminate the discrepancy between the result of compilation and the broken rule is to discard the compilation result, i.e. deoptimization. Prior to that we can try to re-compile the method, letting the compile broker perform it while taking the new directives stack into account. Re-compilation helps to prevent hot methods from executing in the interpreter. > > A new flag `-r` has been introduced for some directives related to compile commands: `Compiler.add_directives`, `Compiler.remove_directives`, `Compiler.clear_directives`. The default behavior has not changed (no flag). If the new flag is present, the command scans already compiled methods and submits methods that have any active non-default matching compiler directives for re-compilation if possible, otherwise marks them for deoptimization. There is currently no distinction as to which directives are found. In particular, this means that if there are rules for inlining into some method, it will be refreshed.
On the other hand, if there are rules for a method and it was inlined, top-level methods won't be refreshed, but this can be achieved by having rules for them. > > In addition, a new diagnostic command `Compiler.replace_directives`, has been added for ... Dmitry Chuyko has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 48 commits: - Merge branch 'openjdk:master' into compiler-directives-force-update - Resolved master conflicts - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - ... 
and 38 more: https://git.openjdk.org/jdk/compare/5cae7d20...22b42347 ------------- Changes: https://git.openjdk.org/jdk/pull/14111/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14111&range=29 Stats: 381 lines in 15 files changed: 348 ins; 3 del; 30 mod Patch: https://git.openjdk.org/jdk/pull/14111.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14111/head:pull/14111 PR: https://git.openjdk.org/jdk/pull/14111 From jbhateja at openjdk.org Wed Mar 13 17:26:15 2024 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Wed, 13 Mar 2024 17:26:15 GMT Subject: RFR: 8327999: Remove copy of unused registers for cpu features check in x86_64 AVX2 Poly1305 implementation In-Reply-To: References: Message-ID: <9zmxLSpnOGMODpkCnJ94bDMvbp50KzwLWjhmb5R9lNk=.dbd611b7-3ee2-4a44-961f-564c37c16331@github.com> On Tue, 12 Mar 2024 19:11:28 GMT, Srinivas Vamsi Parasa wrote: > The goal of this PR is to fix the bug reported in #17881 where data from unused registers (rbx, rcx, rdx) is being copied. LGTM ------------- Marked as reviewed by jbhateja (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18248#pullrequestreview-1934789200 From dchuyko at openjdk.org Wed Mar 13 17:31:32 2024 From: dchuyko at openjdk.org (Dmitry Chuyko) Date: Wed, 13 Mar 2024 17:31:32 GMT Subject: RFR: 8309271: A way to align already compiled methods with compiler directives [v31] In-Reply-To: References: Message-ID: > Compiler Control (https://openjdk.org/jeps/165) provides method-context dependent control of the JVM compilers (C1 and C2). The active directive stack is built from the directive files passed with the `-XX:CompilerDirectivesFile` diagnostic command-line option and the Compiler.add_directives diagnostic command. It is also possible to clear all directives or remove the top from the stack. > > A matching directive will be applied at method compilation time when such compilation is started. 
If directives are added or changed, but compilation does not start, then the state of compiled methods doesn't correspond to the rules. This is not an error, and it happens in long-running applications when directives are added or removed after compilation of methods that could be matched. For example, the user decides that C2 compilation needs to be disabled for some method due to a compiler bug, issues such a directive, but this does not affect the application behavior. In such a case, the target application needs to be restarted, and such an operation can have high costs and risks. Another goal is testing/debugging compilers. > > It would be convenient to optionally reconcile at least existing matching nmethods to the current stack of compiler directives (so bypass inlined methods). > > A natural way to eliminate the discrepancy between the result of compilation and the broken rule is to discard the compilation result, i.e. deoptimization. Prior to that we can try to re-compile the method, letting the compile broker perform it while taking the new directives stack into account. Re-compilation helps to prevent hot methods from executing in the interpreter. > > A new flag `-r` has been introduced for some directives related to compile commands: `Compiler.add_directives`, `Compiler.remove_directives`, `Compiler.clear_directives`. The default behavior has not changed (no flag). If the new flag is present, the command scans already compiled methods and submits methods that have any active non-default matching compiler directives for re-compilation if possible, otherwise marks them for deoptimization. There is currently no distinction as to which directives are found. In particular, this means that if there are rules for inlining into some method, it will be refreshed. On the other hand, if there are rules for a method and it was inlined, top-level methods won't be refreshed, but this can be achieved by having rules for them.
> > In addition, a new diagnostic command `Compiler.replace_directives`, has been added for ... Dmitry Chuyko has updated the pull request incrementally with one additional commit since the last revision: No dots in -r descriptions ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14111/files - new: https://git.openjdk.org/jdk/pull/14111/files/22b42347..36c30367 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14111&range=30 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14111&range=29-30 Stats: 4 lines in 1 file changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/14111.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14111/head:pull/14111 PR: https://git.openjdk.org/jdk/pull/14111 From dchuyko at openjdk.org Wed Mar 13 17:34:24 2024 From: dchuyko at openjdk.org (Dmitry Chuyko) Date: Wed, 13 Mar 2024 17:34:24 GMT Subject: RFR: 8309271: A way to align already compiled methods with compiler directives [v29] In-Reply-To: References: Message-ID: <1AM7yClR1fxPAHevEwcSaNI8hP-KM2oVTwYT41pyEo0=.06098a55-7e89-4214-bcbe-faef2965f4df@github.com> On Wed, 13 Mar 2024 07:48:35 GMT, Serguei Spitsyn wrote: >> Dmitry Chuyko has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 47 commits: >> >> - Resolved master conflicts >> - Merge branch 'openjdk:master' into compiler-directives-force-update >> - Merge branch 'openjdk:master' into compiler-directives-force-update >> - Merge branch 'openjdk:master' into compiler-directives-force-update >> - Merge branch 'openjdk:master' into compiler-directives-force-update >> - Merge branch 'openjdk:master' into compiler-directives-force-update >> - Merge branch 'openjdk:master' into compiler-directives-force-update >> - Merge branch 'openjdk:master' into compiler-directives-force-update >> - Merge branch 'openjdk:master' into compiler-directives-force-update >> - Merge branch 'openjdk:master' into compiler-directives-force-update >> - ... 
and 37 more: https://git.openjdk.org/jdk/compare/782206bc...ff39ac12 > > src/hotspot/share/services/diagnosticCommand.cpp line 928: > >> 926: DCmdWithParser(output, heap), >> 927: _filename("filename", "Name of the directives file", "STRING", true), >> 928: _refresh("-r", "Refresh affected methods.", "BOOLEAN", false, "false") { > > Nit: The dot is not needed at the end, I think. The same applies to lines: 945, 970 and 987. Thanks, the dots were removed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14111#discussion_r1523664980 From sviswanathan at openjdk.org Wed Mar 13 18:33:13 2024 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Wed, 13 Mar 2024 18:33:13 GMT Subject: RFR: 8327999: Remove copy of unused registers for cpu features check in x86_64 AVX2 Poly1305 implementation In-Reply-To: References: Message-ID: On Tue, 12 Mar 2024 19:11:28 GMT, Srinivas Vamsi Parasa wrote: > The goal of this PR is to fix the bug reported in #17881 where data from unused registers (rbx, rcx, rdx) is being copied. Marked as reviewed by sviswanathan (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18248#pullrequestreview-1934928567 From duke at openjdk.org Wed Mar 13 18:37:17 2024 From: duke at openjdk.org (Srinivas Vamsi Parasa) Date: Wed, 13 Mar 2024 18:37:17 GMT Subject: Integrated: 8327999: Remove copy of unused registers for cpu features check in x86_64 AVX2 Poly1305 implementation In-Reply-To: References: Message-ID: On Tue, 12 Mar 2024 19:11:28 GMT, Srinivas Vamsi Parasa wrote: > The goal of this PR is to fix the bug reported in #17881 where data from unused registers (rbx, rcx, rdx) is being copied. This pull request has now been integrated. 
Changeset: eb45d5bd Author: vamsi-parasa Committer: Sandhya Viswanathan URL: https://git.openjdk.org/jdk/commit/eb45d5bd644771887fc31a7abc2851c7dd37b3f4 Stats: 3 lines in 1 file changed: 0 ins; 3 del; 0 mod 8327999: Remove copy of unused registers for cpu features check in x86_64 AVX2 Poly1305 implementation Reviewed-by: jbhateja, sviswanathan ------------- PR: https://git.openjdk.org/jdk/pull/18248 From sspitsyn at openjdk.org Wed Mar 13 20:45:47 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 13 Mar 2024 20:45:47 GMT Subject: RFR: 8309271: A way to align already compiled methods with compiler directives [v31] In-Reply-To: References: Message-ID: On Wed, 13 Mar 2024 16:55:28 GMT, Dmitry Chuyko wrote: >> src/hotspot/share/ci/ciEnv.cpp line 1144: >> >>> 1142: >>> 1143: if (entry_bci == InvocationEntryBci) { >>> 1144: if (TieredCompilation) { >> >> Just a naive question. Why this check has been removed? > > We want to let replacement of C2 method version by another C2 version of the same method in both tired and non-tired mode, which was not allowed Okay, thanks. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14111#discussion_r1523889164 From sspitsyn at openjdk.org Wed Mar 13 20:56:45 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 13 Mar 2024 20:56:45 GMT Subject: RFR: 8309271: A way to align already compiled methods with compiler directives [v31] In-Reply-To: References: Message-ID: On Wed, 13 Mar 2024 17:31:32 GMT, Dmitry Chuyko wrote: >> Compiler Control (https://openjdk.org/jeps/165) provides method-context dependent control of the JVM compilers (C1 and C2). The active directive stack is built from the directive files passed with the `-XX:CompilerDirectivesFile` diagnostic command-line option and the Compiler.add_directives diagnostic command. It is also possible to clear all directives or remove the top from the stack. 
>> >> A matching directive will be applied at method compilation time when such compilation is started. If directives are added or changed, but compilation does not start, then the state of compiled methods doesn't correspond to the rules. This is not an error, and it happens in long running applications when directives are added or removed after compilation of methods that could be matched. For example, the user decides that C2 compilation needs to be disabled for some method due to a compiler bug, issues such a directive but this does not affect the application behavior. In such case, the target application needs to be restarted, and such an operation can have high costs and risks. Another goal is testing/debugging compilers. >> >> It would be convenient to optionally reconcile at least existing matching nmethods to the current stack of compiler directives (so bypass inlined methods). >> >> Natural way to eliminate the discrepancy between the result of compilation and the broken rule is to discard the compilation result, i.e. deoptimization. Prior to that we can try to re-compile the method letting compile broker to perform it taking new directives stack into account. Re-compilation helps to prevent hot methods from execution in the interpreter. >> >> A new flag `-r` has beed introduced for some directives related to compile commands: `Compiler.add_directives`, `Compiler.remove_directives`, `Compiler.clear_directives`. The default behavior has not changed (no flag). If the new flag is present, the command scans already compiled methods and puts methods that have any active non-default matching compiler directives to re-compilation if possible, otherwise marks them for deoptimization. There is currently no distinction which directives are found. In particular, this means that if there are rules for inlining into some method, it will be refreshed. 
On the other hand, if there are rules for a method and it was inlined, top-level methods won't be refreshed, but this can be achieved by having rules for them. >> >> In addition, a new diagnostic command `Compiler.replace_directives... > > Dmitry Chuyko has updated the pull request incrementally with one additional commit since the last revision: > > No dots in -r descriptions The fix looks good. But I do not have an expertise in the compiler-specific part. So, a review from the Compiler team is still required. ------------- Marked as reviewed by sspitsyn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14111#pullrequestreview-1935188358 From dholmes at openjdk.org Wed Mar 13 22:02:40 2024 From: dholmes at openjdk.org (David Holmes) Date: Wed, 13 Mar 2024 22:02:40 GMT Subject: RFR: 8328081: Remove unnecessary return statement in method `markWord::displaced_mark_helper` [v2] In-Reply-To: References: Message-ID: On Wed, 13 Mar 2024 13:39:46 GMT, Guoxiong Li wrote: >> Hi all, >> >> This trivial patch removes unnecessary return statement in method `markWord::displaced_mark_helper`. >> >> Thanks for taking the time to review. >> >> Best Regards, >> -- Guoxiong > > Guoxiong Li has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: > > - Merge branch 'master' into REMOVE_UNNECESSARY_CODE > - Remove unnecessary code. You would need to verify that this does not cause a build warning with all supported compilers on all platforms. As stated above these "fake" returns exist precisely because compilers (in the past at least) did not recognize that `fatal` is non-returning. 
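A minimal sketch of the point about "fake" returns, for readers unfamiliar with the pattern. The names `fatal_sketch`, `LockState`, and `displaced_value` are hypothetical stand-ins, not HotSpot code. It assumes a compiler that honours the standard `[[noreturn]]` attribute; a compiler that cannot see that the error routine never returns would warn that control reaches the end of a non-void function, which is the reason such trailing returns were kept.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// Hypothetical stand-in for an error routine that never returns.
// [[noreturn]] tells the compiler that control does not come back.
[[noreturn]] inline void fatal_sketch(const char* msg) {
  std::fprintf(stderr, "fatal: %s\n", msg);
  std::abort();
}

// Hypothetical states, used only to give the function several exits.
enum class LockState { Locked, Monitor, Other };

// Non-void function whose last statement is a call to the error routine.
inline int displaced_value(LockState s) {
  if (s == LockState::Locked)  return 1;
  if (s == LockState::Monitor) return 2;
  fatal_sketch("bad state");
  // No trailing return here. This is only warning-free when the compiler
  // recognizes fatal_sketch() as non-returning; otherwise a dummy
  // "return 0;" would be needed to keep the build quiet.
}
```

Whether dropping the dummy return is safe therefore depends on every supported toolchain, which is why verification on all platforms is asked for above.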
------------- PR Comment: https://git.openjdk.org/jdk/pull/18272#issuecomment-1995947093 From syan at openjdk.org Thu Mar 14 02:07:37 2024 From: syan at openjdk.org (SendaoYan) Date: Thu, 14 Mar 2024 02:07:37 GMT Subject: RFR: 8327946: containers/docker/TestJFREvents.java fails when host kernel config vm.swappiness=0 after JDK-8325139 In-Reply-To: <5Q0X-rxAg9WKCnK-Qluu5hvyffsGwVgGJGRoA8XlBGs=.923c1bf8-e008-4af9-9929-6e5c1f2d5271@github.com> References: <5Q0X-rxAg9WKCnK-Qluu5hvyffsGwVgGJGRoA8XlBGs=.923c1bf8-e008-4af9-9929-6e5c1f2d5271@github.com> Message-ID: <6lsvEqQBJ30zo85jAYNYiq06S_wy-WfC8uO3UTNf2-Q=.974fc07d-e862-4a46-b16e-78e429e0aa01@github.com> On Tue, 12 Mar 2024 09:06:45 GMT, SendaoYan wrote: > According to the [docker document](https://docs.docker.com/config/containers/resource_constraints/#--memory-swappiness-details), the default value of --memory-swappiness is inherited from the host machine. So when the kernel config vm.swappiness=0 is set on the host machine, this test case fails, because the default value of --memory-swappiness is then 0 and the docker container cannot use swap memory. > > When the host kernel config is "vm.swappiness = 0", there are three ways to make this test case pass: > > 1. Change `.shouldContain("totalSize = " + expectedTotalValue)` to `.shouldContain("totalSize = ")`, which ignores `expectedTotalValue`, because `expectedTotalValue` could be 0 (swap memory is disabled when --memory-swappiness=0) or 104857600 (300MB-200MB=100MB), depending on the host machine config `vm.swappiness`. > 2. Change the default `--memory-swappiness` from 0 to a non-zero value, such as 60. > 3. Change the host kernel config `vm.swappiness=0` to `vm.swappiness=60`. I think this is not a good idea. > > The 2nd method seems the most reasonable. This fixes a test case bug, so the risk is low. Can anyone review this PR? Thanks.
------------- PR Comment: https://git.openjdk.org/jdk/pull/18225#issuecomment-1996258849 From gli at openjdk.org Thu Mar 14 03:23:37 2024 From: gli at openjdk.org (Guoxiong Li) Date: Thu, 14 Mar 2024 03:23:37 GMT Subject: RFR: 8328081: Remove unnecessary return statement in method `markWord::displaced_mark_helper` [v2] In-Reply-To: References: Message-ID: On Wed, 13 Mar 2024 21:59:55 GMT, David Holmes wrote: > You would need to verify that this does not cause a build warning with all supported compilers on all platforms. Sorry, I don't have all the environments locally to verify. May need some help. Thanks. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18272#issuecomment-1996317404 From dholmes at openjdk.org Thu Mar 14 05:36:37 2024 From: dholmes at openjdk.org (David Holmes) Date: Thu, 14 Mar 2024 05:36:37 GMT Subject: RFR: 8328081: Remove unnecessary return statement in method `markWord::displaced_mark_helper` [v2] In-Reply-To: References: Message-ID: On Thu, 14 Mar 2024 03:20:49 GMT, Guoxiong Li wrote: > Sorry, I don't have all the environments locally to verify. May need some help. Thanks. My point was a bit too subtle. Given that you can't verify all the compilers as needed, the change should not be made in my opinion. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18272#issuecomment-1996561716 From duke at openjdk.org Thu Mar 14 06:23:43 2024 From: duke at openjdk.org (Yude Lin) Date: Thu, 14 Mar 2024 06:23:43 GMT Subject: Integrated: 8328064: Remove obsolete comments in constantPool and metadataFactory In-Reply-To: References: Message-ID: On Wed, 13 Mar 2024 09:27:41 GMT, Yude Lin wrote: > Clean up some obsolete comments: > After 8243287, there is no longer the concept of "pseudo-string"; > After 8072061, the "read_only" argument that the comment is referring to is removed. > > Cheers This pull request has now been integrated. 
Changeset: 6f2676dc Author: Yude Lin Committer: Denghui Dong URL: https://git.openjdk.org/jdk/commit/6f2676dc5f09d350c359f906b07f6f6d0d17f030 Stats: 3 lines in 2 files changed: 0 ins; 2 del; 1 mod 8328064: Remove obsolete comments in constantPool and metadataFactory Reviewed-by: coleenp ------------- PR: https://git.openjdk.org/jdk/pull/18266 From duke at openjdk.org Thu Mar 14 06:25:44 2024 From: duke at openjdk.org (Xiaowei Lu) Date: Thu, 14 Mar 2024 06:25:44 GMT Subject: RFR: 8328138: Optimize ArrayEquals on AArch64 Message-ID:

The current implementation of ArrayEquals on AArch64 is quite complex, due to the variety of checks for alignment, tail processing, bus locking and so on. However, modern Arm processors have eased such worries, so we propose a simple, straightforward flow for ArrayEquals.
With this simplified ArrayEquals, we observed performance gains on the latest Arm platforms (Neoverse N1 and N2).
Test case: org.openjdk.bench.java.util.ArraysEquals

1x vector length, 64-bit aligned array[0]

| Test Case | N1 | N2 |
|:----------------------:|:---------:|:---------:|
| testByteFalseBeginning | -21.42% | -13.37% |
| testByteFalseEnd | 25.79% | 27.45% |
| testByteFalseMid | 16.64% | 16.46% |
| testByteTrue | 12.39% | 24.66% |
| testCharFalseBeginning | -5.27% | -3.08% |
| testCharFalseEnd | 29.29% | 35.23% |
| testCharFalseMid | 15.13% | 19.34% |
| testCharTrue | 21.63% | 33.73% |
| Total | 11.77% | 17.55% |

A key factor is deciding when to use SIMD in array equals. An aggressive choice is to enable SIMD as soon as the array length exceeds the vector length (8 words). The corresponding result is shown above, from which we can see a performance regression in both FalseBeginning cases. To avoid this performance impact, we can set the SIMD threshold to 3x vector length.
3x vector length, 64-bit aligned array[0]

| Test Case | N1 | N2 |
|:----------------------:|:---------:|:---------:|
| testByteFalseBeginning | 8.28% | 8.64% |
| testByteFalseEnd | 6.38% | 12.29% |
| testByteFalseMid | 6.17% | 7.96% |
| testByteTrue | -10.08% | 3.06% |
| testCharFalseBeginning | -1.42% | 7.23% |
| testCharFalseEnd | 4.05% | 13.48% |
| testCharFalseMid | 8.79% | 16.96% |
| testCharTrue | -5.66% | 10.23% |
| Total | 2.06% | 9.98% |

In addition to the performance improvement, we propose this patch to solve alignment issues in array equals. JDK-8139457 relaxes the alignment of array elements. As a consequence, it becomes an error to read the whole last word in array equals when the array doesn't occupy the whole word. Our patch fixes this problem, and the tables below show the performance improvement for misaligned arrays.

1x vector length, 32-bit aligned array[0]

| Test Case | N1 | N2 |
|:----------------------:|:---------:|:---------:|
| testByteFalseBeginning | -12.96% | -17.50% |
| testByteFalseEnd | 29.43% | 32.19% |
| testByteFalseMid | 24.30% | 17.54% |
| testByteTrue | 16.57% | 24.40% |
| testCharFalseBeginning | 2.40% | 0.60% |
| testCharFalseEnd | 32.14% | 32.94% |
| testCharFalseMid | 18.86% | 17.60% |
| testCharTrue | 25.38% | 32.62% |
| Total | 17.01% | 17.54% |

3x vector length, 32-bit aligned array[0]

| Test Case | N1 | N2 |
|:----------------------:|:---------:|:---------:|
| testByteFalseBeginning | 10.95% | 14.23% |
| testByteFalseEnd | 13.83% | 12.35% |
| testByteFalseMid | 11.18% | 10.31% |
| testByteTrue | -7.13% | 6.61% |
| testCharFalseBeginning | 0.38% | 7.20% |
| testCharFalseEnd | 2.84% | 13.13% |
| testCharFalseMid | 11.17% | 15.40% |
| testCharTrue | -5.43% | 9.52% |
| Total | 4.72% | 11.09% |

------------- Commit messages: - 8328138: Optimize ArrayEquals on AArch64 Changes: https://git.openjdk.org/jdk/pull/18292/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18292&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8328138 Stats: 92
lines in 1 file changed: 92 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18292.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18292/head:pull/18292 PR: https://git.openjdk.org/jdk/pull/18292 From gli at openjdk.org Thu Mar 14 06:28:41 2024 From: gli at openjdk.org (Guoxiong Li) Date: Thu, 14 Mar 2024 06:28:41 GMT Subject: RFR: 8328081: Remove unnecessary return statement in method `markWord::displaced_mark_helper` [v2] In-Reply-To: References: Message-ID: On Thu, 14 Mar 2024 05:33:41 GMT, David Holmes wrote: > My point was a bit too subtle. Given that you can't verify all the compilers as needed, the change should not be made in my opinion. Got it. Thanks for your kindly discussion and suggestion. > The method `markWord::set_displaced_mark_helper` has no return statement at the end. So I want the `markWord::displaced_mark_helper` to be consistent with `markWord::set_displaced_mark_helper`. I notice the return value of the method `markWord::set_displaced_mark_helper` is `void`, but the method `markWord::displaced_mark_helper` has a not empty return value. It is the point I missed so that I submitted this wrong PR. It is my negligence. Apologize for wasting your time. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18272#issuecomment-1996611599 From gli at openjdk.org Thu Mar 14 06:28:42 2024 From: gli at openjdk.org (Guoxiong Li) Date: Thu, 14 Mar 2024 06:28:42 GMT Subject: Withdrawn: 8328081: Remove unnecessary return statement in method `markWord::displaced_mark_helper` In-Reply-To: References: Message-ID: On Wed, 13 Mar 2024 12:49:16 GMT, Guoxiong Li wrote: > Hi all, > > This trivial patch removes unnecessary return statement in method `markWord::displaced_mark_helper`. > > Thanks for taking the time to review. > > Best Regards, > -- Guoxiong This pull request has been closed without being integrated. 
------------- PR: https://git.openjdk.org/jdk/pull/18272 From duke at openjdk.org Thu Mar 14 07:01:40 2024 From: duke at openjdk.org (Xiaowei Lu) Date: Thu, 14 Mar 2024 07:01:40 GMT Subject: RFR: 8328138: Optimize ArrayEquals on AArch64 In-Reply-To: References: Message-ID: <9YBpJZo8j7aYg12YmZ4f9dgNpIQ2fALI7gcG07Ju-wA=.45c9a7c9-5f9e-4156-95d8-a2de285bc95e@github.com> On Thu, 14 Mar 2024 06:21:56 GMT, Xiaowei Lu wrote: > Current implementation of ArrayEquals on AArch64 is quite complex, due to the variety of checks about alignment, tail processing, bus locking and so on. However, Modern Arm processors have eased such worries. So we proposed to use a simple&straightforward flow of ArrayEquals. > With this simplified ArrayEquals, we observed performance gains on the latest arm platforms(Neoverse N1&N2) > Test case: org.openjdk.bench.java.util.ArraysEquals > > 1x vector length, 64-bit aligned array[0] > | Test Case | N1 | N2 | > |:----------------------:|:---------:|:---------:| > | testByteFalseBeginning | -21.42% | -13.37% | > | testByteFalseEnd | 25.79% | 27.45% | > | testByteFalseMid | 16.64% | 16.46% | > | testByteTrue | 12.39% | 24.66% | > | testCharFalseBeginning | -5.27% | -3.08% | > | testCharFalseEnd | 29.29% | 35.23% | > | testCharFalseMid | 15.13% | 19.34% | > | testCharTrue | 21.63% | 33.73% | > | Total | 11.77% | 17.55% | > > A key factor is to decide when we should utilize simd in array equals. An aggressive choice is to enable simd as long as array length exceeds vector length(8 words). The corresponding result is shown above, from which we can see performance regression in both testBeginning cases. To avoid such perf impact, we can set simd threshold to 3x vector length. 
> > 3x vector length, 64-bit aligned array[0] > | | n1 | n2 | > |:----------------------:|:---------:|:---------:| > | testByteFalseBeginning | 8.28% | 8.64% | > | testByteFalseEnd | 6.38% | 12.29% | > | testByteFalseMid | 6.17% | 7.96% | > | testByteTrue | -10.08% | 3.06% | > | testCharFalseBeginning | -1.42% | 7.23% | > | testCharFalseEnd | 4.05% | 13.48% | > | testCharFalseMid | 8.79% | 16.96% | > | testCharTrue | -5.66% | 10.23% | > | Total | 2.06% | 9.98% | > > > In addtion to perf improvement, we propose this patch to solve alignment issues in array equals. JDK-8139457 tries to relax alignment of array elements. On the other hand, this misalignment makes it an error to read the whole last word in array equals, in case that the array doesn't occupy the whole word. Our patch fixed this problem, and again we would like to show the perf improvement when misalignment. > > 1x vector length, 32-bit aligned array[0] > | | N1 | N2 | > |:----------------------:|:---------:|:-... Anyone interested is welcome to test the performance gain, by turning on/off UseNewCode. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18292#issuecomment-1996677732 From ihse at openjdk.org Thu Mar 14 07:31:44 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Thu, 14 Mar 2024 07:31:44 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v6] In-Reply-To: References: Message-ID: On Thu, 11 Jan 2024 13:23:45 GMT, Julian Waters wrote: >> Compile the JDK as C++17, enabling the use of all C++17 language features > > Julian Waters has updated the pull request incrementally with one additional commit since the last revision: > > Require clang 13 in toolchain.m4 Now the version upgrades are integrated. From my PoV, that was the interesting part. However, this PR is still open. Is there an interest in switching from C++14 to 17, or should this PR be retracted until a more unison will to upgrade C++ standard appears? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/14988#issuecomment-1996733804 From jzhu at openjdk.org Thu Mar 14 07:39:06 2024 From: jzhu at openjdk.org (Joshua Zhu) Date: Thu, 14 Mar 2024 07:39:06 GMT Subject: RFR: 8326541: [AArch64] ZGC C2 load barrier stub considers the length of live registers when spilling registers [v2] In-Reply-To: References: Message-ID: > Currently ZGC C2 load barrier stub saves the whole live register regardless of what size of register is live on aarch64. > Considering the size of SVE register is an implementation-defined multiple of 128 bits, up to 2048 bits, > even the use of a floating point may cause the maximum 2048 bits stack occupied. > Hence I would like to introduce this change on aarch64: take the length of live registers into consideration in ZGC C2 load barrier stub. > > In a floating point case on 2048 bits SVE machine, the following ZLoadBarrierStubC2 > > > ...... > 0x0000ffff684cfad8: stp x15, x18, [sp, #80] > 0x0000ffff684cfadc: sub sp, sp, #0x100 > 0x0000ffff684cfae0: str z16, [sp] > 0x0000ffff684cfae4: add x1, x13, #0x10 > 0x0000ffff684cfae8: mov x0, x16 > ;; 0xFFFF803F5414 > 0x0000ffff684cfaec: mov x8, #0x5414 // #21524 > 0x0000ffff684cfaf0: movk x8, #0x803f, lsl #16 > 0x0000ffff684cfaf4: movk x8, #0xffff, lsl #32 > 0x0000ffff684cfaf8: blr x8 > 0x0000ffff684cfafc: mov x16, x0 > 0x0000ffff684cfb00: ldr z16, [sp] > 0x0000ffff684cfb04: add sp, sp, #0x100 > 0x0000ffff684cfb08: ptrue p7.b > 0x0000ffff684cfb0c: ldp x4, x5, [sp, #16] > ...... > > > could be optimized into: > > > ...... > 0x0000ffff684cfa50: stp x15, x18, [sp, #80] > 0x0000ffff684cfa54: str d16, [sp, #-16]! 
// extra 8 bytes to align 16 bytes in push_fp() > 0x0000ffff684cfa58: add x1, x13, #0x10 > 0x0000ffff684cfa5c: mov x0, x16 > ;; 0xFFFF7FA942A8 > 0x0000ffff684cfa60: mov x8, #0x42a8 // #17064 > 0x0000ffff684cfa64: movk x8, #0x7fa9, lsl #16 > 0x0000ffff684cfa68: movk x8, #0xffff, lsl #32 > 0x0000ffff684cfa6c: blr x8 > 0x0000ffff684cfa70: mov x16, x0 > 0x0000ffff684cfa74: ldr d16, [sp], #16 > 0x0000ffff684cfa78: ptrue p7.b > 0x0000ffff684cfa7c: ldp x4, x5, [sp, #16] > ...... > > > Besides the above benefit, when we know what size of register is live, > we could remove the unnecessary caller save in ZGC C2 load barrier stub when we meet C-ABI SOE fp registers. > > Passed jtreg with option "-XX:+UseZGC -XX:+ZGenerational" with no failures introduced. Joshua Zhu has updated the pull request incrementally with one additional commit since the last revision: Add jtreg test case ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17977/files - new: https://git.openjdk.org/jdk/pull/17977/files/3532ab4b..28ff7d2a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17977&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17977&range=00-01 Stats: 391 lines in 2 files changed: 391 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/17977.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17977/head:pull/17977 PR: https://git.openjdk.org/jdk/pull/17977 From lmao at openjdk.org Thu Mar 14 08:01:00 2024 From: lmao at openjdk.org (Liang Mao) Date: Thu, 14 Mar 2024 08:01:00 GMT Subject: RFR: 8139457: Relax alignment of array elements [v69] In-Reply-To: References: Message-ID: <8yrJEa77_jawPRYzE7vos9kG2QAbu4yX-PTJgY6cX2A=.fe9206aa-9a79-46e5-9307-9c095b14ad03@github.com> On Thu, 22 Feb 2024 16:08:33 GMT, Roman Kennke wrote: >> See [JDK-8139457](https://bugs.openjdk.org/browse/JDK-8139457) for details. 
>> >> Basically, when running with -XX:-UseCompressedClassPointers, arrays will have a gap between the length field and the first array element, because array elements will only start at word-aligned offsets. This is not necessary for smaller-than-word elements. >> >> Also, while it is not very important now, it will become very important with Lilliput, which eliminates the Klass field and would always put the length field at offset 8, and leave a gap between offset 12 and 16. >> >> Testing: >> - [x] runtime/FieldLayout/ArrayBaseOffsets.java (x86_64, x86_32, aarch64, arm, riscv, s390) >> - [x] bootcycle (x86_64, x86_32, aarch64, arm, riscv, s390) >> - [x] tier1 (x86_64, x86_32, aarch64, riscv) >> - [x] tier2 (x86_64, aarch64, riscv) >> - [x] tier3 (x86_64, riscv) > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Improve comment Hi Roman, I found a potential bug but didn't realize this PR had already been integrated. Sorry for my negligence. It's a rare crash on aarch64 with G1 GC. The root cause is that the default behavior of MacroAssembler::arrays_equals is to blindly load a whole word before comparison. When array[0] is only 32-bit aligned, the last word load can exceed the array limit and touch the next word beyond the object's layout in heap memory. If that next word, which doesn't belong to the object itself, happens to fall on a page and G1 heap region boundary, a segmentation fault is triggered. Blindly loading the last word is benign for a 64-bit aligned array because the load always stays inside the object itself. We proposed JDK-8328138 to optimize the aarch64 array equals implementation so that it handles both word-aligned and unaligned arrays correctly and performs better on Arm Neoverse N1 and N2. Apologies again for my delay. Please help review it.
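To make the over-read concrete, here is a rough sketch in portable C++ rather than AArch64 assembly. It is not the code from JDK-8328138; the function name `bytes_equal_no_overread` is hypothetical. It compares whole 8-byte words only while a full word still lies inside the payload, then finishes the tail byte by byte, so it never loads memory past the last element.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Compare len bytes of a and b without reading past a + len or b + len.
inline bool bytes_equal_no_overread(const unsigned char* a,
                                    const unsigned char* b,
                                    std::size_t len) {
  std::size_t i = 0;
  // Word loop: runs only while a complete 8-byte word is inside the payload.
  for (; i + 8 <= len; i += 8) {
    std::uint64_t wa, wb;
    std::memcpy(&wa, a + i, 8);  // memcpy keeps the load alignment-safe
    std::memcpy(&wb, b + i, 8);
    if (wa != wb) return false;
  }
  // Tail loop: fewer than 8 bytes remain, so compare them one at a time
  // instead of loading a full word that would extend beyond the array.
  for (; i < len; ++i) {
    if (a[i] != b[i]) return false;
  }
  return true;
}
```

A blind word load in the tail would have two possible effects: touching memory that does not belong to the object, and comparing bytes that are not array contents. The byte-wise tail avoids both, at the cost of a few extra iterations.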
------------- PR Comment: https://git.openjdk.org/jdk/pull/11044#issuecomment-1996771480 From rkennke at openjdk.org Thu Mar 14 08:29:00 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 14 Mar 2024 08:29:00 GMT Subject: RFR: 8139457: Relax alignment of array elements [v69] In-Reply-To: <8yrJEa77_jawPRYzE7vos9kG2QAbu4yX-PTJgY6cX2A=.fe9206aa-9a79-46e5-9307-9c095b14ad03@github.com> References: <8yrJEa77_jawPRYzE7vos9kG2QAbu4yX-PTJgY6cX2A=.fe9206aa-9a79-46e5-9307-9c095b14ad03@github.com> Message-ID: On Thu, 14 Mar 2024 07:58:11 GMT, Liang Mao wrote: > Hi Roman, > > I found a potential bug but didn't realize this PR had already been integrated. Sorry for my negligence. It's a rare crash on aarch64 with G1 GC. The root cause is that the default behavior of MacroAssembler::arrays_equals is to blindly load a whole word before comparison. When array[0] is only 32-bit aligned, the last word load can exceed the array limit and touch the next word beyond the object's layout in heap memory. If that next word, which doesn't belong to the object itself, happens to fall on a page and G1 heap region boundary, a segmentation fault is triggered. Blindly loading the last word is benign for a 64-bit aligned array because the load always stays inside the object itself. We proposed JDK-8328138 to optimize the aarch64 array equals implementation so that it handles both word-aligned and unaligned arrays correctly and performs better on Arm Neoverse N1 and N2. Apologies again for my delay. Please help review it. Thanks for the heads-up, this is a very good point. Wouldn't we get wrong results for array-equals if we blindly compare the last word, if it doesn't actually belong to the array contents?
------------- PR Comment: https://git.openjdk.org/jdk/pull/11044#issuecomment-1996839809 From jzhu at openjdk.org Thu Mar 14 08:34:38 2024 From: jzhu at openjdk.org (Joshua Zhu) Date: Thu, 14 Mar 2024 08:34:38 GMT Subject: RFR: 8326541: [AArch64] ZGC C2 load barrier stub considers the length of live registers when spilling registers [v2] In-Reply-To: References: Message-ID: On Thu, 14 Mar 2024 07:39:06 GMT, Joshua Zhu wrote: >> Currently ZGC C2 load barrier stub saves the whole live register regardless of what size of register is live on aarch64. >> Considering the size of SVE register is an implementation-defined multiple of 128 bits, up to 2048 bits, >> even the use of a floating point may cause the maximum 2048 bits stack occupied. >> Hence I would like to introduce this change on aarch64: take the length of live registers into consideration in ZGC C2 load barrier stub. >> >> In a floating point case on 2048 bits SVE machine, the following ZLoadBarrierStubC2 >> >> >> ...... >> 0x0000ffff684cfad8: stp x15, x18, [sp, #80] >> 0x0000ffff684cfadc: sub sp, sp, #0x100 >> 0x0000ffff684cfae0: str z16, [sp] >> 0x0000ffff684cfae4: add x1, x13, #0x10 >> 0x0000ffff684cfae8: mov x0, x16 >> ;; 0xFFFF803F5414 >> 0x0000ffff684cfaec: mov x8, #0x5414 // #21524 >> 0x0000ffff684cfaf0: movk x8, #0x803f, lsl #16 >> 0x0000ffff684cfaf4: movk x8, #0xffff, lsl #32 >> 0x0000ffff684cfaf8: blr x8 >> 0x0000ffff684cfafc: mov x16, x0 >> 0x0000ffff684cfb00: ldr z16, [sp] >> 0x0000ffff684cfb04: add sp, sp, #0x100 >> 0x0000ffff684cfb08: ptrue p7.b >> 0x0000ffff684cfb0c: ldp x4, x5, [sp, #16] >> ...... >> >> >> could be optimized into: >> >> >> ...... >> 0x0000ffff684cfa50: stp x15, x18, [sp, #80] >> 0x0000ffff684cfa54: str d16, [sp, #-16]! 
// extra 8 bytes to align 16 bytes in push_fp() >> 0x0000ffff684cfa58: add x1, x13, #0x10 >> 0x0000ffff684cfa5c: mov x0, x16 >> ;; 0xFFFF7FA942A8 >> 0x0000ffff684cfa60: mov x8, #0x42a8 // #17064 >> 0x0000ffff684cfa64: movk x8, #0x7fa9, lsl #16 >> 0x0000ffff684cfa68: movk x8, #0xffff, lsl #32 >> 0x0000ffff684cfa6c: blr x8 >> 0x0000ffff684cfa70: mov x16, x0 >> 0x0000ffff684cfa74: ldr d16, [sp], #16 >> 0x0000ffff684cfa78: ptrue p7.b >> 0x0000ffff684cfa7c: ldp x4, x5, [sp, #16] >> ...... >> >> >> Besides the above benefit, when we know what size of register is live, >> we could remove the unnecessary caller save in ZGC C2 load barrier stub when we meet C-ABI SOE fp registers. >> >> Passed jtreg with option "-XX:+UseZGC -XX:+ZGenerational" with no failures introduced. > > Joshua Zhu has updated the pull request incrementally with one additional commit since the last revision: > > Add jtreg test case A jtreg test case is created for this commit. Please let me know if any further comments. Thanks. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17977#issuecomment-1996852779 From tholenstein at openjdk.org Thu Mar 14 08:41:50 2024 From: tholenstein at openjdk.org (Tobias Holenstein) Date: Thu, 14 Mar 2024 08:41:50 GMT Subject: RFR: 8309271: A way to align already compiled methods with compiler directives [v31] In-Reply-To: References: Message-ID: On Wed, 13 Mar 2024 17:31:32 GMT, Dmitry Chuyko wrote: >> Compiler Control (https://openjdk.org/jeps/165) provides method-context dependent control of the JVM compilers (C1 and C2). The active directive stack is built from the directive files passed with the `-XX:CompilerDirectivesFile` diagnostic command-line option and the Compiler.add_directives diagnostic command. It is also possible to clear all directives or remove the top from the stack. >> >> A matching directive will be applied at method compilation time when such compilation is started. 
If directives are added or changed, but compilation does not start, then the state of compiled methods doesn't correspond to the rules. This is not an error, and it happens in long running applications when directives are added or removed after compilation of methods that could be matched. For example, the user decides that C2 compilation needs to be disabled for some method due to a compiler bug, issues such a directive but this does not affect the application behavior. In such case, the target application needs to be restarted, and such an operation can have high costs and risks. Another goal is testing/debugging compilers. >> >> It would be convenient to optionally reconcile at least existing matching nmethods to the current stack of compiler directives (so bypass inlined methods). >> >> Natural way to eliminate the discrepancy between the result of compilation and the broken rule is to discard the compilation result, i.e. deoptimization. Prior to that we can try to re-compile the method letting compile broker to perform it taking new directives stack into account. Re-compilation helps to prevent hot methods from execution in the interpreter. >> >> A new flag `-r` has beed introduced for some directives related to compile commands: `Compiler.add_directives`, `Compiler.remove_directives`, `Compiler.clear_directives`. The default behavior has not changed (no flag). If the new flag is present, the command scans already compiled methods and puts methods that have any active non-default matching compiler directives to re-compilation if possible, otherwise marks them for deoptimization. There is currently no distinction which directives are found. In particular, this means that if there are rules for inlining into some method, it will be refreshed. On the other hand, if there are rules for a method and it was inlined, top-level methods won't be refreshed, but this can be achieved by having rules for them. 
>> >> In addition, a new diagnostic command `Compiler.replace_directives... > > Dmitry Chuyko has updated the pull request incrementally with one additional commit since the last revision: > > No dots in -r descriptions Looks good to me too (Compiler Team) ------------- Marked as reviewed by tholenstein (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14111#pullrequestreview-1936031298 From vitaly.provodin at jetbrains.com Thu Mar 14 08:49:04 2024 From: vitaly.provodin at jetbrains.com (Vitaly Provodin) Date: Thu, 14 Mar 2024 15:49:04 +0700 Subject: SafeFetch leads to hard crash on macOS 14.4 In-Reply-To: <5e4764c3-6675-4490-aa8f-9a5141f10277@oracle.com> References: <8e8a8845-a81e-49c1-8b52-0e4b0cd6a0c0@oracle.com> <6252751b-a3aa-4910-8914-1d554bac6f79@oracle.com> <5e4764c3-6675-4490-aa8f-9a5141f10277@oracle.com> Message-ID: <6C0488D6-45F2-4D88-84D4-994AC74746A2@jetbrains.com> Stefan, >> With the patch the version string is successfully printed - VM was not killed. > > Did you run this on 14.4? Did you also run this on an AArch64 machine? Both questions: Yes I can recheck it again (if it is needed). There maybe something wrong with my build... Vitaly > On 13. Mar 2024, at 13:27, Stefan Karlsson wrote: > > Hi Vitaly, > > On 2024-03-12 23:44, Vitaly Provodin wrote: >> Hi Stefan, Maxim, >> >> With the patch the version string is successfully printed - VM was not killed. > > Did you run this on 14.4? Did you also run this on an AArch64 machine? > > FWIW, I updated a machine to 14.4 and could reproduce the reported problem with a patch similar to the one below. I also managed to write a stand-alone C reproducer, which I have added to the bug report. > > Thanks, > StefanK > >> >> Thanks, >> Vitaly >> >> >>> On 12. Mar 2024, at 17:19, Stefan Karlsson wrote: >>> >>> Hi again, >>> >>> I just want to clarify that the change above just removes *one* usage of SafeFetch. My guess is that all other usages of SafeFetch is still affected by the issue you are seeing. 
>>>
>>> It would be interesting to see if you could verify this problem by applying this patch:
>>>
>>> diff --git a/src/hotspot/share/memory/universe.cpp b/src/hotspot/share/memory/universe.cpp
>>> index 2a3d532725f..b58c52e415b 100644
>>> --- a/src/hotspot/share/memory/universe.cpp
>>> +++ b/src/hotspot/share/memory/universe.cpp
>>> @@ -842,7 +842,14 @@ jint universe_init() {
>>>    return JNI_OK;
>>>  }
>>>
>>> +#include "runtime/safefetch.hpp"
>>> +
>>>  jint Universe::initialize_heap() {
>>> +  char* mem = os::reserve_memory(16 * K, false, mtGC);
>>> +  intptr_t value = SafeFetchN((intptr_t*)mem, 1);
>>> +
>>> +  log_info(gc)("Reserved memory read: " PTR_FORMAT " " PTR_FORMAT, p2i(mem), value);
>>> +
>>>    assert(_collectedHeap == nullptr, "Heap already created");
>>>    _collectedHeap = GCConfig::arguments()->create_heap();
>>>
>>>
>>> and running:
>>> build//jdk/bin/java -version
>>>
>>> This works on 14.3.1, but it would be interesting to see what happens on 14.4, and if that shuts down the VM before it is printing the version string.
>>>
>>> StefanK
>>>
>>> On 2024-03-12 09:34, Stefan Karlsson wrote:
>>>> Hi Maxim,
>>>>
>>>> On 2024-03-12 08:12, Maxim Kartashev wrote:
>>>>> Hello!
>>>>>
>>>>> This has been recently filed as https://bugs.openjdk.org/browse/JDK-8327860 but I also wanted to check with the community if any further info on the issue is available.
>>>>>
>>>>> It looks like in some cases SafeFetch is directed to fetch something the OS really doesn't want it to and, instead of promoting the error to a signal and letting the application (JVM) deal with it, immediately terminates the application. All we've got is the OS crash report, not even the JVM's fatal error log. This looks like an application security precaution, which I am not at all familiar with.
>>>>>
>>>>> The relevant pieces of the crash log are below. Is anybody familiar with "Namespace GUARD" termination reason and maybe other related novelties of macOS 14.4?
The error was not reported before upgrading to 14.4
>>>> I don't have an answer for this, but a note below:
>>>>
>>>>> Thanks in advance,
>>>>>
>>>>> Maxim.
>>>>>
>>>>> 0 libjvm.dylib 0x1062d6ec0 _SafeFetchN_fault + 0
>>>>> 1 libjvm.dylib 0x1062331a4 ObjectMonitor::TrySpin(JavaThread*) + 408
>>>>> 2 libjvm.dylib 0x106232b44 ObjectMonitor::enter(JavaThread*) + 228
>>>>> 3 libjvm.dylib 0x10637436c ObjectSynchronizer::enter(Handle, BasicLock*, JavaThread*) + 392
>>>> FYI: I think that this specific call to SafeFetch was recently removed by:
>>>> https://github.com/openjdk/jdk/commit/29397d29baac3b29083b1b5d6b2cb06e456af0c3
>>>> https://bugs.openjdk.org/browse/JDK-8320317
>>>>
>>>> Cheers,
>>>> StefanK
>>>>
>>>>
>>>>> ...
>>>>>
>>>>> Exception Type:        EXC_BAD_ACCESS (SIGKILL)
>>>>> Exception Codes:       KERN_PROTECTION_FAILURE at 0x00000001004f4000
>>>>> Exception Codes:       0x0000000000000002, 0x00000001004f4000
>>>>>
>>>>> Termination Reason:    Namespace GUARD, Code 5
>>>>>
>>>>> VM Region Info: 0x1004f4000 is in 0x1004f4000-0x1004f8000;  bytes after start: 0  bytes before end: 16383
>>>>>       REGION TYPE      START - END            [ VSIZE] PRT/MAX SHRMOD  REGION DETAIL
>>>>>       mapped file      1004e4000-1004f4000    [   64K] r--/r-- SM=COW  Object_id=fa8d88e7
>>>>> --->  VM_ALLOCATE      1004f4000-1004f8000    [   16K] ---/rwx SM=NUL
>>>>>       VM_ALLOCATE      1004f8000-1004fc000    [   16K] r--/rwx SM=PRV

From mli at openjdk.org  Thu Mar 14 08:55:51 2024
From: mli at openjdk.org (Hamlin Li)
Date: Thu, 14 Mar 2024 08:55:51 GMT
Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF
Message-ID: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com>

Hi,
Can you help to review this patch?
Thanks

This is a continuation of the work in [1] by @XiaohongGong; most of the work was done in that PR. In this new PR, I just rebased the code in [1] and then added some minor changes (renaming of the calling method, adding libsleef as an extra lib in the CI cross-build on aarch64 in the GitHub workflow). I also tested the combination of the following scenarios:
* at build time
  * with/without sleef
  * with/without sve support
* at runtime
  * with/without sleef
  * with/without sve support

[1] https://github.com/openjdk/jdk/pull/16234

## Regression Test
* test/jdk/jdk/incubator/vector/
* test/hotspot/jtreg/compiler/vectorapi/

## Performance Test
Previously, @XiaohongGong shared the data: https://github.com/openjdk/jdk/pull/16234#issuecomment-1767727028

-------------

Commit messages:
 - copyright date
 - distinguish call names between non-scalable and scalable(e.g. sve)
 - add libsleef as extra lib in CI cross-build on aarch64
 - Merge branch 'master' into sleef-aarch64
 - Fix potential attribute issue
 - Remove -fvisibility in makefile and add the attribute in source code
 - Add "--with-libsleef-lib" and "--with-libsleef-include" options
 - Separate neon and sve functions into two source files
 - Rename vmath to sleef in configure
 - Address review comments in build system
 - ...
and 3 more: https://git.openjdk.org/jdk/compare/de428daf...55b118d9 Changes: https://git.openjdk.org/jdk/pull/18294/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18294&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8312425 Stats: 600 lines in 23 files changed: 548 ins; 1 del; 51 mod Patch: https://git.openjdk.org/jdk/pull/18294.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18294/head:pull/18294 PR: https://git.openjdk.org/jdk/pull/18294 From mli at openjdk.org Thu Mar 14 08:56:51 2024 From: mli at openjdk.org (Hamlin Li) Date: Thu, 14 Mar 2024 08:56:51 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v9] In-Reply-To: References: Message-ID: <6Y1G5D_sGHAa3nOvQjxvOex9sNpwaEvLaTsgyODzwBU=.86b42ad7-0496-441d-aa5f-7fb1b3394bbe@github.com> On Fri, 1 Mar 2024 15:10:30 GMT, Magnus Ihse Bursie wrote: >> Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix potential attribute issue > > Iirc, your assessment is right; the code should be ready for integration; I just don't know about the state of testing. Hi @magicus @theRealAph @dholmes-ora, I've created the new pr to continue the work at: https://github.com/openjdk/jdk/pull/18294. Please take another look. Thanks! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1996902143 From luhenry at openjdk.org Thu Mar 14 09:02:38 2024 From: luhenry at openjdk.org (Ludovic Henry) Date: Thu, 14 Mar 2024 09:02:38 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF In-Reply-To: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> Message-ID: <98GhMjBc3vlkUZ4PFFzmPj41OW3eCb76nYV6goWC5VM=.36d55804-ea1f-4c1e-833e-55c14a84c6dd@github.com> On Thu, 14 Mar 2024 08:48:11 GMT, Hamlin Li wrote: > Hi, > Can you help to review this patch? > Thanks > > This is a continuation of work based on [1] by @XiaohongGong, most work was done in that pr. In this new pr, just rebased the code in [1], then added some minor changes (renaming of calling method, add libsleef as extra lib in CI cross-build on aarch64 in github workflow); I aslo tested the combination of following scenarios: > * at build time > * with/without sleef > * with/without sve support > * at runtime > * with/without sleef > * with/without sve support > > [1] https://github.com/openjdk/jdk/pull/16234 > > ## Regression Test > * test/jdk/jdk/incubator/vector/ > * test/hotspot/jtreg/compiler/vectorapi/ > > ## Performance Test > Previously, @XiaohongGong has shared the data: https://github.com/openjdk/jdk/pull/16234#issuecomment-1767727028 Changes requested by luhenry (Committer). 
.github/workflows/build-cross-compile.yml line 138: > 136: --arch=${{ matrix.debian-arch }} > 137: --verbose > 138: --include=fakeroot,symlinks,build-essential,libx11-dev,libxext-dev,libxrender-dev,libxrandr-dev,libxtst-dev,libxt-dev,libcups2-dev,libfontconfig1-dev,libasound2-dev,libfreetype-dev,libpng-dev,${{ extra-libs }} This will have to be `${{ matrix.extra-libs }}` ------------- PR Review: https://git.openjdk.org/jdk/pull/18294#pullrequestreview-1936078175 PR Review Comment: https://git.openjdk.org/jdk/pull/18294#discussion_r1524483909 From mli at openjdk.org Thu Mar 14 09:14:04 2024 From: mli at openjdk.org (Hamlin Li) Date: Thu, 14 Mar 2024 09:14:04 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v2] In-Reply-To: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> Message-ID: > Hi, > Can you help to review this patch? > Thanks > > This is a continuation of work based on [1] by @XiaohongGong, most work was done in that pr. 
In this new pr, just rebased the code in [1], then added some minor changes (renaming of calling method, add libsleef as extra lib in CI cross-build on aarch64 in github workflow); I aslo tested the combination of following scenarios: > * at build time > * with/without sleef > * with/without sve support > * at runtime > * with/without sleef > * with/without sve support > > [1] https://github.com/openjdk/jdk/pull/16234 > > ## Regression Test > * test/jdk/jdk/incubator/vector/ > * test/hotspot/jtreg/compiler/vectorapi/ > > ## Performance Test > Previously, @XiaohongGong has shared the data: https://github.com/openjdk/jdk/pull/16234#issuecomment-1767727028 Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: fix variable name in github workflow ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18294/files - new: https://git.openjdk.org/jdk/pull/18294/files/55b118d9..668d08ce Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18294&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18294&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18294.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18294/head:pull/18294 PR: https://git.openjdk.org/jdk/pull/18294 From mli at openjdk.org Thu Mar 14 09:14:05 2024 From: mli at openjdk.org (Hamlin Li) Date: Thu, 14 Mar 2024 09:14:05 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v2] In-Reply-To: <98GhMjBc3vlkUZ4PFFzmPj41OW3eCb76nYV6goWC5VM=.36d55804-ea1f-4c1e-833e-55c14a84c6dd@github.com> References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <98GhMjBc3vlkUZ4PFFzmPj41OW3eCb76nYV6goWC5VM=.36d55804-ea1f-4c1e-833e-55c14a84c6dd@github.com> Message-ID: On Thu, 14 Mar 2024 08:59:09 GMT, Ludovic Henry wrote: >> Hamlin Li has updated the pull request incrementally with one additional commit since the 
last revision: >> >> fix variable name in github workflow > > .github/workflows/build-cross-compile.yml line 138: > >> 136: --arch=${{ matrix.debian-arch }} >> 137: --verbose >> 138: --include=fakeroot,symlinks,build-essential,libx11-dev,libxext-dev,libxrender-dev,libxrandr-dev,libxtst-dev,libxt-dev,libcups2-dev,libfontconfig1-dev,libasound2-dev,libfreetype-dev,libpng-dev,${{ extra-libs }} > > This will have to be `${{ matrix.extra-libs }}` Thanks, fixed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18294#discussion_r1524497586 From dchuyko at openjdk.org Thu Mar 14 09:22:47 2024 From: dchuyko at openjdk.org (Dmitry Chuyko) Date: Thu, 14 Mar 2024 09:22:47 GMT Subject: RFR: 8309271: A way to align already compiled methods with compiler directives [v31] In-Reply-To: References: Message-ID: On Wed, 13 Mar 2024 17:31:32 GMT, Dmitry Chuyko wrote: >> Compiler Control (https://openjdk.org/jeps/165) provides method-context dependent control of the JVM compilers (C1 and C2). The active directive stack is built from the directive files passed with the `-XX:CompilerDirectivesFile` diagnostic command-line option and the Compiler.add_directives diagnostic command. It is also possible to clear all directives or remove the top from the stack. >> >> A matching directive will be applied at method compilation time when such compilation is started. If directives are added or changed, but compilation does not start, then the state of compiled methods doesn't correspond to the rules. This is not an error, and it happens in long running applications when directives are added or removed after compilation of methods that could be matched. For example, the user decides that C2 compilation needs to be disabled for some method due to a compiler bug, issues such a directive but this does not affect the application behavior. In such case, the target application needs to be restarted, and such an operation can have high costs and risks. 
Another goal is testing/debugging compilers.
>>
>> It would be convenient to optionally reconcile at least existing matching nmethods to the current stack of compiler directives (so bypass inlined methods).
>>
>> The natural way to eliminate the discrepancy between the result of compilation and the broken rule is to discard the compilation result, i.e. deoptimization. Prior to that we can try to re-compile the method, letting the compile broker perform it with the new directives stack taken into account. Re-compilation helps to prevent hot methods from executing in the interpreter.
>>
>> A new flag `-r` has been introduced for some directive-related compiler commands: `Compiler.add_directives`, `Compiler.remove_directives`, `Compiler.clear_directives`. The default behavior has not changed (no flag). If the new flag is present, the command scans already compiled methods and queues methods that have any active non-default matching compiler directives for re-compilation if possible, otherwise marks them for deoptimization. There is currently no distinction as to which directives are found. In particular, this means that if there are rules for inlining into some method, it will be refreshed. On the other hand, if there are rules for a method and it was inlined, top-level methods won't be refreshed, but this can be achieved by having rules for them.
>>
>> In addition, a new diagnostic command `Compiler.replace_directives...
>
> Dmitry Chuyko has updated the pull request incrementally with one additional commit since the last revision:
>
>   No dots in -r descriptions

Thank you, Seguei and Tobias.
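
For reference, the scenario in the quoted summary (disabling C2 for one method in a running VM) might be expressed with a directives file like the sketch below. This is only an illustration in the JEP 165 directives format: the method pattern `com/example/Foo.bar` and the file name `directives.json` are hypothetical and do not come from this thread.

```
[
  {
    // Hypothetical method pattern, chosen only for illustration.
    match: "com/example/Foo.bar",
    // Keep C2 from compiling the matched method.
    c2: {
      Exclude: true
    }
  }
]
```

Such a file could then presumably be applied with something like `jcmd <pid> Compiler.add_directives -r directives.json`. The placement of `-r` before the file name is an assumption based on the usual jcmd option syntax, not something stated in this thread. Without `-r`, the directive only takes effect for compilations that start after it is added, as the summary explains.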
------------- PR Comment: https://git.openjdk.org/jdk/pull/14111#issuecomment-1996995661 From mbaesken at openjdk.org Thu Mar 14 09:23:44 2024 From: mbaesken at openjdk.org (Matthias Baesken) Date: Thu, 14 Mar 2024 09:23:44 GMT Subject: RFR: 8327990: [macosx-aarch64] JFR enters VM without WXWrite In-Reply-To: References: Message-ID: On Wed, 13 Mar 2024 16:40:33 GMT, Richard Reingruber wrote: > @MBaesken found 2 more locations in jvmti that need switching to `WXWrite` > > JvmtiExport::get_jvmti_interface GetCarrierThread > > Both use `ThreadInVMfromNative`. Should we address those 2 more findings in this PR ? Or open a separate JBS issue ? btw those were the jtreg tests triggering the 2 additional findings / asserts runtime/Thread/AsyncExceptionOnMonitorEnter.java runtime/Thread/AsyncExceptionTest.java serviceability/jvmti/RedefineClasses/RedefineSharedClassJFR.java runtime/handshake/HandshakeDirectTest.java runtime/handshake/SuspendBlocked.java runtime/jni/terminatedThread/TestTerminatedThread.java runtime/lockStack/TestStackWalk.java serviceability/jvmti/vthread/GetThreadState/GetThreadStateTest.java#default serviceability/jvmti/vthread/GetThreadState/GetThreadStateTest.java#no-vmcontinuations serviceability/jvmti/vthread/GetThreadStateMountedTest/GetThreadStateMountedTest.java serviceability/jvmti/vthread/RawMonitorTest/RawMonitorTest.java serviceability/jvmti/vthread/SuspendWithInterruptLock/SuspendWithInterruptLock.java#default serviceability/jvmti/vthread/SuspendWithInterruptLock/SuspendWithInterruptLock.java#xint serviceability/jvmti/vthread/ThreadStateTest/ThreadStateTest.java serviceability/jvmti/vthread/WaitNotifySuspendedVThreadTest/WaitNotifySuspendedVThreadTest.java ------------- PR Comment: https://git.openjdk.org/jdk/pull/18238#issuecomment-1996996689 From dchuyko at openjdk.org Thu Mar 14 09:26:00 2024 From: dchuyko at openjdk.org (Dmitry Chuyko) Date: Thu, 14 Mar 2024 09:26:00 GMT Subject: RFR: 8309271: A way to align already compiled methods with 
compiler directives [v32] In-Reply-To: References: Message-ID: > Compiler Control (https://openjdk.org/jeps/165) provides method-context dependent control of the JVM compilers (C1 and C2). The active directive stack is built from the directive files passed with the `-XX:CompilerDirectivesFile` diagnostic command-line option and the Compiler.add_directives diagnostic command. It is also possible to clear all directives or remove the top from the stack. > > A matching directive will be applied at method compilation time when such compilation is started. If directives are added or changed, but compilation does not start, then the state of compiled methods doesn't correspond to the rules. This is not an error, and it happens in long running applications when directives are added or removed after compilation of methods that could be matched. For example, the user decides that C2 compilation needs to be disabled for some method due to a compiler bug, issues such a directive but this does not affect the application behavior. In such case, the target application needs to be restarted, and such an operation can have high costs and risks. Another goal is testing/debugging compilers. > > It would be convenient to optionally reconcile at least existing matching nmethods to the current stack of compiler directives (so bypass inlined methods). > > Natural way to eliminate the discrepancy between the result of compilation and the broken rule is to discard the compilation result, i.e. deoptimization. Prior to that we can try to re-compile the method letting compile broker to perform it taking new directives stack into account. Re-compilation helps to prevent hot methods from execution in the interpreter. > > A new flag `-r` has beed introduced for some directives related to compile commands: `Compiler.add_directives`, `Compiler.remove_directives`, `Compiler.clear_directives`. The default behavior has not changed (no flag). 
If the new flag is present, the command scans already compiled methods and puts methods that have any active non-default matching compiler directives to re-compilation if possible, otherwise marks them for deoptimization. There is currently no distinction which directives are found. In particular, this means that if there are rules for inlining into some method, it will be refreshed. On the other hand, if there are rules for a method and it was inlined, top-level methods won't be refreshed, but this can be achieved by having rules for them. > > In addition, a new diagnostic command `Compiler.replace_directives`, has been added for ... Dmitry Chuyko has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 50 commits: - Merge branch 'openjdk:master' into compiler-directives-force-update - No dots in -r descriptions - Merge branch 'openjdk:master' into compiler-directives-force-update - Resolved master conflicts - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - ... 
and 40 more: https://git.openjdk.org/jdk/compare/49ce85fa...eb4ed2ea ------------- Changes: https://git.openjdk.org/jdk/pull/14111/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14111&range=31 Stats: 381 lines in 15 files changed: 348 ins; 3 del; 30 mod Patch: https://git.openjdk.org/jdk/pull/14111.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14111/head:pull/14111 PR: https://git.openjdk.org/jdk/pull/14111 From tschatzl at openjdk.org Thu Mar 14 09:56:02 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 14 Mar 2024 09:56:02 GMT Subject: RFR: 8289822: Make concurrent mark code owner of TAMSes [v2] In-Reply-To: References: Message-ID: > Hi all, > > please review this change that moves TAMS ownership and handling out of `HeapRegion` into `G1ConcurrentMark`. > > The rationale is that it's intrinsically a data structure only used by the marking, and only interesting during marking, so attaching it to the always present `HeapRegion` isn't fitting. > > This also moves some code that is about marking decisions out of `HeapRegion`. > > In this change the TAMSes are still maintained and kept up to date as long as the `HeapRegion` exists (basically always) as the snapshotting of the marking does not really snapshot ("copy over the `HeapRegion`s that existed at the time of the snapshot, only ever touching them), but assumes that they exist even for uncommitted (or free) regions due to the way it advances the global `finger` pointer. > I did not want to change this here. > > This is also why `HeapRegion` still tells `G1ConcurrentMark` to update the TAMS in two places (when clearing a region and when resetting during full gc). > > Another option I considered when implementing this is clearing all marking statistics (`G1ConcurrentMark::clear_statistics`) at these places in `HeapRegion` because it decreases the amount of places where this needs to be done (maybe one or the other place isn't really necessary already). 
I rejected it because this would add some work that is dependent on the number of worker threads (particularly) into `HeapRegion::hr_clear()`. This change can be looked at by viewing at all but the last commit. > > Testing: tier1-4, gha > > Thanks, > Thomas Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains six commits: - Merge branch 'master' into 8289822-concurrent-mark-tams-owner - Do the mark information reset more fine grained (like before this change) to not potentially make HeapRegion::hr_clear() too slow - Fixes - some attempts to not have TAMSes always be updated for all regions - Remove _top_at_mark_start HeapRegion member - 8326781 initial version initial draft some improvmeents Make things work forgotten fix ------------- Changes: https://git.openjdk.org/jdk/pull/18150/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18150&range=01 Stats: 194 lines in 16 files changed: 73 ins; 68 del; 53 mod Patch: https://git.openjdk.org/jdk/pull/18150.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18150/head:pull/18150 PR: https://git.openjdk.org/jdk/pull/18150 From ayang at openjdk.org Thu Mar 14 10:25:41 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 14 Mar 2024 10:25:41 GMT Subject: RFR: 8289822: Make concurrent mark code owner of TAMSes [v2] In-Reply-To: References: Message-ID: On Thu, 14 Mar 2024 09:56:02 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this change that moves TAMS ownership and handling out of `HeapRegion` into `G1ConcurrentMark`. >> >> The rationale is that it's intrinsically a data structure only used by the marking, and only interesting during marking, so attaching it to the always present `HeapRegion` isn't fitting. >> >> This also moves some code that is about marking decisions out of `HeapRegion`. 
>> >> In this change the TAMSes are still maintained and kept up to date as long as the `HeapRegion` exists (basically always) as the snapshotting of the marking does not really snapshot ("copy over the `HeapRegion`s that existed at the time of the snapshot, only ever touching them), but assumes that they exist even for uncommitted (or free) regions due to the way it advances the global `finger` pointer. >> I did not want to change this here. >> >> This is also why `HeapRegion` still tells `G1ConcurrentMark` to update the TAMS in two places (when clearing a region and when resetting during full gc). >> >> Another option I considered when implementing this is clearing all marking statistics (`G1ConcurrentMark::clear_statistics`) at these places in `HeapRegion` because it decreases the amount of places where this needs to be done (maybe one or the other place isn't really necessary already). I rejected it because this would add some work that is dependent on the number of worker threads (particularly) into `HeapRegion::hr_clear()`. This change can be looked at by viewing at all but the last commit. >> >> Testing: tier1-4, gha >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains six commits: > > - Merge branch 'master' into 8289822-concurrent-mark-tams-owner > - Do the mark information reset more fine grained (like before this change) to not potentially make HeapRegion::hr_clear() too slow > - Fixes > - some attempts to not have TAMSes always be updated for all regions > - Remove _top_at_mark_start HeapRegion member > - 8326781 > > initial version > > initial draft > > some improvmeents > > Make things work > > forgotten fix Maybe add `G1: ` in the title? src/hotspot/share/gc/g1/g1ConcurrentMark.cpp line 592: > 590: if (!_g1h->collector_state()->mark_or_rebuild_in_progress()) { > 591: return; > 592: } Seem unnecessary. 
src/hotspot/share/gc/g1/g1ConcurrentMark.hpp line 543: > 541: // Top pointer for each region at the start of marking. Must be valid for all committed > 542: // regions. > 543: HeapWord* volatile* _top_at_mark_starts; I wonder if one can use `HeapWord* volatile _top_at_mark_starts[];` to emphasize that it's an array. src/hotspot/share/gc/g1/g1ConcurrentMark.hpp line 569: > 567: > 568: inline HeapWord* top_at_mark_start(const HeapRegion* r) const; > 569: inline HeapWord* top_at_mark_start(uint region) const; Maybe call it `region_index` or sth alike? (In the impl of these methods, having `region` being the index is a bit confusing, IMO.) src/hotspot/share/memory/iterator.hpp line 44: > 42: // The following classes are C++ `closures` for iterating over objects, roots and spaces > 43: > 44: class Closure : public CHeapObj { }; Why is this required (in this PR)? ------------- PR Review: https://git.openjdk.org/jdk/pull/18150#pullrequestreview-1936235133 PR Review Comment: https://git.openjdk.org/jdk/pull/18150#discussion_r1524581727 PR Review Comment: https://git.openjdk.org/jdk/pull/18150#discussion_r1524590902 PR Review Comment: https://git.openjdk.org/jdk/pull/18150#discussion_r1524592065 PR Review Comment: https://git.openjdk.org/jdk/pull/18150#discussion_r1524599389 From kbarrett at openjdk.org Thu Mar 14 10:31:46 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 14 Mar 2024 10:31:46 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v6] In-Reply-To: References: Message-ID: On Thu, 11 Jan 2024 13:23:45 GMT, Julian Waters wrote: >> Compile the JDK as C++17, enabling the use of all C++17 language features > > Julian Waters has updated the pull request incrementally with one additional commit since the last revision: > > Require clang 13 in toolchain.m4 As I said earlier, I think the change to build with C++17 needs motivation, probably in the form of proposed changes to the style guide to describe new features whose use is permitted (or forbidden!) 
in HotSpot. There is also non-HotSpot C++ code in the JDK, and buy-in from the relevant teams will also be needed for that, or decide to limit the scope of the change. (I'm not really a fan of that, but mentioning it as an option.) ------------- PR Comment: https://git.openjdk.org/jdk/pull/14988#issuecomment-1997122909 From rehn at openjdk.org Thu Mar 14 11:22:43 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 14 Mar 2024 11:22:43 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v2] In-Reply-To: References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> Message-ID: On Thu, 14 Mar 2024 09:14:04 GMT, Hamlin Li wrote: >> Hi, >> Can you help to review this patch? >> Thanks >> >> This is a continuation of work based on [1] by @XiaohongGong, most work was done in that pr. In this new pr, just rebased the code in [1], then added some minor changes (renaming of calling method, add libsleef as extra lib in CI cross-build on aarch64 in github workflow); I aslo tested the combination of following scenarios: >> * at build time >> * with/without sleef >> * with/without sve support >> * at runtime >> * with/without sleef >> * with/without sve support >> >> [1] https://github.com/openjdk/jdk/pull/16234 >> >> ## Regression Test >> * test/jdk/jdk/incubator/vector/ >> * test/hotspot/jtreg/compiler/vectorapi/ >> >> ## Performance Test >> Previously, @XiaohongGong has shared the data: https://github.com/openjdk/jdk/pull/16234#issuecomment-1767727028 > > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > fix variable name in github workflow Hi, thanks for continuing with this. As this is similar to SVML, comments applies to x86 also: - There is no way to stop the VM from trying to load vmath ? There is both a UseNeon and a UseSVE, if I set both to false I would expect no vector and no vmath. 
The issue with UseNeon not really guarding NEON code, but just CRC, seems like a pre-existing bug. A flag like 'UseNeon' should turn it on/off :)
- Opening the dll from the stub routines seems wrong.

@dholmes-ora could I get your opinion here?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-1997219968

From stefan.karlsson at oracle.com  Thu Mar 14 11:24:27 2024
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Thu, 14 Mar 2024 12:24:27 +0100
Subject: SafeFetch leads to hard crash on macOS 14.4
In-Reply-To: <6C0488D6-45F2-4D88-84D4-994AC74746A2@jetbrains.com>
References: <8e8a8845-a81e-49c1-8b52-0e4b0cd6a0c0@oracle.com> <6252751b-a3aa-4910-8914-1d554bac6f79@oracle.com> <5e4764c3-6675-4490-aa8f-9a5141f10277@oracle.com> <6C0488D6-45F2-4D88-84D4-994AC74746A2@jetbrains.com>
Message-ID: <0a935e75-cf40-48f7-9837-3ccb8e8539e5@oracle.com>

On 2024-03-14 09:49, Vitaly Provodin wrote:
> Stefan,
>
>>> With the patch the version string is successfully printed - VM was
>>> not killed.
>>
>> Did you run this on 14.4? Did you also run this on an AArch64 machine?
>
> Both questions: Yes

OK. Thanks.

> I can recheck it again (if it is needed). There may be something wrong
> with my build...

I think we have enough information at this point, so I don't think you need to dig further into this. Thanks for helping out.

StefanK

>
> Vitaly
>
>
>> On 13. Mar 2024, at 13:27, Stefan Karlsson
>> wrote:
>>
>> Hi Vitaly,
>>
>> On 2024-03-12 23:44, Vitaly Provodin wrote:
>>> Hi Stefan, Maxim,
>>>
>>> With the patch the version string is successfully printed - VM was
>>> not killed.
>>
>> Did you run this on 14.4? Did you also run this on an AArch64 machine?
>>
>> FWIW, I updated a machine to 14.4 and could reproduce the reported
>> problem with a patch similar to the one below. I also managed to
>> write a stand-alone C reproducer, which I have added to the bug report.
>>
>> Thanks,
>> StefanK
>>
>>>
>>> Thanks,
>>> Vitaly
>>>
>>>
>>>> On 12.
Mar 2024, at 17:19, Stefan Karlsson
>>>> wrote:
>>>>
>>>> Hi again,
>>>>
>>>> I just want to clarify that the change above just removes *one*
>>>> usage of SafeFetch. My guess is that all other usages of SafeFetch
>>>> are still affected by the issue you are seeing.
>>>>
>>>> It would be interesting to see if you could verify this problem by
>>>> applying this patch:
>>>>
>>>> diff --git a/src/hotspot/share/memory/universe.cpp b/src/hotspot/share/memory/universe.cpp
>>>> index 2a3d532725f..b58c52e415b 100644
>>>> --- a/src/hotspot/share/memory/universe.cpp
>>>> +++ b/src/hotspot/share/memory/universe.cpp
>>>> @@ -842,7 +842,14 @@ jint universe_init() {
>>>>    return JNI_OK;
>>>>  }
>>>>
>>>> +#include "runtime/safefetch.hpp"
>>>> +
>>>>  jint Universe::initialize_heap() {
>>>> +  char* mem = os::reserve_memory(16 * K, false, mtGC);
>>>> +  intptr_t value = SafeFetchN((intptr_t*)mem, 1);
>>>> +
>>>> +  log_info(gc)("Reserved memory read: " PTR_FORMAT " " PTR_FORMAT, p2i(mem), value);
>>>> +
>>>>    assert(_collectedHeap == nullptr, "Heap already created");
>>>>    _collectedHeap = GCConfig::arguments()->create_heap();
>>>>
>>>>
>>>> and running:
>>>> build//jdk/bin/java -version
>>>>
>>>> This works on 14.3.1, but it would be interesting to see what
>>>> happens on 14.4, and if that shuts down the VM before it is
>>>> printing the version string.
>>>>
>>>> StefanK
>>>>
>>>> On 2024-03-12 09:34, Stefan Karlsson wrote:
>>>>> Hi Maxim,
>>>>>
>>>>> On 2024-03-12 08:12, Maxim Kartashev wrote:
>>>>>> Hello!
>>>>>>
>>>>>> This has been recently filed as
>>>>>> https://bugs.openjdk.org/browse/JDK-8327860 but I also wanted to
>>>>>> check with the community if any further info on the issue is
>>>>>> available.
>>>>>> >>>>>> It looks like in some cases SafeFetch is directed to fetch >>>>>> something the OS really doesn't want it to and, instead of >>>>>> promoting the error to a signal and letting the application (JVM) >>>>>> deal with it, immediately terminates the application. All we've >>>>>> got is the OS crash report, not even the JVM's fatal error log. >>>>>> This looks like an application security precaution, which I am >>>>>> not at all familiar with. >>>>>> >>>>>> The relevant pieces of the crash log are below. Is anybody >>>>>> familiar with "Namespace GUARD" termination reason and maybe >>>>>> other related novelties of macOS 14.4? The error was not reported >>>>>> before upgrading to 14.4 >>>>> I don't have an answer for this, but a note below: >>>>> >>>>>> Thanks in advance, >>>>>> >>>>>> Maxim. >>>>>> >>>>>> 0 libjvm.dylib 0x1062d6ec0 _SafeFetchN_fault + 0 >>>>>> 1 libjvm.dylib 0x1062331a4 ObjectMonitor::TrySpin(JavaThread*) + 408 >>>>>> 2 libjvm.dylib 0x106232b44 ObjectMonitor::enter(JavaThread*) + 228 >>>>>> 3 libjvm.dylib 0x10637436c ObjectSynchronizer::enter(Handle, >>>>>> BasicLock*, JavaThread*) + 392 >>>>> FYI: I think that this specific call to SafeFetch was recently >>>>> removed by: >>>>> https://urldefense.com/v3/__https://github.com/openjdk/jdk/commit/29397d29baac3b29083b1b5d6b2cb06e456af0c3__;!!ACWV5N9M2RV99hQ!MxH9PbSoMKT5I4e02Kkz2Sk1O3hStNxJuKBvsvK5GLi_k_rusbuYduHFzMUqPmzB5FZgTBpgdghtEhMW2QRCYe45o9VfzGctxA$ >>>>> https://bugs.openjdk.org/browse/JDK-8320317 >>>>> >>>>> Cheers, >>>>> StefanK >>>>> >>>>> >>>>>> ... 
>>>>>>
>>>>>> Exception Type:        EXC_BAD_ACCESS (SIGKILL)
>>>>>> Exception Codes:       KERN_PROTECTION_FAILURE at 0x00000001004f4000
>>>>>> Exception Codes:       0x0000000000000002, 0x00000001004f4000
>>>>>>
>>>>>> Termination Reason:    Namespace GUARD, Code 5
>>>>>>
>>>>>> VM Region Info: 0x1004f4000 is in 0x1004f4000-0x1004f8000;  bytes after start: 0  bytes before end: 16383
>>>>>>       REGION TYPE                    START - END         [ VSIZE] PRT/MAX SHRMOD  REGION DETAIL
>>>>>>       mapped file                 1004e4000-1004f4000    [ 64K] r--/r-- SM=COW  Object_id=fa8d88e7
>>>>>> --->  VM_ALLOCATE                 1004f4000-1004f8000    [ 16K] ---/rwx SM=NUL
>>>>>>       VM_ALLOCATE                 1004f8000-1004fc000    [ 16K] r--/rwx SM=PRV

From ihse at openjdk.org Thu Mar 14 12:15:08 2024
From: ihse at openjdk.org (Magnus Ihse Bursie)
Date: Thu, 14 Mar 2024 12:15:08 GMT
Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v2]
In-Reply-To:
References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com>
Message-ID:

On Thu, 14 Mar 2024 09:14:04 GMT, Hamlin Li wrote:

>> Hi,
>> Can you help to review this patch?
>> Thanks
>>
>> This is a continuation of work based on [1] by @XiaohongGong, most work was done in that pr.
In this new pr, just rebased the code in [1], then added some minor changes (renaming of calling method, add libsleef as extra lib in CI cross-build on aarch64 in github workflow); I aslo tested the combination of following scenarios: >> * at build time >> * with/without sleef >> * with/without sve support >> * at runtime >> * with/without sleef >> * with/without sve support >> >> [1] https://github.com/openjdk/jdk/pull/16234 >> >> ## Regression Test >> * test/jdk/jdk/incubator/vector/ >> * test/hotspot/jtreg/compiler/vectorapi/ >> >> ## Performance Test >> Previously, @XiaohongGong has shared the data: https://github.com/openjdk/jdk/pull/16234#issuecomment-1767727028 > > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > fix variable name in github workflow doc/building.md line 640: > 638: > 639: libsleef, the [SIMD Library for Evaluating Elementary Functions]( > 640: https://sleef.org/) is optional. But it will provide performance enhancement Suggestion: https://sleef.org/) is optional, but it will provide performance enhancement ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18294#discussion_r1524732845 From ihse at openjdk.org Thu Mar 14 12:28:10 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Thu, 14 Mar 2024 12:28:10 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v2] In-Reply-To: References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> Message-ID: On Thu, 14 Mar 2024 09:14:04 GMT, Hamlin Li wrote: >> Hi, >> Can you help to review this patch? >> Thanks >> >> This is a continuation of work based on [1] by @XiaohongGong, most work was done in that pr. 
In this new pr, just rebased the code in [1], then added some minor changes (renaming of calling method, add libsleef as extra lib in CI cross-build on aarch64 in github workflow); I aslo tested the combination of following scenarios: >> * at build time >> * with/without sleef >> * with/without sve support >> * at runtime >> * with/without sleef >> * with/without sve support >> >> [1] https://github.com/openjdk/jdk/pull/16234 >> >> ## Regression Test >> * test/jdk/jdk/incubator/vector/ >> * test/hotspot/jtreg/compiler/vectorapi/ >> >> ## Performance Test >> Previously, @XiaohongGong has shared the data: https://github.com/openjdk/jdk/pull/16234#issuecomment-1767727028 > > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > fix variable name in github workflow make/autoconf/flags-cflags.m4 line 994: > 992: # for SVE. Set SVE_CFLAGS to -march=armv8-a+sve if it does. Empty otherwise. > 993: # ACLE and this flag are required to build the Arm SVE related functions in > 994: # libvmath. Suggestion: # Check whether the compiler supports the Arm C Language Extensions (ACLE) # for SVE. Set SVE_CFLAGS to -march=armv8-a+sve if it does. # ACLE and this flag are required to build the aarch64 SVE related functions in # libvectormath. src/jdk.incubator.vector/linux/native/libvmath/vect_math.h line 1: > 1: /* I'd just like to raise the question of naming. Right now the terms "vmath", "vect_math" and "vector_math" seems to be used interchangeably. I think it would be good to standardize on one name, and my suggestion is to go with the complete name -- that fits with a general theme in Java. It has the benefit that if there are many ways to abbreviate a term, there is only one way to not abbreviate it. On the other hand, using `_` in library names is not really that common, so I'd suggest this file should be named `libvectormath/vector_math.h`. (And correspondingly for the other files.) 
src/jdk.incubator.vector/linux/native/libvmath/vect_math.h line 31: > 29: #endif > 30: > 31: #ifndef VMATH_EXPORT You can just use `JNIEXPORT` instead of trying to brew your own solution. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18294#discussion_r1524748090 PR Review Comment: https://git.openjdk.org/jdk/pull/18294#discussion_r1524745947 PR Review Comment: https://git.openjdk.org/jdk/pull/18294#discussion_r1524746666 From dchuyko at openjdk.org Thu Mar 14 12:41:53 2024 From: dchuyko at openjdk.org (Dmitry Chuyko) Date: Thu, 14 Mar 2024 12:41:53 GMT Subject: Integrated: 8309271: A way to align already compiled methods with compiler directives In-Reply-To: References: Message-ID: On Wed, 24 May 2023 00:38:27 GMT, Dmitry Chuyko wrote: > Compiler Control (https://openjdk.org/jeps/165) provides method-context dependent control of the JVM compilers (C1 and C2). The active directive stack is built from the directive files passed with the `-XX:CompilerDirectivesFile` diagnostic command-line option and the Compiler.add_directives diagnostic command. It is also possible to clear all directives or remove the top from the stack. > > A matching directive will be applied at method compilation time when such compilation is started. If directives are added or changed, but compilation does not start, then the state of compiled methods doesn't correspond to the rules. This is not an error, and it happens in long running applications when directives are added or removed after compilation of methods that could be matched. For example, the user decides that C2 compilation needs to be disabled for some method due to a compiler bug, issues such a directive but this does not affect the application behavior. In such case, the target application needs to be restarted, and such an operation can have high costs and risks. Another goal is testing/debugging compilers. 
>
> It would be convenient to optionally reconcile at least existing matching nmethods to the current stack of compiler directives (so bypass inlined methods).
>
> A natural way to eliminate the discrepancy between the result of compilation and the broken rule is to discard the compilation result, i.e. deoptimization. Prior to that we can try to re-compile the method, letting the compile broker perform it while taking the new directives stack into account. Re-compilation helps to prevent hot methods from executing in the interpreter.
>
> A new flag `-r` has been introduced for some directives related to compile commands: `Compiler.add_directives`, `Compiler.remove_directives`, `Compiler.clear_directives`. The default behavior has not changed (no flag). If the new flag is present, the command scans already compiled methods and puts methods that have any active non-default matching compiler directives to re-compilation if possible, otherwise marks them for deoptimization. There is currently no distinction which directives are found. In particular, this means that if there are rules for inlining into some method, it will be refreshed. On the other hand, if there are rules for a method and it was inlined, top-level methods won't be refreshed, but this can be achieved by having rules for them.
>
> In addition, a new diagnostic command `Compiler.replace_directives`, has been added for ...

This pull request has now been integrated.
Changeset: c879627d
Author: Dmitry Chuyko
URL: https://git.openjdk.org/jdk/commit/c879627dbd7e9295d44f19ef237edb5de10805d5
Stats: 381 lines in 15 files changed: 348 ins; 3 del; 30 mod

8309271: A way to align already compiled methods with compiler directives

Reviewed-by: apangin, sspitsyn, tholenstein

-------------

PR: https://git.openjdk.org/jdk/pull/14111

From rkennke at openjdk.org Thu Mar 14 12:45:39 2024
From: rkennke at openjdk.org (Roman Kennke)
Date: Thu, 14 Mar 2024 12:45:39 GMT
Subject: RFR: 8328138: Optimize ArrayEquals on AArch64 & fix potential crash
In-Reply-To:
References:
Message-ID:

On Thu, 14 Mar 2024 06:21:56 GMT, Xiaowei Lu wrote:

> Current implementation of ArrayEquals on AArch64 is quite complex, due to the variety of checks about alignment, tail processing, bus locking and so on. However, modern Arm processors have eased such worries. Besides, we found a crash when using Lilliput. So we proposed to use a simple & straightforward flow of ArrayEquals.
> With this simplified ArrayEquals, we observed performance gains on the latest Arm platforms (Neoverse N1 & N2)
> Test case: org.openjdk.bench.java.util.ArraysEquals
>
> 1x vector length, 64-bit aligned array[0]
> | Test Case | N1 | N2 |
> |:----------------------:|:---------:|:---------:|
> | testByteFalseBeginning | -21.42% | -13.37% |
> | testByteFalseEnd | 25.79% | 27.45% |
> | testByteFalseMid | 16.64% | 16.46% |
> | testByteTrue | 12.39% | 24.66% |
> | testCharFalseBeginning | -5.27% | -3.08% |
> | testCharFalseEnd | 29.29% | 35.23% |
> | testCharFalseMid | 15.13% | 19.34% |
> | testCharTrue | 21.63% | 33.73% |
> | Total | 11.77% | 17.55% |
>
> A key factor is to decide when we should utilize simd in array equals. An aggressive choice is to enable simd as long as array length exceeds vector length (8 words). The corresponding result is shown above, from which we can see performance regression in both testBeginning cases. To avoid such perf impact, we can set simd threshold to 3x vector length.
>
> 3x vector length, 64-bit aligned array[0]
> | Test Case | N1 | N2 |
> |:----------------------:|:---------:|:---------:|
> | testByteFalseBeginning | 8.28% | 8.64% |
> | testByteFalseEnd | 6.38% | 12.29% |
> | testByteFalseMid | 6.17% | 7.96% |
> | testByteTrue | -10.08% | 3.06% |
> | testCharFalseBeginning | -1.42% | 7.23% |
> | testCharFalseEnd | 4.05% | 13.48% |
> | testCharFalseMid | 8.79% | 16.96% |
> | testCharTrue | -5.66% | 10.23% |
> | Total | 2.06% | 9.98% |
>
>
> In addition to perf improvement, we propose this patch to solve alignment issues in array equals. JDK-8139457 tries to relax alignment of array elements. On the other hand, this misalignment makes it an error to read the whole last word in array equals, in case that the array doesn't occupy the whole word and Lilliput is enabled. A detailed explanation quoted from [https://github.com/openjdk/jdk/pull/11044#issuecomment-1996771480](url)
>
>> The root cause is that default behavior of MacroAss...

I get the following results on a Graviton2; the columns are different vector-length multipliers. There doesn't seem to be a clear winner. 1 or 2 may be preferable.

| Test Case | 1 | 2 | 3 | 4 |
| -- | -- | -- | -- | -- |
| testByteFalseBeginning | -8.36% | 16.64% | 16.58% | 16.60% |
| testByteFalseEnd | 30.76% | -0.01% | -0.01% | -0.01% |
| testByteFalseMid | 23.42% | -0.01% | -0.01% | -0.35% |
| testByteTrue | 11.53% | -23.09% | -23.09% | -23.08% |
| testCharFalseBeginning | -0.02% | 0.00% | -2.30% | -0.06% |
| testCharFalseEnd | 32.62% | 32.44% | 0.02% | 0.02% |
| testCharFalseMid | 21.03% | 20.75% | 7.14% | 6.94% |
| testCharTrue | 23.90% | 23.90% | -10.95% | -10.88% |

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18292#issuecomment-1997371529

From mbaesken at openjdk.org Thu Mar 14 14:51:37 2024
From: mbaesken at openjdk.org (Matthias Baesken)
Date: Thu, 14 Mar 2024 14:51:37 GMT
Subject: RFR: 8327990: [macosx-aarch64] JFR enters VM without WXWrite
In-Reply-To:
References:
Message-ID:

On Tue, 12 Mar 2024 15:05:00 GMT, Richard Reingruber wrote:

> This pr changes `JfrJvmtiAgent::retransform_classes()` and `jfr_set_enabled` to switch to `WXWrite` before transitioning to the vm.
>
> Testing:
> make test TEST=jdk/jfr/event/runtime/TestClassLoadEvent.java TEST_VM_OPTS=-XX:+AssertWXAtThreadSync
> make test TEST=compiler/intrinsics/klass/CastNullCheckDroppingsTest.java TEST_VM_OPTS=-XX:+AssertWXAtThreadSync
>
> More tests are pending.

I noticed a few more asserts (assert(_wx_state == expected) failed: wrong state) in the jfr area (jdk tier3 jfr tests). E.g.
V [libjvm.dylib+0x8a5d94] JavaThread::check_for_valid_safepoint_state()+0x0 V [libjvm.dylib+0x3e95b4] ThreadStateTransition::transition_from_native(JavaThread*, JavaThreadState, bool)+0x174 V [libjvm.dylib+0x3e93d0] ThreadInVMfromNative::ThreadInVMfromNative(JavaThread*)+0x70 V [libjvm.dylib+0x91a578] JfrRecorderService::emit_leakprofiler_events(long long, bool, bool)+0xcc and V [libjvm.dylib+0x8a5d94] JavaThread::check_for_valid_safepoint_state()+0x0 V [libjvm.dylib+0x3e95b4] ThreadStateTransition::transition_from_native(JavaThread*, JavaThreadState, bool)+0x174 V [libjvm.dylib+0x3e93d0] ThreadInVMfromNative::ThreadInVMfromNative(JavaThread*)+0x70 V [libjvm.dylib+0x8d7f74] JfrJavaEventWriter::flush(_jobject*, int, int, JavaThread*)+0xf8 j jdk.jfr.internal.JVM.flush(Ljdk/jfr/internal/event/EventWriter;II)V+0 jdk.jfr at 23-internal j jdk.jfr.internal.event.EventWriter.flush(II)V+3 jdk.jfr at 23-internal ------------- PR Comment: https://git.openjdk.org/jdk/pull/18238#issuecomment-1997635467 From aph at openjdk.org Thu Mar 14 15:28:42 2024 From: aph at openjdk.org (Andrew Haley) Date: Thu, 14 Mar 2024 15:28:42 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v2] In-Reply-To: References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> Message-ID: <--oIyncVgEWC58CJipBbISrc1JiX1dr2nfZs8pDCOao=.5b4f156c-6207-4480-9408-596cbee2669f@github.com> On Thu, 14 Mar 2024 09:14:04 GMT, Hamlin Li wrote: >> Hi, >> Can you help to review this patch? >> Thanks >> >> This is a continuation of work based on [1] by @XiaohongGong, most work was done in that pr. 
In this new pr, just rebased the code in [1], then added some minor changes (renaming of calling method, add libsleef as extra lib in CI cross-build on aarch64 in github workflow); I aslo tested the combination of following scenarios: >> * at build time >> * with/without sleef >> * with/without sve support >> * at runtime >> * with/without sleef >> * with/without sve support >> >> [1] https://github.com/openjdk/jdk/pull/16234 >> >> ## Regression Test >> * test/jdk/jdk/incubator/vector/ >> * test/hotspot/jtreg/compiler/vectorapi/ >> >> ## Performance Test >> Previously, @XiaohongGong has shared the data: https://github.com/openjdk/jdk/pull/16234#issuecomment-1767727028 > > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > fix variable name in github workflow > * at build time > > * with/without sleef > * with/without sve support What is the relevance of SVE support at build time? Should it matter what the build machine is? Its important to realize that almost no one except the JDK devs builds their own JDK, and having SLEEF dependencies at build time will mean that almost no one will use it. All this work you've done will be for nothing. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-1997718999 From aph at openjdk.org Thu Mar 14 15:32:40 2024 From: aph at openjdk.org (Andrew Haley) Date: Thu, 14 Mar 2024 15:32:40 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v2] In-Reply-To: References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> Message-ID: On Thu, 14 Mar 2024 11:20:26 GMT, Robbin Ehn wrote: > Hi, thanks for continuing with this. > > As this is similar to SVML, comments applies to x86 also: > > * There is no way to stop the VM from trying to load vmath ? > There is both a UseNeon and a UseSVE, if I set both to false I would expect no vector and no vmath. 
The JVM won't work without Neon. It's always there. Better to delete the flag.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-1997727704

From rkennke at openjdk.org Thu Mar 14 15:35:41 2024
From: rkennke at openjdk.org (Roman Kennke)
Date: Thu, 14 Mar 2024 15:35:41 GMT
Subject: RFR: 8328138: Optimize ArrayEquals on AArch64 & fix potential crash
In-Reply-To:
References:
Message-ID:

On Thu, 14 Mar 2024 06:21:56 GMT, Xiaowei Lu wrote:

> Current implementation of ArrayEquals on AArch64 is quite complex, due to the variety of checks about alignment, tail processing, bus locking and so on. However, modern Arm processors have eased such worries. Besides, we found a crash when using Lilliput. So we proposed to use a simple & straightforward flow of ArrayEquals.
> With this simplified ArrayEquals, we observed performance gains on the latest Arm platforms (Neoverse N1 & N2)
> Test case: org.openjdk.bench.java.util.ArraysEquals
>
> 1x vector length, 64-bit aligned array[0]
> | Test Case | N1 | N2 |
> |:----------------------:|:---------:|:---------:|
> | testByteFalseBeginning | -21.42% | -13.37% |
> | testByteFalseEnd | 25.79% | 27.45% |
> | testByteFalseMid | 16.64% | 16.46% |
> | testByteTrue | 12.39% | 24.66% |
> | testCharFalseBeginning | -5.27% | -3.08% |
> | testCharFalseEnd | 29.29% | 35.23% |
> | testCharFalseMid | 15.13% | 19.34% |
> | testCharTrue | 21.63% | 33.73% |
> | Total | 11.77% | 17.55% |
>
> A key factor is to decide when we should utilize simd in array equals. An aggressive choice is to enable simd as long as array length exceeds vector length (8 words). The corresponding result is shown above, from which we can see performance regression in both testBeginning cases. To avoid such perf impact, we can set simd threshold to 3x vector length.
>
> 3x vector length, 64-bit aligned array[0]
> | Test Case | N1 | N2 |
> |:----------------------:|:---------:|:---------:|
> | testByteFalseBeginning | 8.28% | 8.64% |
> | testByteFalseEnd | 6.38% | 12.29% |
> | testByteFalseMid | 6.17% | 7.96% |
> | testByteTrue | -10.08% | 3.06% |
> | testCharFalseBeginning | -1.42% | 7.23% |
> | testCharFalseEnd | 4.05% | 13.48% |
> | testCharFalseMid | 8.79% | 16.96% |
> | testCharTrue | -5.66% | 10.23% |
> | Total | 2.06% | 9.98% |
>
>
> In addition to perf improvement, we propose this patch to solve alignment issues in array equals. JDK-8139457 tries to relax alignment of array elements. On the other hand, this misalignment makes it an error to read the whole last word in array equals, in case that the array doesn't occupy the whole word and Lilliput is enabled. A detailed explanation quoted from [https://github.com/openjdk/jdk/pull/11044#issuecomment-1996771480](url)
>
>> The root cause is that default behavior of MacroAss...

Same numbers on a Graviton 3:

| Test Case | 1 | 2 | 3 | 4 |
| -- | -- | -- | -- | -- |
| testByteFalseBeginning | -17.72% | 11.98% | 14.92% | 15.38% |
| testByteFalseEnd | 43.48% | 10.67% | 9.97% | 10.05% |
| testByteFalseMid | 26.66% | 0.16% | -0.61% | -0.66% |
| testByteTrue | 29.84% | 2.06% | 2.61% | 3.70% |
| testCharFalseBeginning | 5.49% | 7.30% | -7.13% | -14.30% |
| testCharFalseEnd | 49.84% | 49.46% | 14.17% | 14.18% |
| testCharFalseMid | 30.88% | 29.38% | 13.75% | 14.38% |
| testCharTrue | 47.08% | 47.58% | 12.45% | 12.41% |

Looks better overall, vector-length * 2 looks best, IMO.
------------- PR Comment: https://git.openjdk.org/jdk/pull/18292#issuecomment-1997736276 From aph at openjdk.org Thu Mar 14 15:35:41 2024 From: aph at openjdk.org (Andrew Haley) Date: Thu, 14 Mar 2024 15:35:41 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v2] In-Reply-To: References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> Message-ID: On Thu, 14 Mar 2024 09:14:04 GMT, Hamlin Li wrote: >> Hi, >> Can you help to review this patch? >> Thanks >> >> This is a continuation of work based on [1] by @XiaohongGong, most work was done in that pr. In this new pr, just rebased the code in [1], then added some minor changes (renaming of calling method, add libsleef as extra lib in CI cross-build on aarch64 in github workflow); I aslo tested the combination of following scenarios: >> * at build time >> * with/without sleef >> * with/without sve support >> * at runtime >> * with/without sleef >> * with/without sve support >> >> [1] https://github.com/openjdk/jdk/pull/16234 >> >> ## Regression Test >> * test/jdk/jdk/incubator/vector/ >> * test/hotspot/jtreg/compiler/vectorapi/ >> >> ## Performance Test >> Previously, @XiaohongGong has shared the data: https://github.com/openjdk/jdk/pull/16234#issuecomment-1767727028 > > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > fix variable name in github workflow I'm running the tests, but I have no way to confirm that SLEEF or SVE is in use. Exactly what did you do, and how did you confirm it? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-1997735238 From tschatzl at openjdk.org Thu Mar 14 15:45:43 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 14 Mar 2024 15:45:43 GMT Subject: RFR: 8289822: G1: Make concurrent mark code owner of TAMSes [v2] In-Reply-To: References: Message-ID: On Thu, 14 Mar 2024 10:16:45 GMT, Albert Mingkun Yang wrote: >> Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains six commits: >> >> - Merge branch 'master' into 8289822-concurrent-mark-tams-owner >> - Do the mark information reset more fine grained (like before this change) to not potentially make HeapRegion::hr_clear() too slow >> - Fixes >> - some attempts to not have TAMSes always be updated for all regions >> - Remove _top_at_mark_start HeapRegion member >> - 8326781 >> >> initial version >> >> initial draft >> >> some improvmeents >> >> Make things work >> >> forgotten fix > > src/hotspot/share/gc/g1/g1ConcurrentMark.hpp line 569: > >> 567: >> 568: inline HeapWord* top_at_mark_start(const HeapRegion* r) const; >> 569: inline HeapWord* top_at_mark_start(uint region) const; > > Maybe call it `region_index` or sth alike? (In the impl of these methods, having `region` being the index is a bit confusing, IMO.) I can do that separately for all `region` indexes used in the file. Having both at the same time does not look good either. > src/hotspot/share/memory/iterator.hpp line 44: > >> 42: // The following classes are C++ `closures` for iterating over objects, roots and spaces >> 43: >> 44: class Closure : public CHeapObj { }; > > Why is this required (in this PR)? The changes need to allocate `G1CollectedHeap::_is_alive_closure_cm` when creating that argument for creating the concurrent reference processor. 
Previously the closure used `G1CollectedHeap` which is available when inlining it into the `G1CollectedHeap` instance, but now the closure needs `G1ConcurrentMark` which is not available at that time (i.e. null). The alternative would have been introducing something like an `initialize` method to that closure which I did not like. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18150#discussion_r1525101929 PR Review Comment: https://git.openjdk.org/jdk/pull/18150#discussion_r1525105894 From tschatzl at openjdk.org Thu Mar 14 15:50:43 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 14 Mar 2024 15:50:43 GMT Subject: RFR: 8289822: G1: Make concurrent mark code owner of TAMSes [v2] In-Reply-To: References: Message-ID: On Thu, 14 Mar 2024 10:15:53 GMT, Albert Mingkun Yang wrote: >> Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains six commits: >> >> - Merge branch 'master' into 8289822-concurrent-mark-tams-owner >> - Do the mark information reset more fine grained (like before this change) to not potentially make HeapRegion::hr_clear() too slow >> - Fixes >> - some attempts to not have TAMSes always be updated for all regions >> - Remove _top_at_mark_start HeapRegion member >> - 8326781 >> >> initial version >> >> initial draft >> >> some improvmeents >> >> Make things work >> >> forgotten fix > > src/hotspot/share/gc/g1/g1ConcurrentMark.hpp line 543: > >> 541: // Top pointer for each region at the start of marking. Must be valid for all committed >> 542: // regions. >> 543: HeapWord* volatile* _top_at_mark_starts; > > I wonder if one can use `HeapWord* volatile _top_at_mark_starts[];` to emphasize that it's an array. No, it would be interpreted as flexible array, which would need to be the last member of the class. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18150#discussion_r1525112656 From tschatzl at openjdk.org Thu Mar 14 16:11:55 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 14 Mar 2024 16:11:55 GMT Subject: RFR: 8289822: G1: Make concurrent mark code owner of TAMSes [v3] In-Reply-To: References: Message-ID: <6IEcdm9U9rusVXNP98m0-5VTuwwUkMiZUr8NuBS_Wak=.cee36f07-9106-4961-9d8a-9d489c0ae1a4@github.com> > Hi all, > > please review this change that moves TAMS ownership and handling out of `HeapRegion` into `G1ConcurrentMark`. > > The rationale is that it's intrinsically a data structure only used by the marking, and only interesting during marking, so attaching it to the always present `HeapRegion` isn't fitting. > > This also moves some code that is about marking decisions out of `HeapRegion`. > > In this change the TAMSes are still maintained and kept up to date as long as the `HeapRegion` exists (basically always) as the snapshotting of the marking does not really snapshot ("copy over the `HeapRegion`s that existed at the time of the snapshot, only ever touching them), but assumes that they exist even for uncommitted (or free) regions due to the way it advances the global `finger` pointer. > I did not want to change this here. > > This is also why `HeapRegion` still tells `G1ConcurrentMark` to update the TAMS in two places (when clearing a region and when resetting during full gc). > > Another option I considered when implementing this is clearing all marking statistics (`G1ConcurrentMark::clear_statistics`) at these places in `HeapRegion` because it decreases the amount of places where this needs to be done (maybe one or the other place isn't really necessary already). I rejected it because this would add some work that is dependent on the number of worker threads (particularly) into `HeapRegion::hr_clear()`. This change can be looked at by viewing at all but the last commit. 
> > Testing: tier1-4, gha > > Thanks, > Thomas Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: ayang review ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18150/files - new: https://git.openjdk.org/jdk/pull/18150/files/4c618dc7..f0afe153 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18150&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18150&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18150.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18150/head:pull/18150 PR: https://git.openjdk.org/jdk/pull/18150 From ayang at openjdk.org Thu Mar 14 17:49:38 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 14 Mar 2024 17:49:38 GMT Subject: RFR: 8289822: G1: Make concurrent mark code owner of TAMSes [v2] In-Reply-To: References: Message-ID: On Thu, 14 Mar 2024 15:43:21 GMT, Thomas Schatzl wrote: >> src/hotspot/share/memory/iterator.hpp line 44: >> >>> 42: // The following classes are C++ `closures` for iterating over objects, roots and spaces >>> 43: >>> 44: class Closure : public CHeapObj { }; >> >> Why is this required (in this PR)? > > The changes need to allocate `G1CollectedHeap::_is_alive_closure_cm` when creating that argument for creating the concurrent reference processor. Previously the closure used `G1CollectedHeap` which is available when inlining it into the `G1CollectedHeap` instance, but now the closure needs `G1ConcurrentMark` which is not available at that time (i.e. null). > The alternative would have been introducing something like an `initialize` method to that closure which I did not like. OK, could `G1CMIsAliveClosure::do_object_b` retrieve `_cm` via `_g1h`? (It's obvious why `Closure` is part of mtGC here.) 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18150#discussion_r1525283735 From aph at openjdk.org Thu Mar 14 19:02:04 2024 From: aph at openjdk.org (Andrew Haley) Date: Thu, 14 Mar 2024 19:02:04 GMT Subject: RFR: JDK-8180450: secondary_super_cache does not scale well Message-ID: This PR is a redesign of subtype checking. The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. So what's changed, so that the old design should be replaced? Firstly, the computers of today aren't the computers of twenty years ago. It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces). Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch. However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions. The solution ------------ We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. 
The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bitmap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. If the bit is set, we have to do a little arithmetic and then consult the hash table. It works like this:

    mov sub_klass, [& sub_klass->_bitmap]
    mov array_index, sub_klass
    shl array_index, ?
    jns not_found
    popcnt array_index,array_index
    mov array_base, [& sub_klass->_secondary_supers]
    cmp super_klass, [array_base+array_index*8]
    je found

The popcount instruction returns the cardinality of a bitset. By shifting out the bits higher in number than the hash code of the element we're looking for, we leave only the lower bits. `popcount`, then, gives us the index of the element we're looking for. If we don't get a match at the first attempt, we test the next bit in the bitset and jump to a fallback stub:

    bt sub_klass, ?
    jae not_found
    ror sub_klass, ?
    call Stub::klass_subtype_fallback
    je found

Collisions are rare. Vladimir Ivanov did a survey of Java benchmark code, and it is very rare to see more than about 20 super-interfaces, and even that gives us a collision rate of only about 0.25. The time taken for a positive lookup is somewhere between 3 - 6 cycles, or about 0.9 - 1.8 ns. This is a robust figure, confirmed across current AArch64 and x86 designs, and this rate can be sustained indefinitely.
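For readers who prefer portable code to assembly, the bitmap-plus-popcount lookup described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the PR: `PackedSupers`, `hash_slot`, `build` and `contains` are hypothetical names, small integer ids stand in for Klass pointers, the hash is a placeholder, and the wrap-around probing loop replaces the fallback stub.

```cpp
#include <array>
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the per-class hash code (0..63); not the VM's hash.
inline unsigned hash_slot(std::uint32_t id) { return id & 63u; }

struct PackedSupers {
    std::uint64_t bitmap = 0;           // bit i set => slot i of the 64-way table is occupied
    std::vector<std::uint32_t> packed;  // occupied slots only, stored in slot order
};

// Fill a 64-slot table using linear probing, then squeeze out the empty slots.
// Assumes the ids are distinct; more than 64 ids cannot fit in this sketch.
inline PackedSupers build(const std::vector<std::uint32_t>& ids) {
    if (ids.size() > 64) throw std::length_error("sketch handles at most 64 entries");
    std::array<std::uint32_t, 64> table{};
    std::uint64_t bitmap = 0;
    for (std::uint32_t id : ids) {
        unsigned slot = hash_slot(id);
        while (bitmap & (std::uint64_t{1} << slot)) slot = (slot + 1) & 63u;  // probe onwards
        table[slot] = id;
        bitmap |= std::uint64_t{1} << slot;
    }
    PackedSupers result;
    result.bitmap = bitmap;
    for (unsigned slot = 0; slot < 64; ++slot) {
        if (bitmap & (std::uint64_t{1} << slot)) result.packed.push_back(table[slot]);
    }
    return result;
}

inline bool contains(const PackedSupers& s, std::uint32_t id) {
    unsigned slot = hash_slot(id);
    for (unsigned probes = 0; probes < 64; ++probes) {
        const std::uint64_t bit = std::uint64_t{1} << slot;
        if ((s.bitmap & bit) == 0) return false;  // clear bit: definitely absent
        // The number of occupied slots below this one is the index into the packed array.
        const std::size_t index = std::bitset<64>(s.bitmap & (bit - 1)).count();
        assert(index < s.packed.size());
        if (s.packed[index] == id) return true;
        slot = (slot + 1) & 63u;  // collision: try the next slot
    }
    return false;
}
```

In this sketch a negative lookup usually costs one bit test, and a positive lookup without a collision costs one bit test, one population count and one compare, which mirrors the shape of the instruction sequence above.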
Negative lookups are slightly faster because there's usually no need to consult the secondary super cache, at about 3 - 5 cycles. The current secondary super cache lookup is usually slightly faster than a hash table for positive lookups, at about 3 - 4 cycles, but it performs badly with negative lookups, and unless the class you're looking for is in the secondary super cache it performs badly as well. For example, a negative lookup in a class with 4 interfaces takes 10 - 47 cycles. Limitations and disadvantages ----------------------------- This PR is a subset of what is possible. It is only implemented for C2, in the cases where the superclass is a constant, known at compile time. Given that this is almost all uses of subtype checking, it's not really a restriction. I would have liked to save the hashed interfaces list in the CDS archive file, but there are test cases for the Serviceability Agent that assume the interfaces are in the order in which they are declared. I think the tests are probably just wrong about this. For the same reason, I have to make copies of the interfaces array and sort the copies, rather than just using the same array at runtime. I haven't removed the secondary super cache. It's still used by the interpreter and C2. In the future I'd like to delete the secondary super cache, but it is used in many places across the VM. Today is not that day. Performance testing ------------------- Hashing the secondary supers arrays takes a little time. I've added a perf counter for this, so you can see. It's only really a few milliseconds added to startup. I have run Renaissance and SPECjvm, and whatever speed differences there may be are immeasurably small, down in the noise. The benchmark in this class allows you to isolate single instanceof invocations with various sets of secondary superclasses. Finally ------- Vladimir Ivanov was very generous with his time and his advice. 
He explained the problem and went into detail about his experiments, and shared with me his experimental code. This saved me a great deal of time in this work. ------------- Commit messages: - Copyright dates - Merge branch 'JDK-8180450' of https://github.com/theRealAph/jdk into JDK-8180450 - JDK-8180450: secondary_super_cache does not scale well - JDK-8180450: secondary_super_cache does not scale well - JDK-8180450: secondary_super_cache does not scale well - Merge branch 'JDK-8180450' of https://github.com/theRealAph/jdk into JDK-8180450 - JDK-8180450: secondary_super_cache does not scale well - Comment typo - JDK-8180450: secondary_super_cache does not scale well - JDK-8180450: secondary_super_cache does not scale well - ... and 38 more: https://git.openjdk.org/jdk/compare/2482a505...a0677b71 Changes: https://git.openjdk.org/jdk/pull/18309/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18309&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8180450 Stats: 1314 lines in 29 files changed: 1288 ins; 0 del; 26 mod Patch: https://git.openjdk.org/jdk/pull/18309.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18309/head:pull/18309 PR: https://git.openjdk.org/jdk/pull/18309 From tschatzl at openjdk.org Thu Mar 14 19:22:38 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 14 Mar 2024 19:22:38 GMT Subject: RFR: 8289822: G1: Make concurrent mark code owner of TAMSes [v2] In-Reply-To: References: Message-ID: On Thu, 14 Mar 2024 17:46:46 GMT, Albert Mingkun Yang wrote: >> The changes need to allocate `G1CollectedHeap::_is_alive_closure_cm` when creating that argument for creating the concurrent reference processor. Previously the closure used `G1CollectedHeap` which is available when inlining it into the `G1CollectedHeap` instance, but now the closure needs `G1ConcurrentMark` which is not available at that time (i.e. null). 
>> The alternative would have been introducing something like an `initialize` method to that closure which I did not like. > > OK, could `G1CMIsAliveClosure::do_object_b` retrieve `_cm` via `_g1h`? (It's obvious why `Closure` is part of mtGC here.) At the cost of an additional indirection per reference. I would prefer to just have an `initialize` method then. You are right that just using `mtGC` is not appropriate here as it is used in non-GC code too. Is an `initialize` method okay with you then? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18150#discussion_r1525382122 From jzhu at openjdk.org Fri Mar 15 03:11:05 2024 From: jzhu at openjdk.org (Joshua Zhu) Date: Fri, 15 Mar 2024 03:11:05 GMT Subject: RFR: 8326541: [AArch64] ZGC C2 load barrier stub considers the length of live registers when spilling registers [v3] In-Reply-To: References: Message-ID: > Currently ZGC C2 load barrier stub saves the whole live register regardless of what size of register is live on aarch64. > Considering the size of SVE register is an implementation-defined multiple of 128 bits, up to 2048 bits, > even the use of a floating point may cause the maximum 2048 bits stack occupied. > Hence I would like to introduce this change on aarch64: take the length of live registers into consideration in ZGC C2 load barrier stub. > > In a floating point case on 2048 bits SVE machine, the following ZLoadBarrierStubC2 > > > ...... 
> 0x0000ffff684cfad8: stp x15, x18, [sp, #80] > 0x0000ffff684cfadc: sub sp, sp, #0x100 > 0x0000ffff684cfae0: str z16, [sp] > 0x0000ffff684cfae4: add x1, x13, #0x10 > 0x0000ffff684cfae8: mov x0, x16 > ;; 0xFFFF803F5414 > 0x0000ffff684cfaec: mov x8, #0x5414 // #21524 > 0x0000ffff684cfaf0: movk x8, #0x803f, lsl #16 > 0x0000ffff684cfaf4: movk x8, #0xffff, lsl #32 > 0x0000ffff684cfaf8: blr x8 > 0x0000ffff684cfafc: mov x16, x0 > 0x0000ffff684cfb00: ldr z16, [sp] > 0x0000ffff684cfb04: add sp, sp, #0x100 > 0x0000ffff684cfb08: ptrue p7.b > 0x0000ffff684cfb0c: ldp x4, x5, [sp, #16] > ...... > > > could be optimized into: > > > ...... > 0x0000ffff684cfa50: stp x15, x18, [sp, #80] > 0x0000ffff684cfa54: str d16, [sp, #-16]! // extra 8 bytes to align 16 bytes in push_fp() > 0x0000ffff684cfa58: add x1, x13, #0x10 > 0x0000ffff684cfa5c: mov x0, x16 > ;; 0xFFFF7FA942A8 > 0x0000ffff684cfa60: mov x8, #0x42a8 // #17064 > 0x0000ffff684cfa64: movk x8, #0x7fa9, lsl #16 > 0x0000ffff684cfa68: movk x8, #0xffff, lsl #32 > 0x0000ffff684cfa6c: blr x8 > 0x0000ffff684cfa70: mov x16, x0 > 0x0000ffff684cfa74: ldr d16, [sp], #16 > 0x0000ffff684cfa78: ptrue p7.b > 0x0000ffff684cfa7c: ldp x4, x5, [sp, #16] > ...... > > > Besides the above benefit, when we know what size of register is live, > we could remove the unnecessary caller save in ZGC C2 load barrier stub when we meet C-ABI SOE fp registers. > > Passed jtreg with option "-XX:+UseZGC -XX:+ZGenerational" with no failures introduced. 
Joshua Zhu has updated the pull request incrementally with one additional commit since the last revision: change jtreg test case name ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17977/files - new: https://git.openjdk.org/jdk/pull/17977/files/28ff7d2a..382866f7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17977&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17977&range=01-02 Stats: 726 lines in 2 files changed: 363 ins; 363 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/17977.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17977/head:pull/17977 PR: https://git.openjdk.org/jdk/pull/17977 From rrich at openjdk.org Fri Mar 15 08:37:38 2024 From: rrich at openjdk.org (Richard Reingruber) Date: Fri, 15 Mar 2024 08:37:38 GMT Subject: RFR: 8327990: [macosx-aarch64] JFR enters VM without WXWrite In-Reply-To: References: Message-ID: On Thu, 14 Mar 2024 14:49:12 GMT, Matthias Baesken wrote: >> This pr changes `JfrJvmtiAgent::retransform_classes()` and `jfr_set_enabled` to switch to `WXWrite` before transitioning to the vm. >> >> Testing: >> make test TEST=jdk/jfr/event/runtime/TestClassLoadEvent.java TEST_VM_OPTS=-XX:+AssertWXAtThreadSync >> make test TEST=compiler/intrinsics/klass/CastNullCheckDroppingsTest.java TEST_VM_OPTS=-XX:+AssertWXAtThreadSync >> >> More tests are pending. > > I noticed a few more asserts (assert(_wx_state == expected) failed: wrong state) in the jfr area (jdk tier3 jfr tests). > E.g. 
> > > V [libjvm.dylib+0x8a5d94] JavaThread::check_for_valid_safepoint_state()+0x0 > V [libjvm.dylib+0x3e95b4] ThreadStateTransition::transition_from_native(JavaThread*, JavaThreadState, bool)+0x174 > V [libjvm.dylib+0x3e93d0] ThreadInVMfromNative::ThreadInVMfromNative(JavaThread*)+0x70 > V [libjvm.dylib+0x91a578] JfrRecorderService::emit_leakprofiler_events(long long, bool, bool)+0xcc > > > > and > > > V [libjvm.dylib+0x8a5d94] JavaThread::check_for_valid_safepoint_state()+0x0 > V [libjvm.dylib+0x3e95b4] ThreadStateTransition::transition_from_native(JavaThread*, JavaThreadState, bool)+0x174 > V [libjvm.dylib+0x3e93d0] ThreadInVMfromNative::ThreadInVMfromNative(JavaThread*)+0x70 > V [libjvm.dylib+0x8d7f74] JfrJavaEventWriter::flush(_jobject*, int, int, JavaThread*)+0xf8 > j jdk.jfr.internal.JVM.flush(Ljdk/jfr/internal/event/EventWriter;II)V+0 jdk.jfr at 23-internal > j jdk.jfr.internal.event.EventWriter.flush(II)V+3 jdk.jfr at 23-internal > > @MBaesken found 2 more locations in jvmti that need switching to `WXWrite` > > ``` > > JvmtiExport::get_jvmti_interface > > GetCarrierThread > > ``` > > Both use `ThreadInVMfromNative`. > > Should we address those 2 more findings in this PR ? Or open a separate JBS issue ? > I'm leaning towards fixing all locations in this PR even though this will prevent clean backports. Would that be ok? ------------- PR Comment: https://git.openjdk.org/jdk/pull/18238#issuecomment-1999168860 From mbaesken at openjdk.org Fri Mar 15 08:43:38 2024 From: mbaesken at openjdk.org (Matthias Baesken) Date: Fri, 15 Mar 2024 08:43:38 GMT Subject: RFR: 8327990: [macosx-aarch64] JFR enters VM without WXWrite In-Reply-To: References: Message-ID: On Thu, 14 Mar 2024 14:49:12 GMT, Matthias Baesken wrote: >> This pr changes `JfrJvmtiAgent::retransform_classes()` and `jfr_set_enabled` to switch to `WXWrite` before transitioning to the vm. 
>> >> Testing: >> make test TEST=jdk/jfr/event/runtime/TestClassLoadEvent.java TEST_VM_OPTS=-XX:+AssertWXAtThreadSync >> make test TEST=compiler/intrinsics/klass/CastNullCheckDroppingsTest.java TEST_VM_OPTS=-XX:+AssertWXAtThreadSync >> >> More tests are pending. > > I noticed a few more asserts (assert(_wx_state == expected) failed: wrong state) in the jfr area (jdk tier3 jfr tests). > E.g. > > > V [libjvm.dylib+0x8a5d94] JavaThread::check_for_valid_safepoint_state()+0x0 > V [libjvm.dylib+0x3e95b4] ThreadStateTransition::transition_from_native(JavaThread*, JavaThreadState, bool)+0x174 > V [libjvm.dylib+0x3e93d0] ThreadInVMfromNative::ThreadInVMfromNative(JavaThread*)+0x70 > V [libjvm.dylib+0x91a578] JfrRecorderService::emit_leakprofiler_events(long long, bool, bool)+0xcc > > > > and > > > V [libjvm.dylib+0x8a5d94] JavaThread::check_for_valid_safepoint_state()+0x0 > V [libjvm.dylib+0x3e95b4] ThreadStateTransition::transition_from_native(JavaThread*, JavaThreadState, bool)+0x174 > V [libjvm.dylib+0x3e93d0] ThreadInVMfromNative::ThreadInVMfromNative(JavaThread*)+0x70 > V [libjvm.dylib+0x8d7f74] JfrJavaEventWriter::flush(_jobject*, int, int, JavaThread*)+0xf8 > j jdk.jfr.internal.JVM.flush(Ljdk/jfr/internal/event/EventWriter;II)V+0 jdk.jfr at 23-internal > j jdk.jfr.internal.event.EventWriter.flush(II)V+3 jdk.jfr at 23-internal > > > @MBaesken found 2 more locations in jvmti that need switching to `WXWrite` > > > ``` > > > JvmtiExport::get_jvmti_interface > > > GetCarrierThread > > > ``` > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Both use `ThreadInVMfromNative`. > > > > > > Should we address those 2 more findings in this PR ? Or open a separate JBS issue ? > > I'm leaning towards fixing all locations in this PR even though this will prevent clean backports. Would that be ok? I think this is ok. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18238#issuecomment-1999177986 From mbaesken at openjdk.org Fri Mar 15 08:49:38 2024 From: mbaesken at openjdk.org (Matthias Baesken) Date: Fri, 15 Mar 2024 08:49:38 GMT Subject: RFR: 8327990: [macosx-aarch64] JFR enters VM without WXWrite In-Reply-To: References: Message-ID: <8z-3qU5ADeGlrJd_DQ3H3kWItWG1DP8a-Xw-lbJOYaw=.b158ceec-94ea-4000-afe1-0083357deda3@github.com> On Tue, 12 Mar 2024 15:05:00 GMT, Richard Reingruber wrote: > This pr changes `JfrJvmtiAgent::retransform_classes()` and `jfr_set_enabled` to switch to `WXWrite` before transitioning to the vm. > > Testing: > make test TEST=jdk/jfr/event/runtime/TestClassLoadEvent.java TEST_VM_OPTS=-XX:+AssertWXAtThreadSync > make test TEST=compiler/intrinsics/klass/CastNullCheckDroppingsTest.java TEST_VM_OPTS=-XX:+AssertWXAtThreadSync > > More tests are pending. `JfrRecorderService::emit_leakprofiler_events` (src/hotspot/share/jfr/recorder/service/jfrRecorderService.cpp ) and `JfrJavaEventWriter::flush` (src/hotspot/share/jfr/writers/jfrJavaEventWriter.cpp) might need adjustment too (see the other findings I posted yesterday). ------------- PR Comment: https://git.openjdk.org/jdk/pull/18238#issuecomment-1999185059 From aph at openjdk.org Fri Mar 15 09:53:37 2024 From: aph at openjdk.org (Andrew Haley) Date: Fri, 15 Mar 2024 09:53:37 GMT Subject: RFR: 8328138: Optimize ArrayEquals on AArch64 & fix potential crash In-Reply-To: References: Message-ID: On Thu, 14 Mar 2024 06:21:56 GMT, Xiaowei Lu wrote: > Current implementation of ArrayEquals on AArch64 is quite complex, due to the variety of checks about alignment, tail processing, bus locking and so on. However, Modern Arm processors have eased such worries. Besides, we found crash when using lilliput. So we proposed to use a simple&straightforward flow of ArrayEquals. 
> With this simplified ArrayEquals, we observed performance gains on the latest Arm platforms (Neoverse N1 and N2).
> Test case: org.openjdk.bench.java.util.ArraysEquals
>
> 1x vector length, 64-bit aligned array[0]
> | Test Case              | N1        | N2        |
> |:----------------------:|:---------:|:---------:|
> | testByteFalseBeginning | -21.42%   | -13.37%   |
> | testByteFalseEnd       | 25.79%    | 27.45%    |
> | testByteFalseMid       | 16.64%    | 16.46%    |
> | testByteTrue           | 12.39%    | 24.66%    |
> | testCharFalseBeginning | -5.27%    | -3.08%    |
> | testCharFalseEnd       | 29.29%    | 35.23%    |
> | testCharFalseMid       | 15.13%    | 19.34%    |
> | testCharTrue           | 21.63%    | 33.73%    |
> | Total                  | 11.77%    | 17.55%    |
>
> A key factor is deciding when we should use SIMD in array equals. An aggressive choice is to enable SIMD as soon as the array length exceeds the vector length (8 words). The corresponding result is shown above, from which we can see a performance regression in both FalseBeginning cases. To avoid this impact, we can set the SIMD threshold to 3x vector length.
>
> 3x vector length, 64-bit aligned array[0]
> | Test Case              | N1        | N2        |
> |:----------------------:|:---------:|:---------:|
> | testByteFalseBeginning | 8.28%     | 8.64%     |
> | testByteFalseEnd       | 6.38%     | 12.29%    |
> | testByteFalseMid       | 6.17%     | 7.96%     |
> | testByteTrue           | -10.08%   | 3.06%     |
> | testCharFalseBeginning | -1.42%    | 7.23%     |
> | testCharFalseEnd       | 4.05%     | 13.48%    |
> | testCharFalseMid       | 8.79%     | 16.96%    |
> | testCharTrue           | -5.66%    | 10.23%    |
> | Total                  | 2.06%     | 9.98%     |
>
> In addition to the performance improvement, we propose this patch to solve alignment issues in array equals. JDK-8139457 tries to relax the alignment of array elements. On the other hand, this misalignment makes it an error to read the whole last word in array equals when the array doesn't occupy the whole word and Lilliput is enabled. A detailed explanation is quoted from https://github.com/openjdk/jdk/pull/11044#issuecomment-1996771480
>
>> The root cause is that default behavior of MacroAss...
I'm not entirely convinced this will perform better in real code. There's a lot of conditional branches at the end, and these need only to be used in the case of a very short array. ArraysEquals repeatedly tests the same array, leading to perfect branch prediction. In the real world this doesn't happen, so the test may be misleading. Array comparison, in the case of anything except a very short array, can be done by a sequence of wide loads, with a single (probably misaligned) load at the end instead of that sequence of conditional branches. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18292#issuecomment-1999300014 From mli at openjdk.org Fri Mar 15 11:31:08 2024 From: mli at openjdk.org (Hamlin Li) Date: Fri, 15 Mar 2024 11:31:08 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v3] In-Reply-To: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> Message-ID: > Hi, > Can you help to review this patch? > Thanks > > This is a continuation of work based on [1] by @XiaohongGong, most work was done in that pr. 
In this new PR, I just rebased the code in [1], then added some minor changes (renamed the calling method, added libsleef as an extra lib in the CI cross-build on aarch64 in the GitHub workflow); I also tested the combination of the following scenarios: > * at build time > * with/without sleef > * with/without sve support > * at runtime > * with/without sleef > * with/without sve support > > [1] https://github.com/openjdk/jdk/pull/16234 > > ## Regression Test > * test/jdk/jdk/incubator/vector/ > * test/hotspot/jtreg/compiler/vectorapi/ > > ## Performance Test > Previously, @XiaohongGong has shared the data: https://github.com/openjdk/jdk/pull/16234#issuecomment-1767727028 Hamlin Li has updated the pull request incrementally with two additional commits since the last revision: - rename - resolve magicus's comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18294/files - new: https://git.openjdk.org/jdk/pull/18294/files/668d08ce..7b7ec312 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18294&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18294&range=01-02 Stats: 63 lines in 7 files changed: 0 ins; 39 del; 24 mod Patch: https://git.openjdk.org/jdk/pull/18294.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18294/head:pull/18294 PR: https://git.openjdk.org/jdk/pull/18294 From mli at openjdk.org Fri Mar 15 11:31:08 2024 From: mli at openjdk.org (Hamlin Li) Date: Fri, 15 Mar 2024 11:31:08 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v2] In-Reply-To: References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> Message-ID: <4qYww49qPUHPd_x-Lqw9aoDYZTf1OYbkH8ldw_NNRxU=.bce40b28-b292-43e2-91a1-d2432e8c2057@github.com> On Thu, 14 Mar 2024 12:23:17 GMT, Magnus Ihse Bursie wrote: >> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: >> >> fix variable name in github workflow > >
src/jdk.incubator.vector/linux/native/libvmath/vect_math.h line 1: > >> 1: /* > > I'd just like to raise the question of naming. Right now the terms "vmath", "vect_math" and "vector_math" seem to be used interchangeably. I think it would be good to standardize on one name, and my suggestion is to go with the complete name -- that fits with a general theme in Java. It has the benefit that if there are many ways to abbreviate a term, there is only one way to not abbreviate it. > > On the other hand, using `_` in library names is not really that common, so I'd suggest this file should be named `libvectormath/vector_math.h`. (And correspondingly for the other files.) Thanks for the review and suggestions. All fixed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18294#discussion_r1526144596 From mli at openjdk.org Fri Mar 15 11:31:08 2024 From: mli at openjdk.org (Hamlin Li) Date: Fri, 15 Mar 2024 11:31:08 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v2] In-Reply-To: References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> Message-ID: On Thu, 14 Mar 2024 15:29:51 GMT, Andrew Haley wrote: > Hi, thanks for continuing with this. Thanks for the comments > As this is similar to SVML, comments apply to x86 also: > > * There is no way to stop the VM from trying to load vmath ? No official way, but deleting libvmath.so will have the same effect. I'm not sure whether avoiding loading vmath is necessary for typical users; if it turns out to be, we can add it later. > There is both a UseNeon and a UseSVE, if I set both to false I would expect no vector and no vmath. > The issue with UseNeon not really guarding neon code, but just crc, seems like a pre-existing bug.
> A flag like 'UseNeon' should turn it on/off :) I think Andrew asked this question, I agree it's better to remove the flag usage in crc32 on aarch64, better to be done in a separate [pr](https://bugs.openjdk.org/browse/JDK-8328265). > * Doing open dll form stubrountines seems wrong. I agree, logically it brings more clear code structure. But as this will be common change (e.g. on both x86 and aarch64) in the stub initialization steps, , and I think we'd better do it in a seprate [pr](https://bugs.openjdk.org/browse/JDK-8328266). ------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-1999457249 From tschatzl at openjdk.org Fri Mar 15 11:34:06 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Fri, 15 Mar 2024 11:34:06 GMT Subject: RFR: 8289822: G1: Make concurrent mark code owner of TAMSes [v4] In-Reply-To: References: Message-ID: > Hi all, > > please review this change that moves TAMS ownership and handling out of `HeapRegion` into `G1ConcurrentMark`. > > The rationale is that it's intrinsically a data structure only used by the marking, and only interesting during marking, so attaching it to the always present `HeapRegion` isn't fitting. > > This also moves some code that is about marking decisions out of `HeapRegion`. > > In this change the TAMSes are still maintained and kept up to date as long as the `HeapRegion` exists (basically always) as the snapshotting of the marking does not really snapshot ("copy over the `HeapRegion`s that existed at the time of the snapshot, only ever touching them), but assumes that they exist even for uncommitted (or free) regions due to the way it advances the global `finger` pointer. > I did not want to change this here. > > This is also why `HeapRegion` still tells `G1ConcurrentMark` to update the TAMS in two places (when clearing a region and when resetting during full gc). 
> > Another option I considered when implementing this is clearing all marking statistics (`G1ConcurrentMark::clear_statistics`) at these places in `HeapRegion` because it decreases the amount of places where this needs to be done (maybe one or the other place isn't really necessary already). I rejected it because this would add some work that is dependent on the number of worker threads (particularly) into `HeapRegion::hr_clear()`. This change can be looked at by viewing at all but the last commit. > > Testing: tier1-4, gha > > Thanks, > Thomas Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: Use initialize method instead of dynamic allocation for G1CMIsAliveClosure. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18150/files - new: https://git.openjdk.org/jdk/pull/18150/files/f0afe153..652330cd Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18150&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18150&range=02-03 Stats: 16 lines in 5 files changed: 11 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/18150.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18150/head:pull/18150 PR: https://git.openjdk.org/jdk/pull/18150 From mli at openjdk.org Fri Mar 15 11:37:39 2024 From: mli at openjdk.org (Hamlin Li) Date: Fri, 15 Mar 2024 11:37:39 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v2] In-Reply-To: <--oIyncVgEWC58CJipBbISrc1JiX1dr2nfZs8pDCOao=.5b4f156c-6207-4480-9408-596cbee2669f@github.com> References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <--oIyncVgEWC58CJipBbISrc1JiX1dr2nfZs8pDCOao=.5b4f156c-6207-4480-9408-596cbee2669f@github.com> Message-ID: On Thu, 14 Mar 2024 15:25:54 GMT, Andrew Haley wrote: > > ``` > > * at build time > > > > * with/without sleef > > * with/without sve support > > ``` > > What is the relevance of SVE support at build time? 
Should it matter what the build machine is? > > It's important to realize that almost no one except the JDK devs builds their own JDK, and having SLEEF dependencies at build time will mean that almost no one will use it. All this work you've done will be for nothing. I agree. On the other hand, I see there was already a discussion about this, and the current solution got some points: https://github.com/openjdk/jdk/pull/16234#issuecomment-1836266314, https://github.com/openjdk/jdk/pull/16234#issuecomment-1836449876 So maybe we could go with this solution first, and, as an incremental optimization, introduce dynamic loading at runtime without a dependency on the bridge functions in libvmath. What do you think? > I'm running the tests, but I have no way to confirm that SLEEF or SVE is in use. Exactly what did you do, and how did you confirm it? Thanks for running the tests. I just turned on some logging and checked the output. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-1999468059 PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-1999470911 From mli at openjdk.org Fri Mar 15 11:37:39 2024 From: mli at openjdk.org (Hamlin Li) Date: Fri, 15 Mar 2024 11:37:39 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v2] In-Reply-To: References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> Message-ID: On Thu, 14 Mar 2024 15:29:51 GMT, Andrew Haley wrote: > > Hi, thanks for continuing with this. > > As this is similar to SVML, comments apply to x86 also: > > ``` > > * There is no way to stop the VM from trying to load vmath ? > > There is both a UseNeon and a UseSVE, if I set both to false I would expect no vector and no vmath. > > ``` > > The JVM won't work without Neon. It's always there. Better to delete the flag. Sure, will track this in a separate [pr](https://bugs.openjdk.org/browse/JDK-8328265).
------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-1999469411 From ayang at openjdk.org Fri Mar 15 11:44:40 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 15 Mar 2024 11:44:40 GMT Subject: RFR: 8289822: G1: Make concurrent mark code owner of TAMSes [v4] In-Reply-To: References: Message-ID: On Fri, 15 Mar 2024 11:34:06 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this change that moves TAMS ownership and handling out of `HeapRegion` into `G1ConcurrentMark`. >> >> The rationale is that it's intrinsically a data structure only used by the marking, and only interesting during marking, so attaching it to the always present `HeapRegion` isn't fitting. >> >> This also moves some code that is about marking decisions out of `HeapRegion`. >> >> In this change the TAMSes are still maintained and kept up to date as long as the `HeapRegion` exists (basically always) as the snapshotting of the marking does not really snapshot ("copy over the `HeapRegion`s that existed at the time of the snapshot, only ever touching them), but assumes that they exist even for uncommitted (or free) regions due to the way it advances the global `finger` pointer. >> I did not want to change this here. >> >> This is also why `HeapRegion` still tells `G1ConcurrentMark` to update the TAMS in two places (when clearing a region and when resetting during full gc). >> >> Another option I considered when implementing this is clearing all marking statistics (`G1ConcurrentMark::clear_statistics`) at these places in `HeapRegion` because it decreases the amount of places where this needs to be done (maybe one or the other place isn't really necessary already). I rejected it because this would add some work that is dependent on the number of worker threads (particularly) into `HeapRegion::hr_clear()`. This change can be looked at by viewing at all but the last commit. 
>> >> Testing: tier1-4, gha >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > Use initialize method instead of dynamic allocation for G1CMIsAliveClosure. src/hotspot/share/memory/iterator.hpp line 44: > 42: // The following classes are C++ `closures` for iterating over objects, roots and spaces > 43: > 44: class Closure { }; Why is this required? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18150#discussion_r1526158942 From ihse at openjdk.org Fri Mar 15 11:58:39 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Fri, 15 Mar 2024 11:58:39 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v3] In-Reply-To: References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> Message-ID: <1wNmKEbi_Jk4iTxFIQKqg4RB2B-H2Lg_2hQFrnbV3a8=.29891519-8217-4ac1-aa32-aaabe103e527@github.com> On Fri, 15 Mar 2024 11:31:08 GMT, Hamlin Li wrote: >> Hi, >> Can you help to review this patch? >> Thanks >> >> This is a continuation of work based on [1] by @XiaohongGong, most work was done in that pr. 
In this new PR, I just rebased the code in [1], then added some minor changes (renamed the calling method, added libsleef as an extra lib in the CI cross-build on aarch64 in the GitHub workflow); I also tested the combination of the following scenarios: >> * at build time >> * with/without sleef >> * with/without sve support >> * at runtime >> * with/without sleef >> * with/without sve support >> >> [1] https://github.com/openjdk/jdk/pull/16234 >> >> ## Regression Test >> * test/jdk/jdk/incubator/vector/ >> * test/hotspot/jtreg/compiler/vectorapi/ >> >> ## Performance Test >> Previously, @XiaohongGong has shared the data: https://github.com/openjdk/jdk/pull/16234#issuecomment-1767727028 > > Hamlin Li has updated the pull request incrementally with two additional commits since the last revision: > > - rename > - resolve magicus's comments src/jdk.incubator.vector/linux/native/libvectormath/vector_math_neon.c line 25: > 23: > 24: #include > 25: #include You should not include the "_md" (machine dependent) part of jni directly. Suggestion: #include src/jdk.incubator.vector/linux/native/libvectormath/vector_math_sve.c line 25: > 23: > 24: #include > 25: #include Same here. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18294#discussion_r1526173179 PR Review Comment: https://git.openjdk.org/jdk/pull/18294#discussion_r1526174098 From sjayagond at openjdk.org Fri Mar 15 12:11:06 2024 From: sjayagond at openjdk.org (Sidraya Jayagond) Date: Fri, 15 Mar 2024 12:11:06 GMT Subject: RFR: 8327652: S390x: Implements SLP support [v2] In-Reply-To: References: Message-ID: > This PR Adds SIMD support on s390x.
Sidraya Jayagond has updated the pull request incrementally with one additional commit since the last revision: Address review comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18162/files - new: https://git.openjdk.org/jdk/pull/18162/files/00b90ab8..8bb4bc33 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18162&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18162&range=00-01 Stats: 23 lines in 4 files changed: 3 ins; 12 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/18162.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18162/head:pull/18162 PR: https://git.openjdk.org/jdk/pull/18162 From sjayagond at openjdk.org Fri Mar 15 12:15:43 2024 From: sjayagond at openjdk.org (Sidraya Jayagond) Date: Fri, 15 Mar 2024 12:15:43 GMT Subject: RFR: 8327652: S390x: Implements SLP support [v2] In-Reply-To: References: Message-ID: On Fri, 15 Mar 2024 12:11:06 GMT, Sidraya Jayagond wrote: >> This PR Adds SIMD support on s390x. > > Sidraya Jayagond has updated the pull request incrementally with one additional commit since the last revision: > > Address review comments @RealLucy and @TheRealMDoerr please review this PR. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18162#issuecomment-1999538755 From sjayagond at openjdk.org Fri Mar 15 12:15:44 2024 From: sjayagond at openjdk.org (Sidraya Jayagond) Date: Fri, 15 Mar 2024 12:15:44 GMT Subject: RFR: 8327652: S390x: Implements SLP support [v2] In-Reply-To: <89gCu4oPhDKz2-GfNWeWtJ0vXyTkmHgXhv6mVX6C72Q=.f706431a-83c4-4cce-8e1a-214779d49985@github.com> References: <89gCu4oPhDKz2-GfNWeWtJ0vXyTkmHgXhv6mVX6C72Q=.f706431a-83c4-4cce-8e1a-214779d49985@github.com> Message-ID: On Fri, 8 Mar 2024 05:41:25 GMT, Amit Kumar wrote: >> Sidraya Jayagond has updated the pull request incrementally with one additional commit since the last revision: >> >> Address review comments > > src/hotspot/share/adlc/output_c.cpp line 2366: > >> 2364: if (strcmp(rep_var,"$VectorSRegister") == 0) return "as_VectorSRegister"; >> 2365: #endif >> 2366: #if defined(S390) > > can't we use `#elif` here ? Could be changed. I would keep it as is for now. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1526199731 From aph at openjdk.org Fri Mar 15 13:15:59 2024 From: aph at openjdk.org (Andrew Haley) Date: Fri, 15 Mar 2024 13:15:59 GMT Subject: RFR: JDK-8180450: secondary_super_cache does not scale well [v2] In-Reply-To: References: Message-ID: > This PR is a redesign of subtype checking. > > The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. > > So what's changed, so that the old design should be replaced? > > Firstly, the computers of today aren't the computers of twenty years ago. It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. 
> > The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces). Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. > > Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch. > > However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions. > > The solution > ------------ > > We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. > > We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bitmap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. If the bit is set, we have to do a little arithmetic and then consult the hash table. > > It works like this: > > > mov sub_klass, [& sub_klass-...
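The compressed 64-way table and occupancy bitmap described above can be sketched in a few lines of C++. This is only an illustration under stated assumptions, not the code from the pull request: `KlassId`, `home_slot`, `build` and `is_secondary_super` are hypothetical names, classes are modelled as small integers rather than `Klass*`, the hash is a placeholder (`id % 64`), and at most 64 supers are assumed.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a Klass*: a small integer id.
using KlassId = uint32_t;

static const unsigned kSlots = 64;

// Placeholder hash (assumption); the hash used by the real patch is not shown in the thread.
static unsigned home_slot(KlassId k) { return k % kSlots; }

// Portable population count: number of set bits in x.
static unsigned popcount64(uint64_t x) {
  unsigned n = 0;
  while (x != 0) { x &= x - 1; ++n; }
  return n;
}

struct SecondarySupers {
  uint64_t bitmap = 0;          // bit s set <=> slot s of the 64-way table is occupied
  std::vector<KlassId> packed;  // occupied slots only, in slot order (empty slots removed)
};

// Place each super with linear probing, then drop the empty slots.
static SecondarySupers build(const std::vector<KlassId>& supers) {
  assert(supers.size() <= kSlots);  // sketch only handles tables that fit in 64 slots
  KlassId table[kSlots] = {};
  SecondarySupers ss;
  for (KlassId k : supers) {
    unsigned s = home_slot(k);
    while ((ss.bitmap >> s) & 1) s = (s + 1) % kSlots;  // collision: try the next slot
    table[s] = k;
    ss.bitmap |= uint64_t(1) << s;
  }
  for (unsigned s = 0; s < kSlots; ++s) {
    if ((ss.bitmap >> s) & 1) ss.packed.push_back(table[s]);
  }
  return ss;
}

static bool is_secondary_super(const SecondarySupers& ss, KlassId k) {
  unsigned s = home_slot(k);
  for (unsigned probes = 0; probes < kSlots; ++probes) {
    // A clear bit means the probe sequence ends here: k is definitely absent.
    if (((ss.bitmap >> s) & 1) == 0) return false;
    // The "little arithmetic": the number of occupied slots below s is the
    // index of slot s in the packed array.
    uint64_t below = ss.bitmap & ((uint64_t(1) << s) - 1);
    if (ss.packed[popcount64(below)] == k) return true;
    s = (s + 1) % kSlots;  // linear probing
  }
  return false;
}
```

With this placeholder hash, ids 3 and 67 share home slot 3, so `build({3, 67, 10})` stores 67 in slot 4 and a lookup of 67 takes two probes, while a lookup of 5 is rejected by the bitmap test alone. Because the packed array still holds every super, a plain linear scan over `packed` continues to work, which matches the compatibility point made in the description.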
Andrew Haley has updated the pull request incrementally with two additional commits since the last revision: - Merge branch 'JDK-8180450' of https://github.com/theRealAph/jdk into JDK-8180450 - JDK-8180450: secondary_super_cache does not scale well ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18309/files - new: https://git.openjdk.org/jdk/pull/18309/files/a0677b71..0db93b5b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18309&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18309&range=00-01 Stats: 128 lines in 1 file changed: 128 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18309.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18309/head:pull/18309 PR: https://git.openjdk.org/jdk/pull/18309 From mdoerr at openjdk.org Fri Mar 15 13:42:44 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 15 Mar 2024 13:42:44 GMT Subject: RFR: 8327652: S390x: Implements SLP support [v2] In-Reply-To: <89gCu4oPhDKz2-GfNWeWtJ0vXyTkmHgXhv6mVX6C72Q=.f706431a-83c4-4cce-8e1a-214779d49985@github.com> References: <89gCu4oPhDKz2-GfNWeWtJ0vXyTkmHgXhv6mVX6C72Q=.f706431a-83c4-4cce-8e1a-214779d49985@github.com> Message-ID: On Fri, 8 Mar 2024 05:41:50 GMT, Amit Kumar wrote: >> Sidraya Jayagond has updated the pull request incrementally with one additional commit since the last revision: >> >> Address review comments > > src/hotspot/share/opto/machnode.hpp line 146: > >> 144: } >> 145: #endif >> 146: #if defined(AARCH64) > > can't we use #elif here ? why? no other platform does that. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1526306175 From mdoerr at openjdk.org Fri Mar 15 13:46:52 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 15 Mar 2024 13:46:52 GMT Subject: RFR: 8327652: S390x: Implements SLP support [v2] In-Reply-To: References: Message-ID: On Fri, 15 Mar 2024 12:11:06 GMT, Sidraya Jayagond wrote: >> This PR Adds SIMD support on s390x. 
> > Sidraya Jayagond has updated the pull request incrementally with one additional commit since the last revision: > > Address review comments src/hotspot/cpu/s390/s390.ad line 754: > 752: Z_VR31, Z_VR31_H, Z_VR31_J, Z_VR31_K, > 753: ); > 754: Why do you use only VR16 to VR31? If there is a reason, it should be explained in a comment. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1526310541 From mli at openjdk.org Fri Mar 15 13:58:05 2024 From: mli at openjdk.org (Hamlin Li) Date: Fri, 15 Mar 2024 13:58:05 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4] In-Reply-To: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> Message-ID: <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> > Hi, > Can you help to review this patch? > Thanks > > This is a continuation of work based on [1] by @XiaohongGong, most work was done in that pr. 
In this new pr, just rebased the code in [1], then added some minor changes (renaming of calling method, add libsleef as extra lib in CI cross-build on aarch64 in github workflow); I also tested the combination of following scenarios: > * at build time > * with/without sleef > * with/without sve support > * at runtime > * with/without sleef > * with/without sve support > > [1] https://github.com/openjdk/jdk/pull/16234 > > ## Regression Test > * test/jdk/jdk/incubator/vector/ > * test/hotspot/jtreg/compiler/vectorapi/ > > ## Performance Test > Previously, @XiaohongGong has shared the data: https://github.com/openjdk/jdk/pull/16234#issuecomment-1767727028 Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: fix jni includes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18294/files - new: https://git.openjdk.org/jdk/pull/18294/files/7b7ec312..d0ed0323 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18294&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18294&range=02-03 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/18294.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18294/head:pull/18294 PR: https://git.openjdk.org/jdk/pull/18294 From mli at openjdk.org Fri Mar 15 13:58:05 2024 From: mli at openjdk.org (Hamlin Li) Date: Fri, 15 Mar 2024 13:58:05 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v3] In-Reply-To: <1wNmKEbi_Jk4iTxFIQKqg4RB2B-H2Lg_2hQFrnbV3a8=.29891519-8217-4ac1-aa32-aaabe103e527@github.com> References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <1wNmKEbi_Jk4iTxFIQKqg4RB2B-H2Lg_2hQFrnbV3a8=.29891519-8217-4ac1-aa32-aaabe103e527@github.com> Message-ID: On Fri, 15 Mar 2024 11:55:14 GMT, Magnus Ihse Bursie wrote: >> Hamlin Li has updated the pull request incrementally with two additional commits since the last revision:
>> >> - rename >> - resolve magicus's comments > > src/jdk.incubator.vector/linux/native/libvectormath/vector_math_neon.c line 25: > >> 23: >> 24: #include >> 25: #include > > You should not include the "_md" (machine dependent) part of jni directly. > > Suggestion: > > #include Thanks, fixed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18294#discussion_r1526325331 From aph at openjdk.org Fri Mar 15 14:49:57 2024 From: aph at openjdk.org (Andrew Haley) Date: Fri, 15 Mar 2024 14:49:57 GMT Subject: RFR: JDK-8180450: secondary_super_cache does not scale well [v3] In-Reply-To: References: Message-ID: > This PR is a redesign of subtype checking. > > The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. > > So what's changed, so that the old design should be replaced? > > Firstly, the computers of today aren't the computers of twenty years ago. It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. > > The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces). Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. > > Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch. > > However, the computers of today can help us. 
The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions. > > The solution > ------------ > > We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. > > We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bitmap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. If the bit is set, we have to do a little arithmetic and then consult the hash table. > > It works like this: > > > mov sub_klass, [& sub_klass-...
Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: JDK-8180450: secondary_super_cache does not scale well ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18309/files - new: https://git.openjdk.org/jdk/pull/18309/files/0db93b5b..71655477 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18309&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18309&range=01-02 Stats: 26 lines in 1 file changed: 13 ins; 0 del; 13 mod Patch: https://git.openjdk.org/jdk/pull/18309.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18309/head:pull/18309 PR: https://git.openjdk.org/jdk/pull/18309 From ihse at openjdk.org Fri Mar 15 14:53:33 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Fri, 15 Mar 2024 14:53:33 GMT Subject: RFR: 8328074: Add jcheck whitespace checking for assembly files [v4] In-Reply-To: References: Message-ID: On Wed, 13 Mar 2024 11:59:36 GMT, Magnus Ihse Bursie wrote: >> As part of the ongoing effort to enable jcheck whitespace checking to all text files, it is now time to address assembly (.S) files. The hotspot assembly files were fixed as part of the hotspot mapfile removal, so only a single incorrect jdk library remains. >> >> The main issue with `libjsvml` was inconsistent line starts, sometimes using tabs. I used the `expand` unix command line tool to replace these with spaces. > > Magnus Ihse Bursie has updated the pull request incrementally with one additional commit since the last revision: > > Make all ALIGN/.align aligned Maybe someone in Hotspot wants to take a look? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18268#issuecomment-1999834049 From ihse at openjdk.org Fri Mar 15 14:56:09 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Fri, 15 Mar 2024 14:56:09 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4] In-Reply-To: <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> Message-ID: On Fri, 15 Mar 2024 13:58:05 GMT, Hamlin Li wrote: >> Hi, >> Can you help to review this patch? >> Thanks >> >> This is a continuation of work based on [1] by @XiaohongGong, most work was done in that pr. In this new pr, just rebased the code in [1], then added some minor changes (renaming of calling method, add libsleef as extra lib in CI cross-build on aarch64 in github workflow); I also tested the combination of following scenarios: >> * at build time >> * with/without sleef >> * with/without sve support >> * at runtime >> * with/without sleef >> * with/without sve support >> >> [1] https://github.com/openjdk/jdk/pull/16234 >> >> ## Regression Test >> * test/jdk/jdk/incubator/vector/ >> * test/hotspot/jtreg/compiler/vectorapi/ >> >> ## Performance Test >> Previously, @XiaohongGong has shared the data: https://github.com/openjdk/jdk/pull/16234#issuecomment-1767727028 > > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > fix jni includes Now this looks good from a build perspective. Handing over to the component team to review the actual code. ------------- Marked as reviewed by ihse (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/18294#pullrequestreview-1939161561 From tholenstein at openjdk.org Fri Mar 15 14:15:27 2024 From: tholenstein at openjdk.org (Tobias Holenstein) Date: Fri, 15 Mar 2024 14:15:27 GMT Subject: RFR: 8327990: [macosx-aarch64] JFR enters VM without WXWrite In-Reply-To: References: Message-ID: On Wed, 13 Mar 2024 10:00:02 GMT, Richard Reingruber wrote: > As I wrote in JBS, shouldn't this be handled by `ThreadInVMfromNative`? I agree. This is something I am investigating at the moment. Ideally, AssertWXAtThreadSync would also be true by default. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18238#issuecomment-1999726588 From mli at openjdk.org Fri Mar 15 15:45:56 2024 From: mli at openjdk.org (Hamlin Li) Date: Fri, 15 Mar 2024 15:45:56 GMT Subject: RFR: 8328264: AArch64: remove UseNeon condition in CRC32 intrinsic Message-ID: Hi, Can you review this simple patch? Thanks FYI: As discussed in https://github.com/openjdk/jdk/pull/18294#issuecomment-1997727704, this use of the UseNeon flag should be removed, as NEON works by default and UseNeon could be false, which means the NEON intrinsic code can be skipped unexpectedly.
## Performance
Tested jmh `TestCRC32` with "-XX:-UseCRC32 -XX:-UseCryptoPmullForCRC32"

Before
Benchmark                  (count)  Mode  Cnt      Score  Error  Units
TestCRC32.testCRC32Update       64  avgt    2     53.696         ns/op
TestCRC32.testCRC32Update      128  avgt    2    104.942         ns/op
TestCRC32.testCRC32Update      256  avgt    2    207.147         ns/op
TestCRC32.testCRC32Update      512  avgt    2    411.179         ns/op
TestCRC32.testCRC32Update     2048  avgt    2   1608.388         ns/op
TestCRC32.testCRC32Update    16384  avgt    2  12763.513         ns/op
TestCRC32.testCRC32Update    65536  avgt    2  51024.246         ns/op

After
Benchmark                  (count)  Mode  Cnt      Score  Error  Units
TestCRC32.testCRC32Update       64  avgt    2     40.172         ns/op
TestCRC32.testCRC32Update      128  avgt    2     56.754         ns/op
TestCRC32.testCRC32Update      256  avgt    2     89.743         ns/op
TestCRC32.testCRC32Update      512  avgt    2    156.726         ns/op
TestCRC32.testCRC32Update     2048  avgt    2    579.776         ns/op
TestCRC32.testCRC32Update    16384  avgt    2   4624.023         ns/op
TestCRC32.testCRC32Update    65536  avgt    2  18505.180         ns/op

------------- Commit messages: - Initial commit Changes: https://git.openjdk.org/jdk/pull/18328/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18328&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8328264 Stats: 126 lines in 1 file changed: 24 ins; 24 del; 78 mod Patch: https://git.openjdk.org/jdk/pull/18328.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18328/head:pull/18328 PR: https://git.openjdk.org/jdk/pull/18328 From aph at openjdk.org Sat Mar 16 22:18:04 2024 From: aph at openjdk.org (Andrew Haley) Date: Sat, 16 Mar 2024 22:18:04 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4] In-Reply-To: <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> Message-ID: On Fri, 15 Mar 2024 13:58:05 GMT, Hamlin Li wrote: >> Hi, >> Can you help to review this patch?
>> Thanks >> >> This is a continuation of work based on [1] by @XiaohongGong, most work was done in that pr. In this new pr, just rebased the code in [1], then added some minor changes (renaming of calling method, add libsleef as extra lib in CI cross-build on aarch64 in github workflow); I also tested the combination of following scenarios: >> * at build time >> * with/without sleef >> * with/without sve support >> * at runtime >> * with/without sleef >> * with/without sve support >> >> [1] https://github.com/openjdk/jdk/pull/16234 >> >> ## Regression Test >> * test/jdk/jdk/incubator/vector/ >> * test/hotspot/jtreg/compiler/vectorapi/ >> >> ## Performance Test >> Previously, @XiaohongGong has shared the data: https://github.com/openjdk/jdk/pull/16234#issuecomment-1767727028 > > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > fix jni includes > > What is the relevance of SVE support at build time? Should it matter what the build machine is? > > It's important to realize that almost no one except the JDK devs builds their own JDK, and having SLEEF dependencies at build time will mean that almost no one will use it. All this work you've done will be for nothing. > > I agree. On the other hand, I see previously there was discussion about this already, and current solution got some points [#16234 (comment)](https://github.com/openjdk/jdk/pull/16234#issuecomment-1836266314), [#16234 (comment)](https://github.com/openjdk/jdk/pull/16234#issuecomment-1836449876) > > So, currently maybe we could go with this solution first, and as an incremental optimization, we could introduce a dynamic loading without dependency on the bridge functions in libvmath at runtime ([pr](https://bugs.openjdk.org/browse/JDK-8328268)). What do you think about it? If there was some kind of plan, with evidence of the intention to do something to get this valuable tech into people's hands in a form they can use, sure.
But as you can tell, I think this may rot because no one will be able to use it. If SLEEF were included in the JDK it'd be fine. If SLEEF were a common Linux library it'd be fine. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-2000073532 From sviswanathan at openjdk.org Sat Mar 16 22:18:13 2024 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Sat, 16 Mar 2024 22:18:13 GMT Subject: RFR: 8328074: Add jcheck whitespace checking for assembly files [v4] In-Reply-To: References: Message-ID: On Wed, 13 Mar 2024 11:59:36 GMT, Magnus Ihse Bursie wrote: >> As part of the ongoing effort to enable jcheck whitespace checking to all text files, it is now time to address assembly (.S) files. The hotspot assembly files were fixed as part of the hotspot mapfile removal, so only a single incorrect jdk library remains. >> >> The main issue with `libjsvml` was inconsistent line starts, sometimes using tabs. I used the `expand` unix command line tool to replace these with spaces. > > Magnus Ihse Bursie has updated the pull request incrementally with one additional commit since the last revision: > > Make all ALIGN/.align aligned Looks good to me. ------------- Marked as reviewed by sviswanathan (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18268#pullrequestreview-1940824156 From aph at openjdk.org Sat Mar 16 22:18:09 2024 From: aph at openjdk.org (Andrew Haley) Date: Sat, 16 Mar 2024 22:18:09 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v2] In-Reply-To: References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <--oIyncVgEWC58CJipBbISrc1JiX1dr2nfZs8pDCOao=.5b4f156c-6207-4480-9408-596cbee2669f@github.com> Message-ID: <7cK6tdczOn6KtAKWQdSOowZPaJEjO4MeU_rsmhHrU2M=.c1fe0ab2-9503-4618-ac7b-a4a5b4e01d93@github.com> On Fri, 15 Mar 2024 11:35:05 GMT, Hamlin Li wrote: > > I'm running the tests, but I have no way to confirm that SLEEF or SVE is in use.
Exactly what did you do, and how did you confirm it? > > Thanks for running test. I just turn on some log, and check the output. The problem I see is that J. Random Java User has no way to know if SLEEF is making their program faster without running benchmarks. They'll put SLEEF somewhere and hope that Java uses it. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-2000076965 From tschatzl at openjdk.org Sat Mar 16 22:20:20 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Sat, 16 Mar 2024 22:20:20 GMT Subject: RFR: 8289822: G1: Make concurrent mark code owner of TAMSes [v5] In-Reply-To: References: Message-ID: > Hi all, > > please review this change that moves TAMS ownership and handling out of `HeapRegion` into `G1ConcurrentMark`. > > The rationale is that it's intrinsically a data structure only used by the marking, and only interesting during marking, so attaching it to the always present `HeapRegion` isn't fitting. > > This also moves some code that is about marking decisions out of `HeapRegion`. > > In this change the TAMSes are still maintained and kept up to date as long as the `HeapRegion` exists (basically always) as the snapshotting of the marking does not really snapshot ("copy over the `HeapRegion`s that existed at the time of the snapshot, only ever touching them), but assumes that they exist even for uncommitted (or free) regions due to the way it advances the global `finger` pointer. > I did not want to change this here. > > This is also why `HeapRegion` still tells `G1ConcurrentMark` to update the TAMS in two places (when clearing a region and when resetting during full gc). > > Another option I considered when implementing this is clearing all marking statistics (`G1ConcurrentMark::clear_statistics`) at these places in `HeapRegion` because it decreases the amount of places where this needs to be done (maybe one or the other place isn't really necessary already). 
I rejected it because this would add some work that is dependent on the number of worker threads (particularly) into `HeapRegion::hr_clear()`. This change can be looked at by viewing at all but the last commit. > > Testing: tier1-4, gha > > Thanks, > Thomas Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: ayang review2 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18150/files - new: https://git.openjdk.org/jdk/pull/18150/files/652330cd..fae12ca0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18150&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18150&range=03-04 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18150.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18150/head:pull/18150 PR: https://git.openjdk.org/jdk/pull/18150 From tschatzl at openjdk.org Sat Mar 16 22:20:48 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Sat, 16 Mar 2024 22:20:48 GMT Subject: RFR: 8289822: G1: Make concurrent mark code owner of TAMSes [v4] In-Reply-To: References: Message-ID: On Fri, 15 Mar 2024 11:42:17 GMT, Albert Mingkun Yang wrote: >> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: >> >> Use initialize method instead of dynamic allocation for G1CMIsAliveClosure. > > src/hotspot/share/memory/iterator.hpp line 44: > >> 42: // The following classes are C++ `closures` for iterating over objects, roots and spaces >> 43: >> 44: class Closure { }; > > Why is this required? I just forgot to undo to the original code... thanks for catching this. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18150#discussion_r1526487415 From sspitsyn at openjdk.org Sat Mar 16 22:34:19 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Sat, 16 Mar 2024 22:34:19 GMT Subject: RFR: 8328285: GetOwnedMonitorInfo functions should use JvmtiHandshake Message-ID: The `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes were introduced in JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. This enhancement is to refactor JVM TI functions `GetOwnedMonitorInfo` and `GetOwnedMonitorStackDepthInfo` on the basis of the `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes. Testing: - Ran mach5 tiers 1-6 ------------- Commit messages: - added a couple of more fields to the JvmtiUnifiedHaandshakeClosure - 8328285: GetOwnedMonitorInfo functions should use JvmtiHandshake Changes: https://git.openjdk.org/jdk/pull/18332/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18332&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8328285 Stats: 130 lines in 3 files changed: 35 ins; 74 del; 21 mod Patch: https://git.openjdk.org/jdk/pull/18332.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18332/head:pull/18332 PR: https://git.openjdk.org/jdk/pull/18332 From aph at openjdk.org Sat Mar 16 22:35:47 2024 From: aph at openjdk.org (Andrew Haley) Date: Sat, 16 Mar 2024 22:35:47 GMT Subject: RFR: JDK-8180450: secondary_super_cache does not scale well [v4] In-Reply-To: References: Message-ID: > This PR is a redesign of subtype checking. > > The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. > > So what's changed, so that the old design should be replaced? > > Firstly, the computers of today aren't the computers of twenty years ago.
It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. > > The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces). Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. > > Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch. > > However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions. > > The solution > ------------ > > We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. > > We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bitmap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present.
If the bit is set, we have to do a little arithmetic and then consult the hash table. > > It works like this: > > > mov sub_klass, [& sub_klass-... Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: JDK-8180450: secondary_super_cache does not scale well ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18309/files - new: https://git.openjdk.org/jdk/pull/18309/files/71655477..bd23d194 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18309&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18309&range=02-03 Stats: 22 lines in 1 file changed: 22 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18309.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18309/head:pull/18309 PR: https://git.openjdk.org/jdk/pull/18309 From adinn at openjdk.org Sat Mar 16 22:35:49 2024 From: adinn at openjdk.org (Andrew Dinn) Date: Sat, 16 Mar 2024 22:35:49 GMT Subject: RFR: JDK-8180450: secondary_super_cache does not scale well [v3] In-Reply-To: References: Message-ID: On Fri, 15 Mar 2024 14:49:57 GMT, Andrew Haley wrote: >> This PR is a redesign of subtype checking. >> >> The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. >> >> So what's changed, so that the old design should be replaced? >> >> Firstly, the computers of today aren't the computers of twenty years ago. It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. >> >> The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces). 
Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. >> >> Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch. >> >> However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions. >> >> The solution >> ------------ >> >> We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. >> >> We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bitmap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. If the bit is set, we have to do a little arithmetic and then consult the hash table. >> >> It works like th... > > Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: > > JDK-8180450: secondary_super_cache does not scale well I pre-reviewed this patch after discussing it with @theRealAph. I believe the implementation is solid and the redesign has many virtues, the major ones being 1.
getting rid of the cache pinging
2. replacing it with something that is both *fast* and *simple*
3. evening up the positive and negative lookup costs.

However, before giving the PR a review I'd prefer for someone else to try to pick holes in the approach.

Also, I'd like to second Andrew's thanks to Vladimir Ivanov. As well as providing valuable input on how to improve on the current behaviour, he also helped frame the problem clearly and clarify requirements for any solution, most importantly allowing ports to migrate piecemeal from current to new behaviour.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18309#issuecomment-2000083439

From amitkumar at openjdk.org Mon Mar 18 02:15:27 2024
From: amitkumar at openjdk.org (Amit Kumar)
Date: Mon, 18 Mar 2024 02:15:27 GMT
Subject: RFR: 8327652: S390x: Implements SLP support [v2]
In-Reply-To:
References: <89gCu4oPhDKz2-GfNWeWtJ0vXyTkmHgXhv6mVX6C72Q=.f706431a-83c4-4cce-8e1a-214779d49985@github.com>
Message-ID: <7YtPVteupUbFSQf2WKqTcn7_jrhyi1EqU3pjT7WXYiw=.ad1cc8df-14e0-41d0-b5a3-1f5f120c39ca@github.com>

On Fri, 15 Mar 2024 13:40:29 GMT, Martin Doerr wrote:

>> src/hotspot/share/opto/machnode.hpp line 146:
>>
>>> 144: }
>>> 145: #endif
>>> 146: #if defined(AARCH64)
>>
>> can't we use #elif here ?
>
> why? no other platform does that.

Will there be any issue if we do:

#elif defined(AARCH64)

instead of

#endif
#if defined(AARCH64)

The first form seems a bit better.
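
As a side note on the question above: the two spellings select the same code whenever at most one of the tested macros is defined, and differ only if both are defined (separate `#if` blocks would then both be compiled, while an `#elif` chain compiles only the first match). A minimal sketch, assuming hypothetical macro names `DEMO_PPC64` and `DEMO_AARCH64` and hypothetical function names; none of these come from `machnode.hpp`:

```cpp
#include <cassert>

// Hypothetical stand-ins for platform macros; a real build would define
// exactly one of them on the compiler command line.
#define DEMO_PPC64 1

// Form A: two independent conditionals (#endif followed by a new #if).
int platform_id_separate_ifs() {
  int id = 0;  // 0 = neither platform selected
#if defined(DEMO_PPC64)
  id = 1;
#endif
#if defined(DEMO_AARCH64)
  id = 2;      // would also be compiled if both macros were defined
#endif
  return id;
}

// Form B: one chained conditional using #elif.
int platform_id_elif_chain() {
  int id = 0;
#if defined(DEMO_PPC64)
  id = 1;
#elif defined(DEMO_AARCH64)
  id = 2;      // skipped whenever DEMO_PPC64 is defined
#endif
  return id;
}

// With at most one macro defined, both forms select the same branch.
void check_forms_agree() {
  assert(platform_id_separate_ifs() == platform_id_elif_chain());
}
```

Since a HotSpot build targets a single CPU architecture, the choice here is presumably one of style and consistency with the surrounding blocks rather than of behaviour.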
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1527690087 From dholmes at openjdk.org Mon Mar 18 04:47:26 2024 From: dholmes at openjdk.org (David Holmes) Date: Mon, 18 Mar 2024 04:47:26 GMT Subject: RFR: 8328285: GetOwnedMonitorInfo functions should use JvmtiHandshake In-Reply-To: References: Message-ID: <7rHrpXK-x-H4iHT160S5aUCGNwYJ7fbCMkFnTKo1_Nw=.47a4e0df-099c-45e2-aa26-50361f18dc70@github.com> On Fri, 15 Mar 2024 20:14:22 GMT, Serguei Spitsyn wrote: > The `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes were introduced in JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. This enhancement is to refactor JVM TI functions `GetOwnedMonitorInfo` and `GetOwnedMonitorStackDepthInfo` on the basis of the `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes. > > Testing: > - Ran mach5 tiers 1-6 src/hotspot/share/prims/jvmtiEnvBase.cpp line 2004: > 2002: > 2003: hs_cl->set_target_jt(target_jt); // can be needed in the virtual thread case > 2004: hs_cl->set_is_virtual(is_virtual); // needed when suspend is required for non-current target thread Is this comment correct? Seems like it was just copied. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18332#discussion_r1527812762 From dholmes at openjdk.org Mon Mar 18 05:20:27 2024 From: dholmes at openjdk.org (David Holmes) Date: Mon, 18 Mar 2024 05:20:27 GMT Subject: RFR: JDK-8180450: secondary_super_cache does not scale well [v4] In-Reply-To: References: Message-ID: On Sat, 16 Mar 2024 22:35:47 GMT, Andrew Haley wrote: >> This PR is a redesign of subtype checking. >> >> The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. >> >> So what's changed, so that the old design should be replaced?
>> >> Firstly, the computers of today aren't the computers of twenty years ago. It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. >> >> The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces). Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. >> >> Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch. >> >> However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions. >> >> The solution >> ------------ >> >> We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. >> >> We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bitmap is used as a simple kind of Bloom Filter.
To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. If the bit is set, we have to do a little arithmetic and then consult the hash table. >> >> It works like th... > > Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: > > JDK-8180450: secondary_super_cache does not scale well I'm unclear why you are presenting this as a Diagnostic feature? I would expect either Experimental if you consider it early days and want more feedback; or else a full Product option that people can opt-in to using, and which eventually becomes the default. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18309#issuecomment-2002947145 From sjayagond at openjdk.org Mon Mar 18 06:33:27 2024 From: sjayagond at openjdk.org (Sidraya Jayagond) Date: Mon, 18 Mar 2024 06:33:27 GMT Subject: RFR: 8327652: S390x: Implements SLP support [v2] In-Reply-To: References: Message-ID: On Fri, 15 Mar 2024 13:43:53 GMT, Martin Doerr wrote: >> Sidraya Jayagond has updated the pull request incrementally with one additional commit since the last revision: >> >> Address review comments > > src/hotspot/cpu/s390/s390.ad line 754: > >> 752: Z_VR31, Z_VR31_H, Z_VR31_J, Z_VR31_K, >> 753: ); >> 754: > > Why do you use only VR16 to VR31? If there is a reason, it should be explained in a comment. > PPC64 has these explanations: > // 1st 32 VSRs are aliases for the FPRs which are already defined above. > // 2nd 32 VSRs are aliases for the VRs which are only defined here. > // VSR52-VSR63 // nv! > (s390 has no nv VRs which makes them easier to use.) In s390, 1st 16 VRs are aliases for the FPRs same as PPC. I will add proper explanation for this. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1527915833 From sjayagond at openjdk.org Mon Mar 18 08:40:27 2024 From: sjayagond at openjdk.org (Sidraya Jayagond) Date: Mon, 18 Mar 2024 08:40:27 GMT Subject: RFR: 8327652: S390x: Implements SLP support [v2] In-Reply-To: References: Message-ID: <1j_l5OsqxhFYKui-R5fbmn0ddP1pnm9XnLnnSQD4siU=.08916f2a-22f6-4a13-9362-5c1975432d37@github.com> On Mon, 18 Mar 2024 06:31:10 GMT, Sidraya Jayagond wrote: >> src/hotspot/cpu/s390/s390.ad line 754: >> >>> 752: Z_VR31, Z_VR31_H, Z_VR31_J, Z_VR31_K, >>> 753: ); >>> 754: >> >> Why do you use only VR16 to VR31? If there is a reason, it should be explained in a comment. >> PPC64 has these explanations: >> // 1st 32 VSRs are aliases for the FPRs which are already defined above. >> // 2nd 32 VSRs are aliases for the VRs which are only defined here. >> // VSR52-VSR63 // nv! >> (s390 has no nv VRs which makes them easier to use.) > > In s390, 1st 16 VRs are aliases for the FPRs same as PPC. I will add proper explanation for this. Since f0-f15 overlap with v0-v15, does register allocate takes care of not to allocate same float register to corresponding live vector register or vice-versa? e.g f1 and v1 registers. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1528072522 From ayang at openjdk.org Mon Mar 18 09:01:27 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 18 Mar 2024 09:01:27 GMT Subject: RFR: 8289822: G1: Make concurrent mark code owner of TAMSes [v5] In-Reply-To: References: Message-ID: On Sat, 16 Mar 2024 22:20:20 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this change that moves TAMS ownership and handling out of `HeapRegion` into `G1ConcurrentMark`. >> >> The rationale is that it's intrinsically a data structure only used by the marking, and only interesting during marking, so attaching it to the always present `HeapRegion` isn't fitting. 
>> >> This also moves some code that is about marking decisions out of `HeapRegion`. >> >> In this change the TAMSes are still maintained and kept up to date as long as the `HeapRegion` exists (basically always) as the snapshotting of the marking does not really snapshot ("copy over the `HeapRegion`s that existed at the time of the snapshot, only ever touching them), but assumes that they exist even for uncommitted (or free) regions due to the way it advances the global `finger` pointer. >> I did not want to change this here. >> >> This is also why `HeapRegion` still tells `G1ConcurrentMark` to update the TAMS in two places (when clearing a region and when resetting during full gc). >> >> Another option I considered when implementing this is clearing all marking statistics (`G1ConcurrentMark::clear_statistics`) at these places in `HeapRegion` because it decreases the amount of places where this needs to be done (maybe one or the other place isn't really necessary already). I rejected it because this would add some work that is dependent on the number of worker threads (particularly) into `HeapRegion::hr_clear()`. This change can be looked at by viewing at all but the last commit. >> >> Testing: tier1-4, gha >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > ayang review2 Marked as reviewed by ayang (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/18150#pullrequestreview-1942305587 From ihse at openjdk.org Mon Mar 18 09:15:32 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Mon, 18 Mar 2024 09:15:32 GMT Subject: RFR: 8328074: Add jcheck whitespace checking for assembly files [v4] In-Reply-To: References: Message-ID: <02Y9O24SosYFQqROvNLZ6EM4lfieC5YJQ7zJ-cn087M=.1974283f-4ff3-48dd-b234-841d1ba263e1@github.com> On Sat, 16 Mar 2024 00:08:04 GMT, Sandhya Viswanathan wrote: >> Magnus Ihse Bursie has updated the pull request incrementally with one additional commit since the last revision: >> >> Make all ALIGN/.align aligned > > Looks good to me. Thanks @sviswa7! ------------- PR Comment: https://git.openjdk.org/jdk/pull/18268#issuecomment-2003271932 From ihse at openjdk.org Mon Mar 18 09:15:32 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Mon, 18 Mar 2024 09:15:32 GMT Subject: Integrated: 8328074: Add jcheck whitespace checking for assembly files In-Reply-To: References: Message-ID: On Wed, 13 Mar 2024 11:18:20 GMT, Magnus Ihse Bursie wrote: > As part of the ongoing effort to enable jcheck whitespace checking to all text files, it is now time to address assembly (.S) files. The hotspot assembly files were fixed as part of the hotspot mapfile removal, so only a single incorrect jdk library remains. > > The main issue with `libjsvml` was inconsistent line starts, sometimes using tabs. I used the `expand` unix command line tool to replace these with spaces. This pull request has now been integrated. 
Changeset: c342188f Author: Magnus Ihse Bursie URL: https://git.openjdk.org/jdk/commit/c342188fd978dd94e7788fb0fb0345fd8c0eaa9a Stats: 280027 lines in 73 files changed: 0 ins; 0 del; 280027 mod 8328074: Add jcheck whitespace checking for assembly files Reviewed-by: erikj, sviswanathan ------------- PR: https://git.openjdk.org/jdk/pull/18268 From mdoerr at openjdk.org Mon Mar 18 09:16:29 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 18 Mar 2024 09:16:29 GMT Subject: RFR: 8327652: S390x: Implements SLP support [v2] In-Reply-To: <1j_l5OsqxhFYKui-R5fbmn0ddP1pnm9XnLnnSQD4siU=.08916f2a-22f6-4a13-9362-5c1975432d37@github.com> References: <1j_l5OsqxhFYKui-R5fbmn0ddP1pnm9XnLnnSQD4siU=.08916f2a-22f6-4a13-9362-5c1975432d37@github.com> Message-ID: On Mon, 18 Mar 2024 08:37:36 GMT, Sidraya Jayagond wrote: >> In s390, 1st 16 VRs are aliases for the FPRs same as PPC. I will add proper explanation for this. > > Since f0-f15 overlap with v0-v15, does register allocate takes care of not to allocate same float register to corresponding live vector register or vice-versa? e.g f1 and v1 registers. You use `ALLOC_IN_RC(z_v_reg)` in `vecX(),` so FP and VR are disjoint. That looks correct. Only the comments are missing. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1528140708 From sjayagond at openjdk.org Mon Mar 18 09:47:32 2024 From: sjayagond at openjdk.org (Sidraya Jayagond) Date: Mon, 18 Mar 2024 09:47:32 GMT Subject: RFR: 8327652: S390x: Implements SLP support [v2] In-Reply-To: References: <1j_l5OsqxhFYKui-R5fbmn0ddP1pnm9XnLnnSQD4siU=.08916f2a-22f6-4a13-9362-5c1975432d37@github.com> Message-ID: On Mon, 18 Mar 2024 09:13:35 GMT, Martin Doerr wrote: >> Since f0-f15 overlap with v0-v15, does register allocate takes care of not to allocate same float register to corresponding live vector register or vice-versa? e.g f1 and v1 registers. > > You use `ALLOC_IN_RC(z_v_reg)` in `vecX(),` so FP and VR are disjoint. That looks correct. 
Only the comments are missing. Thanks @TheRealMDoerr , I will add comments. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1528212282 From rrich at openjdk.org Mon Mar 18 10:20:00 2024 From: rrich at openjdk.org (Richard Reingruber) Date: Mon, 18 Mar 2024 10:20:00 GMT Subject: RFR: 8327990: [macosx-aarch64] JFR enters VM without WXWrite [v2] In-Reply-To: References: Message-ID: <9VajIlhw5cE9UPXdnzU1D5NcvRDNwLTKHa_G4kw5aU8=.1247da61-e7fa-4ecd-824d-987543ecbd89@github.com> > This pr changes `JfrJvmtiAgent::retransform_classes()` and `jfr_set_enabled` to switch to `WXWrite` before transitioning to the vm. > > Testing: > make test TEST=jdk/jfr/event/runtime/TestClassLoadEvent.java TEST_VM_OPTS=-XX:+AssertWXAtThreadSync > make test TEST=compiler/intrinsics/klass/CastNullCheckDroppingsTest.java TEST_VM_OPTS=-XX:+AssertWXAtThreadSync > > More tests are pending. Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: Set WXWrite at more locations ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18238/files - new: https://git.openjdk.org/jdk/pull/18238/files/b34ad5f3..9075f4be Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18238&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18238&range=00-01 Stats: 6 lines in 4 files changed: 5 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18238.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18238/head:pull/18238 PR: https://git.openjdk.org/jdk/pull/18238 From rrich at openjdk.org Mon Mar 18 10:26:26 2024 From: rrich at openjdk.org (Richard Reingruber) Date: Mon, 18 Mar 2024 10:26:26 GMT Subject: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v2] In-Reply-To: References: Message-ID: On Fri, 15 Mar 2024 13:58:15 GMT, Tobias Holenstein wrote: > > As I wrote in JBS, shouldn't this be handled by `ThreadInVMfromNative`? > > I agree. 
This is something I am investigating at the moment. Ideally, AssertWXAtThreadSync would also be true by default. I've added a bunch more locations we've seen when testing with AssertWXAtThreadSync. @tobiasholenstein do you think that this PR is actually not needed because you are going to push a better way of handling the WXMode in the near future? How should we handle the issues in released versions (JDK 21, 17, ...)? Will it be possible to backport your work? ------------- PR Comment: https://git.openjdk.org/jdk/pull/18238#issuecomment-2003496165 From aph at openjdk.org Mon Mar 18 10:38:28 2024 From: aph at openjdk.org (Andrew Haley) Date: Mon, 18 Mar 2024 10:38:28 GMT Subject: RFR: JDK-8180450: secondary_super_cache does not scale well [v4] In-Reply-To: References: Message-ID: On Mon, 18 Mar 2024 05:17:55 GMT, David Holmes wrote: > I'm unclear why you are presenting this as a Diagnostic feature? I would expect either Experimental if you consider it early days and want more feedback; or else a full Product option that people can opt-in to using, and which eventually becomes the default. The code for x86 and AArch64 is product ready, but other platforms aren't yet done. I think it should be enabled by default, but YMMV. The new -XX options will surely be needed by the maintainers of other platforms during porting. They may also be useful during the review phase of this patch, for reviewers to test before/after performance on their own systems. `UseSecondarySuperCache` and `HashSecondarySupers` perhaps make sense as (rather esoteric) product flags, but `VerifySecondarySupers` and `StressSecondarySuperHash` are strictly for port maintainers.
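
For readers following the thread, the bitmap-plus-compressed-table lookup described in the PR text quoted earlier can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in the patch (which emits platform assembly): the names `Super`, `SecondarySupers`, `build_secondary_supers`, `is_secondary_super` and `popcount64` are hypothetical, the home slot is supplied directly instead of being computed by a real hash, the "little arithmetic" is assumed to be a population count of the lower bitmap bits, and the table is assumed to hold at most 64 entries.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a class descriptor: only its identity (address)
// and a 6-bit home slot matter for this sketch.
struct Super {
  unsigned home_slot;  // 0..63, assumed to come from some hash of the class
};

struct SecondarySupers {
  uint64_t bitmap = 0;               // bit i set <=> slot i of the 64-way table is occupied
  std::vector<const Super*> packed;  // occupied slots in slot order, null entries removed
};

// Count set bits (portable loop; real code would use a popcount instruction).
static int popcount64(uint64_t x) {
  int n = 0;
  while (x != 0) { x &= x - 1; ++n; }
  return n;
}

// Build the 64-way table with linear probing, then compress it.
// Assumption: at most 64 entries, so probing always finds a free slot.
SecondarySupers build_secondary_supers(const std::vector<const Super*>& supers) {
  assert(supers.size() <= 64);
  const Super* table[64] = {};
  SecondarySupers result;
  for (const Super* k : supers) {
    unsigned slot = k->home_slot & 63;
    while (table[slot] != nullptr) slot = (slot + 1) & 63;  // linear probing, wraps around
    table[slot] = k;
    result.bitmap |= uint64_t(1) << slot;
  }
  for (unsigned i = 0; i < 64; ++i) {
    if (table[i] != nullptr) result.packed.push_back(table[i]);
  }
  return result;
}

bool is_secondary_super(const SecondarySupers& s, const Super* k) {
  unsigned slot = k->home_slot & 63;
  for (int probes = 0; probes < 64; ++probes) {
    const uint64_t bit = uint64_t(1) << slot;
    if ((s.bitmap & bit) == 0) return false;  // clear bit: definitely absent
    // The packed index is the number of occupied slots below this one.
    const int index = popcount64(s.bitmap & (bit - 1));
    if (s.packed[index] == k) return true;
    slot = (slot + 1) & 63;  // collision: try the next slot
  }
  return false;  // all 64 slots occupied and none matched
}
```

In this sketch a negative lookup usually ends at the first bitmap test, which is the Bloom-filter-like property the description mentions; a linear scan over `packed` still visits every entry, so code that only scans the array keeps working.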
------------- PR Comment: https://git.openjdk.org/jdk/pull/18309#issuecomment-2003538088 From sspitsyn at openjdk.org Mon Mar 18 11:10:26 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Mon, 18 Mar 2024 11:10:26 GMT Subject: RFR: 8328285: GetOwnedMonitorInfo functions should use JvmtiHandshake In-Reply-To: <7rHrpXK-x-H4iHT160S5aUCGNwYJ7fbCMkFnTKo1_Nw=.47a4e0df-099c-45e2-aa26-50361f18dc70@github.com> References: <7rHrpXK-x-H4iHT160S5aUCGNwYJ7fbCMkFnTKo1_Nw=.47a4e0df-099c-45e2-aa26-50361f18dc70@github.com> Message-ID: On Mon, 18 Mar 2024 04:43:02 GMT, David Holmes wrote: >> The `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes were introduced in JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. This enhancement is to refactor JVM TI functions `GetOwnedMonitorInfo` and `GetOwnedMonitorStackDepthInfo` on the basis of the `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes. >> >> Testing: >> - Ran mach5 tiers 1-6 > > src/hotspot/share/prims/jvmtiEnvBase.cpp line 2004: > >> 2002: >> 2003: hs_cl->set_target_jt(target_jt); // can be needed in the virtual thread case >> 2004: hs_cl->set_is_virtual(is_virtual); // needed when suspend is required for non-current target thread > > Is this comment correct? Seems like it was just copied. Good catch. It was a copy-paste of the wrong comment. Will fix it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18332#discussion_r1528332705 From ayang at openjdk.org Mon Mar 18 11:50:50 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 18 Mar 2024 11:50:50 GMT Subject: RFR: 8328350: G1: Remove DO_DISCOVERED_AND_DISCOVERY Message-ID: Remove a special ref-processing iteration mode for G1, because a non-null `_discovered` field and a need for ref-discovery should never occur at the same time.
Test: tier1-6 ------------- Commit messages: - g1-remove-DO_DISCOVERED_AND_DISCOVERY Changes: https://git.openjdk.org/jdk/pull/18346/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18346&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8328350 Stats: 22 lines in 4 files changed: 0 ins; 22 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18346.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18346/head:pull/18346 PR: https://git.openjdk.org/jdk/pull/18346 From sjayagond at openjdk.org Mon Mar 18 12:01:54 2024 From: sjayagond at openjdk.org (Sidraya Jayagond) Date: Mon, 18 Mar 2024 12:01:54 GMT Subject: RFR: 8327652: S390x: Implements SLP support [v3] In-Reply-To: References: Message-ID: > This PR Adds SIMD support on s390x. Sidraya Jayagond has updated the pull request incrementally with one additional commit since the last revision: Add proper comments and cosmetic changes. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18162/files - new: https://git.openjdk.org/jdk/pull/18162/files/8bb4bc33..487c57db Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18162&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18162&range=01-02 Stats: 64 lines in 2 files changed: 11 ins; 1 del; 52 mod Patch: https://git.openjdk.org/jdk/pull/18162.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18162/head:pull/18162 PR: https://git.openjdk.org/jdk/pull/18162 From jwaters at openjdk.org Mon Mar 18 12:45:28 2024 From: jwaters at openjdk.org (Julian Waters) Date: Mon, 18 Mar 2024 12:45:28 GMT Subject: RFR: 8325932: Replace ATTRIBUTE_NORETURN with direct [[noreturn]] In-Reply-To: References: Message-ID: On Thu, 15 Feb 2024 09:10:51 GMT, Julian Waters wrote: > With clang 13 being the minimum required JDK-8325878, the noreturn bug that requires the ATTRIBUTE_NORETURN workaround now vanishes, and we can use [[noreturn]] directly within HotSpot. 
We should remove the workaround as soon as possible, given the chance @kimbarrett Is this good to go now that clang's minimum version has been set to 13? ------------- PR Comment: https://git.openjdk.org/jdk/pull/17868#issuecomment-2003810705 From mdoerr at openjdk.org Mon Mar 18 12:51:26 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 18 Mar 2024 12:51:26 GMT Subject: RFR: 8327652: S390x: Implements SLP support [v3] In-Reply-To: References: Message-ID: On Mon, 18 Mar 2024 12:01:54 GMT, Sidraya Jayagond wrote: >> This PR Adds SIMD support on s390x. > > Sidraya Jayagond has updated the pull request incrementally with one additional commit since the last revision: > > Add proper comments and cosmetic changes. src/hotspot/cpu/s390/s390.ad line 192: > 190: // ---------------------------- > 191: // 1st 16 VRs are aliases for the FPRs which are already defined above. > 192: reg_def Z_VR0 ( SOC, SOC, Op_RegF, 0, VMRegImpl::Bad()); I think `Op_VecX` should be used. The _H, _J, _K versions shouldn't be needed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1528473402 From mdoerr at openjdk.org Mon Mar 18 13:01:30 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 18 Mar 2024 13:01:30 GMT Subject: RFR: 8327652: S390x: Implements SLP support [v3] In-Reply-To: References: Message-ID: On Mon, 18 Mar 2024 12:01:54 GMT, Sidraya Jayagond wrote: >> This PR Adds SIMD support on s390x. > > Sidraya Jayagond has updated the pull request incrementally with one additional commit since the last revision: > > Add proper comments and cosmetic changes. src/hotspot/cpu/s390/globals_s390.hpp line 115: > 113: "Use Z15 Vector instructions for superword optimization.") \ > 114: product(bool, UseSFPV, false, \ > 115: "Use SFPV Vector instructions for superword optimization.") \ I'd probably make `UseSFPV` a diagnostic flag. Or do you think it needs to be an official flag which customers should set in production? 
(I think making `SuperwordUseVX` a regular product switch is fine as it is the fundamental switch for turning vectorization on or off.) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1528488541 From rrich at openjdk.org Mon Mar 18 13:14:41 2024 From: rrich at openjdk.org (Richard Reingruber) Date: Mon, 18 Mar 2024 13:14:41 GMT Subject: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3] In-Reply-To: References: Message-ID: <23RHDqAAB2cUnhC2j-H7g2X2KibSmk2IwKh4EisMfYo=.a6d727da-76c9-42d6-b2df-d2fd192bc229@github.com> > This pr changes `JfrJvmtiAgent::retransform_classes()` and `jfr_set_enabled` to switch to `WXWrite` before transitioning to the vm. > > Testing: > make test TEST=jdk/jfr/event/runtime/TestClassLoadEvent.java TEST_VM_OPTS=-XX:+AssertWXAtThreadSync > make test TEST=compiler/intrinsics/klass/CastNullCheckDroppingsTest.java TEST_VM_OPTS=-XX:+AssertWXAtThreadSync > > More tests are pending. Richard Reingruber has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains three additional commits since the last revision: - Merge branch 'master' into 8327990_jfr_enters_vm_without_wxwrite - Set WXWrite at more locations - Switch to WXWrite before entering the vm ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18238/files - new: https://git.openjdk.org/jdk/pull/18238/files/9075f4be..1b4d7606 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18238&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18238&range=01-02 Stats: 293771 lines in 434 files changed: 6197 ins; 6221 del; 281353 mod Patch: https://git.openjdk.org/jdk/pull/18238.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18238/head:pull/18238 PR: https://git.openjdk.org/jdk/pull/18238 From smonteith at openjdk.org Mon Mar 18 14:40:27 2024 From: smonteith at openjdk.org (Stuart Monteith) Date: Mon, 18 Mar 2024 14:40:27 GMT Subject: RFR: 8326541: [AArch64] ZGC C2 load barrier stub considers the length of live registers when spilling registers [v3] In-Reply-To: References: Message-ID: On Fri, 15 Mar 2024 03:11:05 GMT, Joshua Zhu wrote: >> Currently ZGC C2 load barrier stub saves the whole live register regardless of what size of register is live on aarch64. >> Considering the size of SVE register is an implementation-defined multiple of 128 bits, up to 2048 bits, >> even the use of a floating point may cause the maximum 2048 bits stack occupied. >> Hence I would like to introduce this change on aarch64: take the length of live registers into consideration in ZGC C2 load barrier stub. >> >> In a floating point case on 2048 bits SVE machine, the following ZLoadBarrierStubC2 >> >> >> ...... 
>> 0x0000ffff684cfad8: stp x15, x18, [sp, #80] >> 0x0000ffff684cfadc: sub sp, sp, #0x100 >> 0x0000ffff684cfae0: str z16, [sp] >> 0x0000ffff684cfae4: add x1, x13, #0x10 >> 0x0000ffff684cfae8: mov x0, x16 >> ;; 0xFFFF803F5414 >> 0x0000ffff684cfaec: mov x8, #0x5414 // #21524 >> 0x0000ffff684cfaf0: movk x8, #0x803f, lsl #16 >> 0x0000ffff684cfaf4: movk x8, #0xffff, lsl #32 >> 0x0000ffff684cfaf8: blr x8 >> 0x0000ffff684cfafc: mov x16, x0 >> 0x0000ffff684cfb00: ldr z16, [sp] >> 0x0000ffff684cfb04: add sp, sp, #0x100 >> 0x0000ffff684cfb08: ptrue p7.b >> 0x0000ffff684cfb0c: ldp x4, x5, [sp, #16] >> ...... >> >> >> could be optimized into: >> >> >> ...... >> 0x0000ffff684cfa50: stp x15, x18, [sp, #80] >> 0x0000ffff684cfa54: str d16, [sp, #-16]! // extra 8 bytes to align 16 bytes in push_fp() >> 0x0000ffff684cfa58: add x1, x13, #0x10 >> 0x0000ffff684cfa5c: mov x0, x16 >> ;; 0xFFFF7FA942A8 >> 0x0000ffff684cfa60: mov x8, #0x42a8 // #17064 >> 0x0000ffff684cfa64: movk x8, #0x7fa9, lsl #16 >> 0x0000ffff684cfa68: movk x8, #0xffff, lsl #32 >> 0x0000ffff684cfa6c: blr x8 >> 0x0000ffff684cfa70: mov x16, x0 >> 0x0000ffff684cfa74: ldr d16, [sp], #16 >> 0x0000ffff684cfa78: ptrue p7.b >> 0x0000ffff684cfa7c: ldp x4, x5, [sp, #16] >> ...... >> >> >> Besides the above benefit, when we know what size of register is live, >> we could remove the unnecessary caller save in ZGC C2 load barrier stub when we meet C-ABI SOE fp registers. >> >> Passed jtreg with option "-XX:+UseZGC -XX:+ZGenerational" with no failures introduced. 
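
The stack usage in the two listings above follows from simple arithmetic, sketched here with a hypothetical helper (`spill_bytes` is not a HotSpot function). It assumes each spilled register gets its own slot, rounded up to the 16-byte alignment that AArch64 requires of `sp`:

```cpp
#include <cassert>

// Bytes of stack needed to spill one register when only `live_bits` of it
// are live, rounded up to a multiple of 16 bytes.
constexpr unsigned spill_bytes(unsigned live_bits) {
  return ((live_bits / 8) + 15u) & ~15u;
}
```

With these assumptions a full 2048-bit SVE register needs `spill_bytes(2048)` = 256 bytes, matching `sub sp, sp, #0x100` in the first listing, while a live 64-bit double needs `spill_bytes(64)` = 16 bytes, matching `str d16, [sp, #-16]!` in the second.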
> > Joshua Zhu has updated the pull request incrementally with one additional commit since the last revision: > > change jtreg test case name test/hotspot/jtreg/gc/z/TestRegistersPushPopAtZGCLoadBarrierStub.java line 291: > 289: String keyString = keyword + expected_number_of_push_pop_at_load_barrier_fregs + " " + expected_freg_type + " registers"; > 290: if (!containOnlyOneOccuranceOfKeyword(stdout, keyString)) { > 291: throw new RuntimeException("Stdout is expected to contain only one occurance of keyString: " + "'" + keyString + "'"); In the event of failure, would it be possible to print the erroneous output? The output from the subprocesses, being directly piped in, doesn't lend itself to easy debugging. At first I thought there might be an option that could alter OutputAnalyzers output, but sadly not. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17977#discussion_r1528691697 From iwalulya at openjdk.org Mon Mar 18 15:35:31 2024 From: iwalulya at openjdk.org (Ivan Walulya) Date: Mon, 18 Mar 2024 15:35:31 GMT Subject: RFR: 8289822: G1: Make concurrent mark code owner of TAMSes [v5] In-Reply-To: References: Message-ID: On Sat, 16 Mar 2024 22:20:20 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this change that moves TAMS ownership and handling out of `HeapRegion` into `G1ConcurrentMark`. >> >> The rationale is that it's intrinsically a data structure only used by the marking, and only interesting during marking, so attaching it to the always present `HeapRegion` isn't fitting. >> >> This also moves some code that is about marking decisions out of `HeapRegion`. 
>> >> In this change the TAMSes are still maintained and kept up to date as long as the `HeapRegion` exists (basically always) as the snapshotting of the marking does not really snapshot ("copy over the `HeapRegion`s that existed at the time of the snapshot, only ever touching them), but assumes that they exist even for uncommitted (or free) regions due to the way it advances the global `finger` pointer. >> I did not want to change this here. >> >> This is also why `HeapRegion` still tells `G1ConcurrentMark` to update the TAMS in two places (when clearing a region and when resetting during full gc). >> >> Another option I considered when implementing this is clearing all marking statistics (`G1ConcurrentMark::clear_statistics`) at these places in `HeapRegion` because it decreases the amount of places where this needs to be done (maybe one or the other place isn't really necessary already). I rejected it because this would add some work that is dependent on the number of worker threads (particularly) into `HeapRegion::hr_clear()`. This change can be looked at by viewing at all but the last commit. >> >> Testing: tier1-4, gha >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > ayang review2 LGTM! src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 1508: > 1506: // * Reference discovery will not need a barrier. > 1507: > 1508: // Concurrent Mark ref processor The comment now seems misplaced ------------- Marked as reviewed by iwalulya (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18150#pullrequestreview-1943423814 PR Review Comment: https://git.openjdk.org/jdk/pull/18150#discussion_r1528778112 From jwaters at openjdk.org Mon Mar 18 21:36:26 2024 From: jwaters at openjdk.org (Julian Waters) Date: Mon, 18 Mar 2024 21:36:26 GMT Subject: RFR: 8320794: Emulate rest of vblendvp[sd] on ECore In-Reply-To: <8ajDeYtrlyZUXnTl29xwLr1rwGIYzjj5wThm9yjrBVY=.c75c1992-c836-4969-aea4-e3cbf428dfad@github.com> References: <8ajDeYtrlyZUXnTl29xwLr1rwGIYzjj5wThm9yjrBVY=.c75c1992-c836-4969-aea4-e3cbf428dfad@github.com> Message-ID: On Thu, 14 Mar 2024 19:02:17 GMT, Volodymyr Paprotski wrote: > Replace vpblendvp[sd] with macro assembler call and test in: > - `C2_MacroAssembler::vector_cast_float_to_int_special_cases_avx` (insufficient registers for 1 of 2 blends) > - `C2_MacroAssembler::vector_cast_double_to_int_special_cases_avx` > - `C2_MacroAssembler::vector_count_leading_zeros_int_avx` > > Functional testing with existing and new tests: > `make test TEST="test/hotspot/jtreg/compiler/vectorapi/reshape test/hotspot/jtreg/compiler/vectorization/runner/BasicIntOpTest.java"` > > Benchmarking with existing and new tests: > > make test TEST="micro:org.openjdk.bench.jdk.incubator.vector.VectorFPtoIntCastOperations.microFloat256ToInteger256" > make test TEST="micro:org.openjdk.bench.jdk.incubator.vector.VectorFPtoIntCastOperations.microDouble256ToInteger256" > make test TEST="micro:org.openjdk.bench.vm.compiler.VectorBitCount.WithSuperword.intLeadingZeroCount" > > > Performance before: > > Benchmark (SIZE) Mode Cnt Score Error Units > VectorFPtoIntCastOperations.microDouble256ToInteger256 512 thrpt 5 17271.078 ? 184.140 ops/ms > VectorFPtoIntCastOperations.microDouble256ToInteger256 1024 thrpt 5 9310.507 ? 88.136 ops/ms > VectorFPtoIntCastOperations.microFloat256ToInteger256 512 thrpt 5 11137.594 ? 19.009 ops/ms > VectorFPtoIntCastOperations.microFloat256ToInteger256 1024 thrpt 5 5425.001 ? 
3.136 ops/ms > VectorBitCount.WithSuperword.intLeadingZeroCount 1024 0 thrpt 4 0.994 ? 0.002 ops/us > > > Performance after: > > Benchmark (SIZE) Mode Cnt Score Error Units > VectorFPtoIntCastOperations.microDouble256ToInteger256 512 thrpt 5 19222.048 ? 87.622 ops/ms > VectorFPtoIntCastOperations.microDouble256ToInteger256 1024 thrpt 5 9233.245 ? 123.493 ops/ms > VectorFPtoIntCastOperations.microFloat256ToInteger256 512 thrpt 5 11672.806 ? 10.854 ops/ms > VectorFPtoIntCastOperations.microFloat256ToInteger256 1024 thrpt 5 6009.735 ? 12.173 ops/ms > VectorBitCount.WithSuperword.intLeadingZeroCount 1024 0 thrpt 4 1.039 ? 0.004 ops/us src/hotspot/cpu/x86/macroAssembler_x86.cpp line 3539: > 3537: vpor(dst, dst, scratch, vector_len); > 3538: } else { > 3539: Assembler::vblendvps(dst, src1, src2, mask, vector_len); This whitespace doesn't seem to be necessary ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18310#discussion_r1525843130 From duke at openjdk.org Mon Mar 18 21:36:25 2024 From: duke at openjdk.org (Volodymyr Paprotski) Date: Mon, 18 Mar 2024 21:36:25 GMT Subject: RFR: 8320794: Emulate rest of vblendvp[sd] on ECore Message-ID: <8ajDeYtrlyZUXnTl29xwLr1rwGIYzjj5wThm9yjrBVY=.c75c1992-c836-4969-aea4-e3cbf428dfad@github.com> Replace vpblendvp[sd] with macro assembler call and test in: - `C2_MacroAssembler::vector_cast_float_to_int_special_cases_avx` (insufficient registers for 1 of 2 blends) - `C2_MacroAssembler::vector_cast_double_to_int_special_cases_avx` - `C2_MacroAssembler::vector_count_leading_zeros_int_avx` Functional testing with existing and new tests: `make test TEST="test/hotspot/jtreg/compiler/vectorapi/reshape test/hotspot/jtreg/compiler/vectorization/runner/BasicIntOpTest.java"` Benchmarking with existing and new tests: make test TEST="micro:org.openjdk.bench.jdk.incubator.vector.VectorFPtoIntCastOperations.microFloat256ToInteger256" make test 
TEST="micro:org.openjdk.bench.jdk.incubator.vector.VectorFPtoIntCastOperations.microDouble256ToInteger256" make test TEST="micro:org.openjdk.bench.vm.compiler.VectorBitCount.WithSuperword.intLeadingZeroCount" Performance before: Benchmark (SIZE) Mode Cnt Score Error Units VectorFPtoIntCastOperations.microDouble256ToInteger256 512 thrpt 5 17271.078 ? 184.140 ops/ms VectorFPtoIntCastOperations.microDouble256ToInteger256 1024 thrpt 5 9310.507 ? 88.136 ops/ms VectorFPtoIntCastOperations.microFloat256ToInteger256 512 thrpt 5 11137.594 ? 19.009 ops/ms VectorFPtoIntCastOperations.microFloat256ToInteger256 1024 thrpt 5 5425.001 ? 3.136 ops/ms VectorBitCount.WithSuperword.intLeadingZeroCount 1024 0 thrpt 4 0.994 ? 0.002 ops/us Performance after: Benchmark (SIZE) Mode Cnt Score Error Units VectorFPtoIntCastOperations.microDouble256ToInteger256 512 thrpt 5 19222.048 ? 87.622 ops/ms VectorFPtoIntCastOperations.microDouble256ToInteger256 1024 thrpt 5 9233.245 ? 123.493 ops/ms VectorFPtoIntCastOperations.microFloat256ToInteger256 512 thrpt 5 11672.806 ? 10.854 ops/ms VectorFPtoIntCastOperations.microFloat256ToInteger256 1024 thrpt 5 6009.735 ? 12.173 ops/ms VectorBitCount.WithSuperword.intLeadingZeroCount 1024 0 thrpt 4 1.039 ? 
0.004 ops/us ------------- Commit messages: - Fix whitespace - vpblend emulation continued Changes: https://git.openjdk.org/jdk/pull/18310/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18310&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8320794 Stats: 65 lines in 5 files changed: 58 ins; 0 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/18310.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18310/head:pull/18310 PR: https://git.openjdk.org/jdk/pull/18310 From dnsimon at openjdk.org Mon Mar 18 22:36:22 2024 From: dnsimon at openjdk.org (Doug Simon) Date: Mon, 18 Mar 2024 22:36:22 GMT Subject: RFR: JDK-8180450: secondary_super_cache does not scale well [v4] In-Reply-To: References: Message-ID: On Sat, 16 Mar 2024 22:35:47 GMT, Andrew Haley wrote: >> This PR is a redesign of subtype checking. >> >> The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. >> >> So what's changed, so that the old design should be replaced? >> >> Firstly, the computers of today aren't the computers of twenty years ago. It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. >> >> The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces). Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. >> >> Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. 
This plays badly with newer language features such as JEP 406, pattern matching for switch. >> >> However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions. >> >> The solution >> ------------ >> >> We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. >> >> We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bimap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. If the bit is set, we have to do a little arithmetic and then consult the hash table. >> >> It works like th... > > Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: > > JDK-8180450: secondary_super_cache does not scale well Hi Andrew, we'd most likely want to port this to Graal. Just for my understanding, the current implementation of [secondary super cache lookup in Graal](https://github.com/oracle/graal/blob/0a471fc329ea2c8c18c05f9d83347dc1199c3ad9/compiler/src/jdk.graal.compiler/src/jdk/graal/compiler/hotspot/replacements/InstanceOfSnippets.java#L169) will continue to work until we get around to doing the port right? 
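The bitmap-plus-compressed-table lookup described in the quoted PR text can be sketched roughly as follows. This is only an illustration, not the HotSpot code: `TypeId`, `home_slot`, `SecondarySupers`, `build` and `contains` are hypothetical names, the hash function is made up, and it assumes that the "little arithmetic" is a population count of the occupancy bits below the probed slot, which gives the index into the packed array.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a class identity; HotSpot works on Klass pointers.
using TypeId = std::uint64_t;

// Hypothetical hash (an assumption): top 6 bits of a multiplicative hash, i.e. a slot in [0, 64).
inline unsigned home_slot(TypeId id) {
  return static_cast<unsigned>((id * 0x9E3779B97F4A7C15ULL) >> 58);
}

// Portable population count, to avoid compiler-specific builtins.
inline unsigned popcount64(std::uint64_t x) {
  unsigned n = 0;
  while (x != 0) { x &= x - 1; ++n; }
  return n;
}

struct SecondarySupers {
  std::uint64_t bitmap = 0;    // bit s set => slot s of the 64-way table is occupied
  std::vector<TypeId> packed;  // occupied slots in slot order, empty slots removed
};

// Builds the compressed table. Assumes at most 64 distinct ids.
SecondarySupers build(const std::vector<TypeId>& supers) {
  assert(supers.size() <= 64);
  TypeId table[64] = {};
  SecondarySupers result;
  for (TypeId id : supers) {
    unsigned s = home_slot(id);
    while (result.bitmap & (1ULL << s)) s = (s + 1) % 64;  // linear probing
    table[s] = id;
    result.bitmap |= 1ULL << s;
  }
  for (unsigned s = 0; s < 64; ++s) {
    if (result.bitmap & (1ULL << s)) result.packed.push_back(table[s]);
  }
  return result;
}

bool contains(const SecondarySupers& ss, TypeId id) {
  unsigned s = home_slot(id);
  for (unsigned probes = 0; probes < 64; ++probes, s = (s + 1) % 64) {
    const std::uint64_t bit = 1ULL << s;
    if ((ss.bitmap & bit) == 0) return false;  // clear bit: definitely absent
    // Rank of slot s among the occupied slots = its index in the packed array.
    const unsigned index = popcount64(ss.bitmap & (bit - 1));
    if (ss.packed[index] == id) return true;
  }
  return false;  // table completely full and id not found
}
```

A miss usually costs one bit test on the home slot, and the packed array can still be scanned linearly by code that does not know about the bitmap.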
------------- PR Comment: https://git.openjdk.org/jdk/pull/18309#issuecomment-2005163282 From sspitsyn at openjdk.org Tue Mar 19 00:10:32 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 19 Mar 2024 00:10:32 GMT Subject: RFR: 8328285: GetOwnedMonitorInfo functions should use JvmtiHandshake [v2] In-Reply-To: References: Message-ID: > The `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. This enhancement is to refactor JVM TI functions `GetOwnedMonitorInfo` and `GetOwnedMonitorStackDepthInfo` on the base of `JvmtiHandshake and `JvmtiUnitedHandshakeClosure` classes. > > Testing: > - Ran mach5 tiers 1-6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: correct one comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18332/files - new: https://git.openjdk.org/jdk/pull/18332/files/a31c4f71..78b7c894 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18332&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18332&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18332.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18332/head:pull/18332 PR: https://git.openjdk.org/jdk/pull/18332 From vlivanov at openjdk.org Tue Mar 19 02:45:21 2024 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Tue, 19 Mar 2024 02:45:21 GMT Subject: RFR: JDK-8180450: secondary_super_cache does not scale well [v4] In-Reply-To: References: Message-ID: On Sat, 16 Mar 2024 22:35:47 GMT, Andrew Haley wrote: >> This PR is a redesign of subtype checking. >> >> The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. 
>> >> So what's changed, so that the old design should be replaced? >> >> Firstly, the computers of today aren't the computers of twenty years ago. It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. >> >> The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces). Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. >> >> Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch. >> >> However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions. >> >> The solution >> ------------ >> >> We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. >> >> We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bimap is used as a simple kind of Bloom Filter. 
To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. If the bit is set, we have to do a little arithmetic and then consult the hash table. >> >> It works like th... > > Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: > > JDK-8180450: secondary_super_cache does not scale well Thanks a lot for taking care of secondary supers redesign, Andrew! Overall, I really like the balance you strike with the proposed approach. As I can see on the tests, it behaves very well for positive cases across the whole range of sizes (longest probe distance maxes at 4 on `org.openjdk.bench.vm.lang.IntfSubtype` microbenchmark). For negative cases it starts to quickly degrade after ~45 elements (I measured distance to the nearest zero in the bitmask) which looks like a good compromise for simplicity of hash function and single pass packing of the table. IMO it's important to cover reflection (non-constant) case and stubs. Also, it would be nice to align the implementation across all execution modes. But it can be covered as follow-up enhancements. I'll comment on the code changes separately. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18309#issuecomment-2005645226 From vlivanov at openjdk.org Tue Mar 19 02:57:21 2024 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Tue, 19 Mar 2024 02:57:21 GMT Subject: RFR: JDK-8180450: secondary_super_cache does not scale well [v4] In-Reply-To: References: Message-ID: <7GV5r_PRvhZkA1zh1WePp6crNNz9vGvxBfPqAG-YnUM=.fcb99891-456b-476a-ad7b-c7c045b3b254@github.com> On Mon, 18 Mar 2024 10:36:20 GMT, Andrew Haley wrote: > I'm unclear why you are presenting this as a Diagnostic feature? 
>From a user perspective, there should be no reason to specify `HashSecondarySupers` or `UseSecondarySuperCache` unless you diagnose a performance regression or looking for a workaround for a crash. There should be no other reason to override default value and switch to the old implementation once `HashSecondarySupers` is supported on a platform. IMO diagnostic flags are well justified here. Experimental flag doesn't cut it since the feature is turned on by default and making the flags product puts unnecessary obligations on the VM for something so obscure (from a user perspective). Some other VM flags followed the same practice (e.g., `UseVtableBasedCHA`) and so far it worked well. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18309#issuecomment-2005653912 From sjayagond at openjdk.org Tue Mar 19 02:59:20 2024 From: sjayagond at openjdk.org (Sidraya Jayagond) Date: Tue, 19 Mar 2024 02:59:20 GMT Subject: RFR: 8327652: S390x: Implements SLP support [v3] In-Reply-To: References: Message-ID: On Mon, 18 Mar 2024 12:48:37 GMT, Martin Doerr wrote: >> Sidraya Jayagond has updated the pull request incrementally with one additional commit since the last revision: >> >> Add proper comments and cosmetic changes. > > src/hotspot/cpu/s390/s390.ad line 192: > >> 190: // ---------------------------- >> 191: // 1st 16 VRs are aliases for the FPRs which are already defined above. >> 192: reg_def Z_VR0 ( SOC, SOC, Op_RegF, 0, VMRegImpl::Bad()); > > I think `Op_VecX` should be used. The _H, _J, _K versions shouldn't be needed. Thanks, Working on this. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1529618542 From sjayagond at openjdk.org Tue Mar 19 03:04:20 2024 From: sjayagond at openjdk.org (Sidraya Jayagond) Date: Tue, 19 Mar 2024 03:04:20 GMT Subject: RFR: 8327652: S390x: Implements SLP support [v3] In-Reply-To: References: Message-ID: On Mon, 18 Mar 2024 12:59:12 GMT, Martin Doerr wrote: >> Sidraya Jayagond has updated the pull request incrementally with one additional commit since the last revision: >> >> Add proper comments and cosmetic changes. > > src/hotspot/cpu/s390/globals_s390.hpp line 115: > >> 113: "Use Z15 Vector instructions for superword optimization.") \ >> 114: product(bool, UseSFPV, false, \ >> 115: "Use SFPV Vector instructions for superword optimization.") \ > > I'd probably make `UseSFPV` a diagnostic flag. Or do you think it needs to be an official flag which customers should set in production? > (I think making `SuperwordUseVX` a regular product switch is fine as it is the fundamental switch for turning vectorization on or off.) I'll fix this. Basically, `UseSFPV` is there to keep z13 from using vector float instructions, as they are not available on it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1529623099 From xxinliu at amazon.com Tue Mar 19 04:30:10 2024 From: xxinliu at amazon.com (Liu, Xin) Date: Tue, 19 Mar 2024 04:30:10 +0000 Subject: Request for Comment: Add a stalling mode to asynchronous UL In-Reply-To: References: Message-ID: Hi, Johan, I don't quite understand the new feature "stalling". On the other hand, I think your new design is very reasonable. Gil Tene dubbed the time during which Java applications do nothing "hiccups". From services that care about GC pauses at high percentiles such as P99.999, we found that one source of hiccups is unified logging at safepoints. There are many reasons why unified logging can block. It is possible that log writes are throttled by the write bandwidth of the disk.
I think the main motivation of -Xlog:async is non-blocking, not leveraging multi-core. When a log write is blocked, it should not block Java threads or the VMThread. In your stall mode, my understanding is that your log site is still blocking. You have to wait for AsyncLogWriter when it is trapped in "msg.output->write_blocking", right? If we would stall rather than drop messages, why do we still choose async-logging in the first place? On the other hand, the circular buffer looks promising. I compared your change (pull/17757) and its base (71b46c38a82) using your cmd "slow-debug/jdk/bin/java -Xlog:async". For stdout, I used "slowdebug/jdk/bin/java -Xlog:async -Xlog:all=trace |& tee stdout-$i.log". I ran 20 iterations and compared the differences using a t-test. I found that there are significant differences for both stdout/stderr and a regular file. For stdout, the mean number of dropped messages is reduced by 2.9%. For a file, the mean number of dropped messages is reduced by 86.8%. I see that your approach is much more effective than the ping-pong buffer when IO writes are fast. Writing to a file is faster than writing to stdout/stderr. Could you post a PR for broader review? Thanks, --lx From: Thomas Stüfe Date: Friday, February 16, 2024 at 6:56 AM To: Johan Sjolen Cc: "hotspot-dev at openjdk.org" , "Liu, Xin" Subject: RE: [EXTERNAL] Request for Comment: Add a stalling mode to asynchronous UL CC'ing Xin as original author. At first glance, I like this. I also like the ringbuffer approach better. Not sure if it had been discussed originally, but the original PR went on for a while IIRC. Do we even need the "drop" mode? I never liked the idea of losing output, and maybe stalls are that rare that they won't matter. Arguably, if you stall you should increase the log buffer size.
..Thomas On Fri, Feb 16, 2024 at 3:45 PM Johan Sjolen wrote: Hello Hotspot, Today, asynchronous UL depends on off-loading the work of logging to output devices by having log sites copy messages onto a buffer and an output thread swap it with an empty buffer and empty it. If the current buffer runs out of room, then any attempt at logging leads to the message being dropped, that is: The message is not printed at all. I'd like to suggest that we add a stalling mode to async UL wherein no message dropping occurs and all log sites stall until there is sufficient room in the buffer for their respective message. My purpose with this e-mail is twofold. First, I would like to know whether there is any interest in such a feature being added to Hotspot. Second, I'd like to discuss a change in the design of async UL which will permit this feature and other improvements. I imagine that the user-facing change for async UL would be to equip -Xlog:async with an option, as such: ``` $ java -Xlog:async:drop ... # Drop messages $ java -Xlog:async:stall ... # Stall log sites $ java -Xlog:async ... # Drop messages by default ``` The rest of this e-mail will describe the design changes and motivation behind these changes, so if you're mostly interested in answering whether or not you want there to be a stalling mode in async UL then you don't need to bother with the rest. There's a draft PR for these changes, it runs on my Linux computer, but I have implemented the necessary changes for Mac and Windows also. The PR is available here: https://github.com/openjdk/jdk/pull/17757 I have described the current async protocol which is in mainline in appendix 0. There are two issues with this protocol: 1. During times when there is a high amount of logging in the system there is also high contention on the mutex. This can lead the output thread to suffer from starvation, as it cannot take the mutex to swap the buffers.
This in turn leads to excessive dropping of messages and poor scaling of buffer increases. 2. Implementing stalling (as opposed to dropping messages) under this protocol leads to unnecessary waiting on the log sites. Stalling would mean that a log site has to wait for the entire log buffer to be emptied before logging. This may mean 1MiB of data having to be output before the swap can occur, and of course the swap requires the writing thread to acquire the mutex which has its own issues. Issue number 1 can most easily be observed by inducing a (perhaps artificially so) high amount of logging through setting logging to all=trace in a slow-debug build: ``` $ slow-debug/jdk/bin/java -Xlog:async -Xlog:all=trace:file=asynctest::filecount=0 $ grep 'dropped due to' asynctest [0.147s][warning][ ] 1338 messages dropped due to async logging [0.171s][warning][ ] 496 messages dropped due to async logging [0.511s][warning][ ] 98 messages dropped due to async logging [0.542s][warning][ ] 13 messages dropped due to async logging [0.762s][warning][ ] 5620 messages dropped due to async logging [1.286s][warning][ ] 12440 messages dropped due to async logging [1.316s][warning][ ] 2610 messages dropped due to async logging [1.352s][warning][ ] 2497 messages dropped due to async logging ``` I'd like to suggest that we re-implement async logging using a producer/consumer protocol backed by circular buffers instead. This means that the output thread no longer suffers from starvation and can continuously make progress. It also means that we don't have to split up the buffer memory in two; both log threads and output thread can share the whole buffer at the same time.
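A minimal sketch of such a producer/consumer ring with a drop-or-stall policy might look like the following. It is not the code from the draft PR: `RingLog`, `FullPolicy`, `push` and `pop` are hypothetical names, a trailing NUL byte stands in for the message envelope, and the stall path assumes a wait/notify on the producer lock.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <string>
#include <vector>

enum class FullPolicy { Drop, Stall };  // hypothetical names for the two modes

class RingLog {
 public:
  RingLog(std::size_t capacity, FullPolicy policy)
      : buf_(capacity), policy_(policy), head_(0), tail_(0), dropped_(0) {
    assert(capacity > 1);
  }

  // Log-site side. Returns false if the message was dropped.
  bool push(const std::string& msg) {
    const std::size_t need = msg.size() + 1;  // payload plus NUL terminator
    std::unique_lock<std::mutex> lock(producer_);
    if (need > buf_.size() - 1) { ++dropped_; return false; }  // can never fit
    while (free_bytes() < need) {
      if (policy_ == FullPolicy::Drop) { ++dropped_; return false; }
      space_.wait(lock);  // stall until the consumer frees memory
    }
    std::size_t t = tail_.load(std::memory_order_relaxed);
    for (char c : msg) { buf_[t] = c; t = (t + 1) % buf_.size(); }
    buf_[t] = '\0';
    tail_.store((t + 1) % buf_.size(), std::memory_order_release);  // publish
    return true;
  }

  // Output-thread side. Returns false if no message is available.
  bool pop(std::string& out) {
    std::lock_guard<std::mutex> lock(consumer_);
    std::size_t h = head_.load(std::memory_order_relaxed);
    if (h == tail_.load(std::memory_order_acquire)) return false;
    out.clear();
    for (; buf_[h] != '\0'; h = (h + 1) % buf_.size()) out.push_back(buf_[h]);
    head_.store((h + 1) % buf_.size(), std::memory_order_release);
    if (policy_ == FullPolicy::Stall) {
      // Briefly take the producer lock so a wake-up cannot slip in between
      // a producer's space check and its wait.
      { std::lock_guard<std::mutex> sync(producer_); }
      space_.notify_all();
    }
    return true;
  }

  std::size_t dropped() const { return dropped_.load(); }

 private:
  // One byte stays unused so that head == tail always means "empty".
  std::size_t free_bytes() const {
    const std::size_t h = head_.load(std::memory_order_acquire);
    const std::size_t t = tail_.load(std::memory_order_relaxed);
    return buf_.size() - 1 - (t + buf_.size() - h) % buf_.size();
  }

  std::vector<char> buf_;
  const FullPolicy policy_;
  std::atomic<std::size_t> head_;  // written by the consumer only
  std::atomic<std::size_t> tail_;  // written by producers only
  std::atomic<std::size_t> dropped_;
  std::mutex producer_;
  std::mutex consumer_;
  std::condition_variable space_;
};
```

In this sketch log sites only contend with each other on `producer_`, and the output thread touches that lock only in stall mode, so draining the buffer does not wait behind log sites in drop mode.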
My solution does seem to improve the dropping situation, as per the same test: ``` $ ul/jdk/bin/java -Xlog:async:drop -Xlog:all=trace:file=asyncdrop::filecount=0 $ grep 'dropped due to' asyncdrop [0.295s][warning][ ] 195 messages dropped due to async logging [0.949s][warning][ ] 3012 messages dropped due to async logging [1.315s][warning][ ] 7992 messages dropped due to async logging ``` That is a bit over 10 000 dropped messages, as opposed to over 20 000 in the old solution. I also did some throughput benchmarking, which shows that stalling has minimal impact on throughput, see appendix 2 for the throughput timings. The new protocol is a basic producer/consumer pattern with two mutexes and a circular buffer with two variables: head and tail. The consumer reads tail and sets head, the producer reads head and writes to tail. In this case the producers are the log sites and the consumer is the output thread. See appendix 1 for a pseudo-code description of the protocol. There are some details omitted, such as how stalling is implemented (hint: wait/notify on the producer lock), please read the code for details. TL;DR: 1. Add a 'stall' strategy for the async logging so that we can guarantee 0 dropped messages at the expense of delays at log sites, this is exposed by changing the command line option. Before: -Xlog:async, after: -Xlog:async[:[drop|stall]]. Default is still drop. 2. Ping-pong buffers are not a good idea when also supporting stalling, because a log site can be stalled for far longer than in the synchronous case. 3. Therefore switch out ping-pong buffers and single mutex with a circular buffer and separate producer and consumer mutexes. This also has the benefit of reducing dropped messages in my testing. Thank you for reading and I appreciate any input on this. Johan Sjölén Appendix 0, the old async protocol.
Nota bene: Any absence or presence of bugs may be the opposite in the actual source code.
```
Variables:
Mutex m;
Buffer* cb;
Buffer a;
Buffer b;
const char* message;

Methods:
void Buffer::write(const char* message) {
  if (has_space(strlen(message))) {
    internal_write(message);
  }
}

LogSite:
1. m.lock();
2. cb->write(message);
3. m.unlock();
4. m.notify();

OutputThread:
1. m.wait();
2. m.lock();
3. buffer := cb;
4. if cb == &a then cb = &b; else cb = &a;
5. m.unlock();
6. for each message in buffer: print to log site
7. go to 1.
```
Appendix 1, the new protocol. Nota bene: Any absence or presence of bugs may be the opposite in the actual source code.
```
Mutex consumer;
Mutex producer;
CircularBuffer cb;
size_t head;
size_t tail;
const char* message;
char* out;

LogSite:
1. producer.lock();
2. t = tail; h = head; // Read the values
3. if unused_memory(h, t) < strlen(message) then drop_or_stall();
4. size_t bytes_written = cb.write(message); // Writes string and envelope
5. tail = (t + bytes_written) % cb.size(); // Message ready to consume, move tail.
6. producer.unlock();

OutputThread:
1. consumer.lock();
2. t = tail; h = head;
3. if h == t return; // No message available
4. size_t bytes_read = cb.read_bytes(out, h) // Write the next message
5. head = (h + bytes_read) % cb.size() // More memory available for producer
6. consumer.unlock();
```
Appendix 2, the throughput tests.
```
Testing requires the tool pv, example of how to try it yourself: $ ./build/slow/jdk/bin/java -Xlog:async -Xlog:all=trace | (pv --rate --timer --bytes > /dev/null) # Slow debug build ## Default sync logging ### /dev/null 48.5MiB 0:00:02 [17.5MiB/s] ### To file 48.5MiB 0:00:03 [14.4MiB/s] ## Old async logging ### /dev/null 51.1MiB 0:00:02 [20.9MiB/s] ### To file 57.1MiB 0:00:02 [19.7MiB/s] Dropped messages: [0.202s][warning][ ] 350 messages dropped due to async logging [1.302s][warning][ ]
1981 messages dropped due to async logging [1.328s][warning][ ] 2193 messages dropped due to async logging [1.354s][warning][ ] 2464 messages dropped due to async logging [1.379s][warning][ ] 4158 messages dropped due to async logging [1.405s][warning][ ] 3017 messages dropped due to async logging [1.435s][warning][ ] 3818 messages dropped due to async logging [1.479s][warning][ ] 2286 messages dropped due to async logging ## New async logging (dropping mode) ### /dev/null 50.6MiB 0:00:02 [20.9MiB/s] ### To file 50.8MiB 0:00:02 [18.1MiB/s] Dropped messages: 0 on this run. Did another run as there is high variability in dropped messages: 50.2MiB 0:00:02 [18.3MiB/s] [1.276s][warning][ ] 4942 messages dropped due to async logging ## New async logging (stalling mode) ### /dev/null 51.3MiB 0:00:02 [19.9MiB/s] ### To file 50.9MiB 0:00:02 [18.8MiB/s] # Release build Linux-x64 (to a tmpfs mounted file) ## New async drop 20,1MiB 0:00:00 [43,3MiB/s] 0 dropped messages ## New async stall 33,7MiB 0:00:00 [41,5MiB/s] N/A dropped messages 0 dropped messages ## Old async 17,7MiB 0:00:00 [42,2MiB/s] [0.059s][warning][ ] 6860 messages dropped due to async logging [0.082s][warning][ ] 20601 messages dropped due to async logging [0,107s][warning][ ] 9147 messages dropped due to async logging [0,131s][warning][ ] 11113 messages dropped due to async logging [0,154s][warning][ ] 11845 messages dropped due to async logging [0,178s][warning][ ] 10383 messages dropped due to async logging [0,197s][warning][ ] 9376 messages dropped due to async logging [0,216s][warning][ ] 10997 messages dropped due to async logging [0,234s][warning][
] 5978 messages dropped due to async logging [0,252s][warning][ ] 10439 messages dropped due to async logging [0,270s][warning][ ] 6562 messages dropped due to async logging [0,325s][warning][ ] 391 messages dropped due to async logging [0,343s][warning][ ] 3532 messages dropped due to async logging [0,362s][warning][ ] 3268 messages dropped due to async logging [0,395s][warning][ ] 4182 messages dropped due to async logging [0,411s][warning][ ] 2490 messages dropped due to async logging ``` From dholmes at openjdk.org Tue Mar 19 04:40:22 2024 From: dholmes at openjdk.org (David Holmes) Date: Tue, 19 Mar 2024 04:40:22 GMT Subject: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3] In-Reply-To: <23RHDqAAB2cUnhC2j-H7g2X2KibSmk2IwKh4EisMfYo=.a6d727da-76c9-42d6-b2df-d2fd192bc229@github.com> References: <23RHDqAAB2cUnhC2j-H7g2X2KibSmk2IwKh4EisMfYo=.a6d727da-76c9-42d6-b2df-d2fd192bc229@github.com> Message-ID: On Mon, 18 Mar 2024 13:14:41 GMT, Richard Reingruber wrote: >> This pr changes `JfrJvmtiAgent::retransform_classes()` and `jfr_set_enabled` to switch to `WXWrite` before transitioning to the vm. >> >> Testing: >> make test TEST=jdk/jfr/event/runtime/TestClassLoadEvent.java TEST_VM_OPTS=-XX:+AssertWXAtThreadSync >> make test TEST=compiler/intrinsics/klass/CastNullCheckDroppingsTest.java TEST_VM_OPTS=-XX:+AssertWXAtThreadSync >> >> More tests are pending. > > Richard Reingruber has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains three additional commits since the last revision: > > - Merge branch 'master' into 8327990_jfr_enters_vm_without_wxwrite > - Set WXWrite at more locations > - Switch to WXWrite before entering the vm I still feel that this should be fixed inside the `ThreadInVMfromNative` transition - the number of callsites that need it just reinforces that for me. Granted we then need to look at where we would now have redundant calls. That said we have had a lot of people looking at the overall WX state management logic in the past week or so due to https://bugs.openjdk.org/browse/JDK-8327860. The workaround there requires us to be in EXEC mode and there are a number of folk who are questioning why "we" chose WRITE mode as the default with a switch to EXEC, instead of EXEC as the default with a switch to WRITE. But whichever way that goes I think the VM state transitions are the places to enforce that choice. ------------- PR Review: https://git.openjdk.org/jdk/pull/18238#pullrequestreview-1945158370 From dholmes at openjdk.org Tue Mar 19 04:49:20 2024 From: dholmes at openjdk.org (David Holmes) Date: Tue, 19 Mar 2024 04:49:20 GMT Subject: RFR: JDK-8180450: secondary_super_cache does not scale well [v4] In-Reply-To: References: Message-ID: On Sat, 16 Mar 2024 22:35:47 GMT, Andrew Haley wrote: >> This PR is a redesign of subtype checking. >> >> The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. >> >> So what's changed, so that the old design should be replaced? >> >> Firstly, the computers of today aren't the computers of twenty years ago. It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. 
Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. >> >> The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces). Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. >> >> Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch. >> >> However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions. >> >> The solution >> ------------ >> >> We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. >> >> We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bimap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. If the bit is set, we have to do a little arithmetic and then consult the hash table. >> >> It works like th... 
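
[Illustrative sketch, not part of the quoted PR text, which is truncated above.] The lookup that the quoted description outlines (a 64-slot hash table with linear probing, packed so that empty slots are removed, plus an occupancy bitmap that doubles as a cheap negative filter) might look roughly as follows. Every name here (`SecondarySupers`, `home_slot`, `build`, `contains`) and the multiply-shift hash are assumptions made for the example; none of them is taken from HotSpot, and integer ids stand in for Klass pointers.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative only: none of these names exist in HotSpot.
constexpr unsigned kSlots = 64;

struct SecondarySupers {
  uint64_t bitmap = 0;           // bit s set <=> uncompressed slot s is occupied
  std::vector<uint32_t> packed;  // occupied slots in slot order, empty slots removed
};

// Assumed hash: multiply-shift down to a 6-bit home slot (0..63).
inline unsigned home_slot(uint32_t id) {
  return static_cast<uint32_t>(id * 0x9E3779B1u) >> 26;
}

// Portable population count (number of set bits).
inline unsigned popcount64(uint64_t x) {
  unsigned n = 0;
  for (; x != 0; x &= x - 1) ++n;
  return n;
}

// Build the uncompressed 64-slot table with linear probing, then pack it.
// Assumes at most 64 distinct ids.
SecondarySupers build(const std::vector<uint32_t>& ids) {
  assert(ids.size() <= kSlots);
  std::array<uint32_t, kSlots> slots{};
  SecondarySupers t;
  for (uint32_t id : ids) {
    unsigned s = home_slot(id);
    while (t.bitmap & (uint64_t{1} << s)) s = (s + 1) % kSlots;  // linear probe
    slots[s] = id;
    t.bitmap |= uint64_t{1} << s;
  }
  for (unsigned s = 0; s < kSlots; ++s)
    if (t.bitmap & (uint64_t{1} << s)) t.packed.push_back(slots[s]);
  return t;
}

bool contains(const SecondarySupers& t, uint32_t id) {
  unsigned s = home_slot(id);
  for (unsigned probes = 0; probes < kSlots; ++probes) {
    const uint64_t bit = uint64_t{1} << s;
    if ((t.bitmap & bit) == 0) return false;  // clear bit: definitely absent
    // Index into the packed array = number of occupied slots below s.
    const unsigned index = popcount64(t.bitmap & (bit - 1));
    if (t.packed[index] == id) return true;
    s = (s + 1) % kSlots;                     // collision: try the next slot
  }
  return false;  // table completely full and id not found
}
```

A caller would run `build` once per class and `contains` on each check. The common negative case ends at the first bitmap test without touching the packed array, which is the "simple kind of Bloom Filter" behaviour the description mentions.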
>
> Andrew Haley has updated the pull request incrementally with one additional commit since the last revision:
>
>   JDK-8180450: secondary_super_cache does not scale well

Sorry I've misread/misunderstood the role of `UseSecondarySuperCache`. I thought it was to opt-in to this new code but it isn't.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18309#issuecomment-2005753721

From david.holmes at oracle.com Tue Mar 19 05:02:49 2024
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 19 Mar 2024 15:02:49 +1000
Subject: Request for Comment: Add a stalling mode to asynchronous UL
In-Reply-To:
References:
Message-ID:

On 19/03/2024 2:30 pm, Liu, Xin wrote:
> If we would stall rather than drop messages, why do we still choose
> async-logging in the first place?

Because you want the general non-blocking behaviour that you get when the producers don't outstrip the consumer. But when the producers do outstrip the consumer then you have to decide what is more important: the log message, or the non-blocking behaviour? That is a policy choice the programmer should be allowed to make. The current implementation always drops the message. The new implementation allows you to drop the non-blocking behaviour, i.e. stall.

Such policy choices are common whenever you inherently have a producer/consumer bounded queue. See for example:

https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/util/concurrent/RejectedExecutionHandler.html

used by:

https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/util/concurrent/ThreadPoolExecutor.html

where the policies include:
- aborting
- processing the item directly instead of enqueuing
- deleting the oldest item in the queue to make room
- dropping the new item

Cheers,
David

On 19/03/2024 2:30 pm, Liu, Xin wrote:
> Hi, Johan,
>
> I don't quite understand the new feature "stalling". On the
> other side, I think your new design is very reasonable.
>
> Gil Tene dubbed the time that Java applications do nothing
> "hiccups". From those services which are keen on GC pause in
> high-percentile such as P99.999, we found that one source of hiccups
> is unified logging at safepoints. There are many reasons unified logs
> are blocking. It is possible that log writes are throttled by the
> write bandwidth of disk. I think the main motivation of -Xlog:async is
> non-blocking, not leveraging multi-core. When log write is blocked, it
> should not block Java threads or VMThread. In your stall mode, my
> understanding is that your log site is still blocking. You have to wait for
> AsyncLogWriter when it is trapped in "msg.output->write_blocking",
> right? If we would stall rather than drop messages, why do we still choose
> async-logging in the first place?
>
> On the other side, the circular buffer looks promising. I compare your
> change(pull/17757) and its base(71b46c38a82) using your cmd
> "slow-debug/jdk/bin/java -Xlog:async".
>
> For stdout, I use "slowdebug/jdk/bin/java -Xlog:async -Xlog:all=trace
> |& tee stdout-$i.log".
>
> I ran 20 iterations and compare their differences using
> T-test. I found that there are significant differences for both stdout/stderr
> and regular file. For stdout, the mean of dropped message reduces
> 2.9%. For file, the mean of dropped message reduces 86.8%.
>
> I see that your approach is much more effective than ping-pong buffer
> when IO write is fast. Write a file is faster than stdout/stderr. Shall you post
> PR for broader reviews.
>
> Thanks,
> --lx
>
>
> From: Thomas Stüfe
> Date: Friday, February 16, 2024 at 6:56 AM
> To: Johan Sjolen
> Cc: "hotspot-dev at openjdk.org" , "Liu, Xin"
> Subject: RE: [EXTERNAL] Request for Comment: Add a stalling mode to asynchronous UL
>
> CC'ing Xin as original author.
>
> At first glance, I like this. I also like the ringbuffer approach better. Not sure if it had been discussed originally, but the original PR went on for a while IIRC.
>
> Do we even need the "drop" mode? I never liked the idea of losing output, and maybe stalls are that rare that they won't matter. Arguably, if you stall you should increase the log buffer size.
>
> ..Thomas
>
>
> On Fri, Feb 16, 2024 at 3:45 PM Johan Sjolen wrote:
> Hello Hotspot,
>
> Today, asynchronous UL depends on off-loading the work of logging to output devices by having log sites copying messages onto a buffer and an output thread swapping it with an empty buffer and emptying it.
> If the current buffer runs out of room, then any attempt at logging leads to the message being dropped, that is: The message is not printed at all.
> I'd like to suggest that we add a stalling mode to async UL wherein no message dropping occurs and all log sites stall until there is sufficient room in the buffer for their respective message.
>
> My purpose with this e-mail is twofold. First, I would like to know whether there is any interest in such a feature being added to Hotspot. Second, I'd like to discuss a change in the design of async UL which will permit this feature and other improvements.
> I imagine that the user facing change for async UL would be to equip -Xlog:async with an option, as such:
>
> ```
> $ java -Xlog:async:drop  ... # Drop messages
> $ java -Xlog:async:stall ... # Stall log sites
> $ java -Xlog:async       ... # Drop messages by default
> ```
>
> The rest of this e-mail will describe the design changes and motivation behind these changes, so if you're mostly interested in answering whether or not you want there to be a stalling mode in async UL then you don't need to bother with the rest.
>
> There's a draft PR for these changes, it runs on my Linux computer, but I have implemented the necessary changes for Mac and Windows also. The PR is available here: https://github.com/openjdk/jdk/pull/17757
>
> I have described the current async protocol which is in mainline in appendix 0. There are two issues with this protocol:
>
> 1. During times when there is a high amount of logging in the system there is also high contention on the mutex. This can lead the output thread to suffer from starvation, as it cannot take the mutex to swap the buffers.
>    This in turn leads to excessive dropping of messages and poor scaling of buffer increases.
> 2. Implementing stalling (as opposed to dropping messages) under this protocol leads to unnecessary waiting on the logsites.
>    Stalling would mean that a log site has to wait for the entire log buffer to be emptied before logging.
>    This may mean 1MiB of data having to be output before the swap can occur, and of course the swap requires the writing thread to acquire the mutex which has its own issues.
>
> Issue number 1 can most easily be observed by inducing a (perhaps artificially so) high amount of logging through setting logging to all=trace in a slow-debug build:
>
> ```
> $ slow-debug/jdk/bin/java -Xlog:async -Xlog:all=trace:file=asynctest::filecount=0
> $ grep 'dropped due to' asynctest
> [0.147s][warning][                       ]   1338 messages dropped due to async logging
> [0.171s][warning][                       ]    496 messages dropped due to async logging
> [0.511s][warning][                       ]     98 messages dropped due to async logging
> [0.542s][warning][                       ]     13 messages dropped due to async logging
> [0.762s][warning][                       ]   5620 messages dropped due to async logging
> [1.286s][warning][                       ]  12440 messages dropped due to async logging
> [1.316s][warning][                       ]   2610 messages dropped due to async logging
> [1.352s][warning][                       ]   2497 messages dropped due to async logging
> ```
>
> I'd like to suggest that we re-implement async logging using a producer/consumer protocol backed by circular buffers instead. This means that the output thread no longer suffers from starvation and can continuously make progress.
> It also means that we don't have to split up the buffer memory in two, both log threads and output thread can share the whole buffer at the same time.
>
> My solution does seem to improve the dropping situation, as per the same test:
>
> ```
> $ ul/jdk/bin/java -Xlog:async:drop -Xlog:all=trace:file=asyncdrop::filecount=0
> $ grep 'dropped due to' asyncdrop
> [0.295s][warning][                       ]    195 messages dropped due to async logging
> [0.949s][warning][                       ]   3012 messages dropped due to async logging
> [1.315s][warning][                       ]   7992 messages dropped due to async logging
> ```
> That is a bit over 10 000 dropped messages, as opposed to over 20 000 in the old solution. I also did some throughput benchmarking, which shows that stalling has minimal impact on throughput, see appendix 2 for the throughput timings.
>
>
> The new protocol is a basic producer/consumer pattern with two mutexes and a circular buffer with two variables: head and tail. The consumer reads tail and sets head, the producer reads head and writes to tail. In this case the producers are the log sites and the consumer is the output thread. See appendix 1 for a pseudo-code description of the protocol. There are some details omitted, such as how stalling is implemented (hint: wait/notify on the producer lock), please read the code for details.
>
> TL;DR:
>
> 1. Add a 'stall' strategy for the async logging so that we can guarantee 0 dropped messages at the expense of delays at log sites, this is exposed by changing the command line option. Before: -Xlog:async, after: -Xlog:async[:[drop|stall]]. Default is still drop.
> 2. Ping-pong buffers are not a good idea when also supporting stalling, because a log site can be stalled for far longer than in the synchronous case.
> 3. Therefore switch out ping-pong buffers and single mutex with a circular buffer and separate producer and consumer mutexes. This also has the benefit of reducing dropped messages in my testing.
>
> Thank you for reading and I appreciate any input on this.
> Johan Sjölén
>
>
> Appendix 0, the old async protocol.
>
> Nota bene: Any absence or presence of bugs may be the opposite in the actual source code.
>
> ```
> Variables:
> Mutex m;
> Buffer* cb;
> Buffer a;
> Buffer b;
> const char* message;
>
> Methods:
> void Buffer::write(const char* message) {
>   if (has_space(strlen(message))) {
>     internal_write(message);
>   }
> }
>
> LogSite:
> 1. m.lock();
> 2. cb->write(message);
> 3. m.unlock();
> 4. m.notify();
>
> OutputThread:
> 1. m.wait();
> 2. m.lock();
> 3. buffer := cb;
> 4. if cb == &a then cb = &b; else cb = &a;
> 5. m.unlock();
> 6. for each message in buffer: print to log site
> 7. go to 1.
> ```
>
> Appendix 1, the new protocol.
>
> Nota bene: Any absence or presence of bugs may be the opposite in the actual source code.
> ```
> Mutex consumer;
> Mutex producer;
> CircularBuffer cb;
> size_t head;
> size_t tail;
> const char* message;
> char* out;
>
> LogSite:
> 1. producer.lock();
> 2. t = tail; h = head; // Read the values
> 2. if unused_memory(h, t) < strlen(message) then drop_or_stall();
> 3. size_t bytes_written = cb.write(message); // Writes string and envelope
> 4. tail = (t + bytes_written) % cb.size(); // Message ready to consume, move tail.
> 5. producer.unlock();
>
> OutputThread:
> 1. consumer.lock();
> 2. t = tail; h = head;
> 3. if h == t return; // No message available
> 4. size_t bytes_read = cb.read_bytes(out, h) // Write the next message
> 5. head = (h + bytes_read) % cb.size() // More memory available for producer
> 6. consumer.unlock();
> ```
>
> Appendix 2, the throughput tests.
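
[Illustrative sketch, not part of the quoted mail.] Before the throughput numbers, the Appendix 1 pseudo-code quoted above can be turned into a small compilable example. This is only a sketch under stated assumptions, not the code in the draft PR: `LogRing`, `try_enqueue` and `try_dequeue` are hypothetical names, the envelope is assumed to be a plain `size_t` length prefix, and the drop-or-stall decision is left to the caller instead of being implemented with wait/notify on the producer lock as the mail hints.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical ring buffer following the head/tail protocol of Appendix 1.
class LogRing {
  std::vector<char> buf_;
  std::atomic<size_t> head_{0};  // next byte to read; written only by the consumer
  std::atomic<size_t> tail_{0};  // next byte to write; written only by producers
  std::mutex producer_;          // serialises log sites
  std::mutex consumer_;          // serialises readers (one output thread in the proposal)

  // One byte is always kept free so that head == tail means "empty".
  size_t unused(size_t h, size_t t) const {
    return (h + buf_.size() - t - 1) % buf_.size();
  }
  void put(size_t pos, const char* src, size_t n) {
    for (size_t i = 0; i < n; ++i) buf_[(pos + i) % buf_.size()] = src[i];
  }
  void get(size_t pos, char* dst, size_t n) const {
    for (size_t i = 0; i < n; ++i) dst[i] = buf_[(pos + i) % buf_.size()];
  }

 public:
  // Assumes capacity >= 1.
  explicit LogRing(size_t capacity) : buf_(capacity) {}

  // Log site. Returns false when there is no room; the caller then either
  // drops the message or stalls (for example by retrying later).
  bool try_enqueue(const std::string& msg) {
    std::lock_guard<std::mutex> guard(producer_);
    const size_t t = tail_.load(std::memory_order_relaxed);
    const size_t h = head_.load(std::memory_order_acquire);
    const size_t len = msg.size();
    if (unused(h, t) < sizeof(len) + len) return false;
    put(t, reinterpret_cast<const char*>(&len), sizeof(len));  // envelope
    put(t + sizeof(len), msg.data(), len);                     // payload
    // Publish: the message becomes visible to the consumer only now.
    tail_.store((t + sizeof(len) + len) % buf_.size(), std::memory_order_release);
    return true;
  }

  // Output thread. Returns false when no message is available.
  bool try_dequeue(std::string& out) {
    std::lock_guard<std::mutex> guard(consumer_);
    const size_t h = head_.load(std::memory_order_relaxed);
    const size_t t = tail_.load(std::memory_order_acquire);
    if (h == t) return false;
    size_t len = 0;
    get(h, reinterpret_cast<char*>(&len), sizeof(len));
    out.resize(len);
    get(h + sizeof(len), &out[0], len);
    // Release the space back to the producers.
    head_.store((h + sizeof(len) + len) % buf_.size(), std::memory_order_release);
    return true;
  }
};
```

Because producers and the consumer take different mutexes, the output thread never competes with log sites for a lock. It only observes `tail` and advances `head`, which is the property the mail relies on to avoid starving the output thread. A "drop" policy would count every `false` returned by `try_enqueue`; a "stall" policy would block the log site until the consumer has advanced `head` far enough.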
>
> ```
> Testing requires the tool pv, example of how to try it yourself:
> $ ./build/slow/jdk/bin/java -Xlog:async -Xlog:all=trace | (pv --rate --timer --bytes > /dev/null)
>
> # Slow debug build
> ## Default sync logging
> ### /dev/null
> 48.5MiB 0:00:02 [17.5MiB/s]
> ### To file
> 48.5MiB 0:00:03 [14.4MiB/s]
>
> ## Old async logging
> ### /dev/null
> 51.1MiB 0:00:02 [20.9MiB/s]
> ### To file
> 57.1MiB 0:00:02 [19.7MiB/s]
> Dropped messages:
> [0.202s][warning][                       ]    350 messages dropped due to async logging
> [1.302s][warning][                       ]   1981 messages dropped due to async logging
> [1.328s][warning][                       ]   2193 messages dropped due to async logging
> [1.354s][warning][                       ]   2464 messages dropped due to async logging
> [1.379s][warning][                       ]   4158 messages dropped due to async logging
> [1.405s][warning][                       ]   3017 messages dropped due to async logging
> [1.435s][warning][                       ]   3818 messages dropped due to async logging
> [1.479s][warning][                       ]   2286 messages dropped due to async logging
>
> ## New async logging (dropping mode)
> ### /dev/null
> 50.6MiB 0:00:02 [20.9MiB/s]
> ### To file
> 50.8MiB 0:00:02 [18.1MiB/s]
> Dropped messages: 0 on this run.
> Did another run as there is high variability in dropped messages:
> 50.2MiB 0:00:02 [18.3MiB/s]
> [1.276s][warning][                       ]   4942 messages dropped due to async logging
>
> ## New async logging (stalling mode)
> ### /dev/null
> 51.3MiB 0:00:02 [19.9MiB/s]
> ### To file
> 50.9MiB 0:00:02 [18.8MiB/s]
>
> # Release build Linux-x64 (to a tmpfs mounted file)
> ## New async drop
> 20,1MiB 0:00:00 [43,3MiB/s]
> 0 dropped messages
> ## New async stall
> 33,7MiB 0:00:00 [41,5MiB/s]
> N/A dropped messages
> 0 dropped messages
> ## Old async
> 17,7MiB 0:00:00 [42,2MiB/s]
> [0.059s][warning][                       ]   6860 messages dropped due to async logging
> [0.082s][warning][                       ]  20601 messages dropped due to async logging
> [0,107s][warning][                       ]   9147 messages dropped due to async logging
> [0,131s][warning][                       ]  11113 messages dropped due to async logging
> [0,154s][warning][                       ]  11845 messages dropped due to async logging
> [0,178s][warning][                       ]  10383 messages dropped due to async logging
> [0,197s][warning][                       ]   9376 messages dropped due to async logging
> [0,216s][warning][                       ]  10997 messages dropped due to async logging
> [0,234s][warning][                       ]   5978 messages dropped due to async logging
> [0,252s][warning][                       ]  10439 messages dropped due to async logging
> [0,270s][warning][                       ]   6562 messages dropped due to async logging
> [0,325s][warning][                       ]    391 messages dropped due to async logging
> [0,343s][warning][                       ]   3532 messages dropped due to async logging
> [0,362s][warning][                       ]   3268 messages dropped due to async logging
> [0,395s][warning][                       ]   4182 messages dropped due to async logging
> [0,411s][warning][                       ]   2490 messages dropped due to async logging
> ```
>

From aturbanov at openjdk.org Tue Mar 19 09:14:22 2024
From: aturbanov at openjdk.org (Andrey Turbanov)
Date: Tue, 19 Mar 2024 09:14:22 GMT
Subject: RFR: JDK-8180450: secondary_super_cache does not scale well [v4]
In-Reply-To:
References:
Message-ID:

On Sat, 16 Mar 2024 22:35:47 GMT, Andrew Haley wrote:

>> This PR is a redesign of subtype checking.
>>
>> The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works.
>> >> So what's changed, so that the old design should be replaced? >> >> Firstly, the computers of today aren't the computers of twenty years ago. It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. >> >> The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces). Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. >> >> Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch. >> >> However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions. >> >> The solution >> ------------ >> >> We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. >> >> We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bimap is used as a simple kind of Bloom Filter. 
To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. If the bit is set, we have to do a little arithmetic and then consult the hash table. >> >> It works like th... > > Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: > > JDK-8180450: secondary_super_cache does not scale well test/micro/org/openjdk/bench/vm/lang/IntfSubtype.java line 30: > 28: @State(Scope.Thread) > 29: public class IntfSubtype { > 30: interface J {} Suggestion: interface J {} test/micro/org/openjdk/bench/vm/lang/TypePollution.java line 97: > 95: for (int i = 0; i < objectArray.length; i++) { > 96: Set> aSet = new HashSet>(); > 97: for (int j = 0; j < 6; j++) { Suggestion: for (int j = 0; j < 6; j++) { ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18309#discussion_r1529986634 PR Review Comment: https://git.openjdk.org/jdk/pull/18309#discussion_r1529985760 From aph at openjdk.org Tue Mar 19 09:33:23 2024 From: aph at openjdk.org (Andrew Haley) Date: Tue, 19 Mar 2024 09:33:23 GMT Subject: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3] In-Reply-To: References: <23RHDqAAB2cUnhC2j-H7g2X2KibSmk2IwKh4EisMfYo=.a6d727da-76c9-42d6-b2df-d2fd192bc229@github.com> Message-ID: On Tue, 19 Mar 2024 04:37:41 GMT, David Holmes wrote: > That said we have had a lot of people looking at the overall WX state management logic in the past week or so due to https://bugs.openjdk.org/browse/JDK-8327860. The workaround there requires us to be in EXEC mode That's very odd. The example there doesn't even involve MAP_JIT memory, so what does it have to do with WX? > and there are a number of folk who are questioning why "we" chose WRITE mode as the default with a switch to EXEC, instead of EXEC as the default with a switch to WRITE. 
> But whichever way that goes I think the VM state transitions are the places to enforce that choice. Hmm. Changing WX at VM state transitions is a form of temporal coupling, a classic design smell that has caused problems for decades. It's causing problems for us now. Instead, could we tag code that needs one or the other, keep track of the current WX state in thread-local memory, and flip WX only when we know we need to? That'd (by definition) reduce the number of transitions to the minimum if we were through. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18238#issuecomment-2006498201 From aph at openjdk.org Tue Mar 19 09:35:23 2024 From: aph at openjdk.org (Andrew Haley) Date: Tue, 19 Mar 2024 09:35:23 GMT Subject: RFR: JDK-8180450: secondary_super_cache does not scale well [v4] In-Reply-To: References: Message-ID: On Mon, 18 Mar 2024 22:33:16 GMT, Doug Simon wrote: > Hi Andrew, we'd most likely want to port this to Graal. Just for my understanding, the current implementation of [secondary super cache lookup in Graal](https://github.com/oracle/graal/blob/0a471fc329ea2c8c18c05f9d83347dc1199c3ad9/compiler/src/jdk.graal.compiler/src/jdk/graal/compiler/hotspot/replacements/InstanceOfSnippets.java#L169) will continue to work until we get around to doing the port right? Absolutely, yes. This is purely an add-on for now. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18309#issuecomment-2006505888 From aph at openjdk.org Tue Mar 19 09:42:22 2024 From: aph at openjdk.org (Andrew Haley) Date: Tue, 19 Mar 2024 09:42:22 GMT Subject: RFR: JDK-8180450: secondary_super_cache does not scale well [v4] In-Reply-To: References: Message-ID: On Tue, 19 Mar 2024 02:42:14 GMT, Vladimir Ivanov wrote: > Thanks a lot for taking care of secondary supers redesign, Andrew! My pleasure. > Overall, I really like the balance you strike with the proposed approach. 
As I can see on the tests, it behaves very well for positive cases across the whole range of sizes (longest probe distance maxes at 4 on `org.openjdk.bench.vm.lang.IntfSubtype` microbenchmark). For negative cases it starts to quickly degrade after ~45 elements (I measured distance to the nearest zero in the bitmask) which looks like a good compromise for simplicity of hash function and single pass packing of the table. Yes, exactly. it's not at all difficult to fix the negative case so that it performs much better with higher numbers of secondaries, but it seemed pointless. > IMO it's important to cover reflection (non-constant) case and stubs. We might find that hashed lookup in the non-constant case, with a small number of secondaries, doesn't look much better than a short linear search. But OK, I'll try in a later patch. > Also, it would be nice to align the implementation across all execution modes. But it can be covered as follow-up enhancements. Yes, especially because the great virtue of this patch is that it's stand alone, and it can very easily be back ported. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18309#issuecomment-2006531309 From aph at openjdk.org Tue Mar 19 10:04:21 2024 From: aph at openjdk.org (Andrew Haley) Date: Tue, 19 Mar 2024 10:04:21 GMT Subject: RFR: JDK-8180450: secondary_super_cache does not scale well [v4] In-Reply-To: References: Message-ID: On Tue, 19 Mar 2024 09:33:03 GMT, Andrew Haley wrote: > > Hi Andrew, we'd most likely want to port this to Graal. Just for my understanding, the current implementation of [secondary super cache lookup in Graal](https://github.com/oracle/graal/blob/0a471fc329ea2c8c18c05f9d83347dc1199c3ad9/compiler/src/jdk.graal.compiler/src/jdk/graal/compiler/hotspot/replacements/InstanceOfSnippets.java#L169) will continue to work until we get around to doing the port right? > > Absolutely, yes. This is purely an add-on for now. 
To clarify: I can't see anything in there that is sensitive to the order in which the secondary supers appear, and unless code looks at the hash bitmap, the only change it sees is the order of the secondaries. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18309#issuecomment-2006618543 From tschatzl at openjdk.org Tue Mar 19 10:23:46 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 19 Mar 2024 10:23:46 GMT Subject: RFR: 8289822: G1: Make concurrent mark code owner of TAMSes [v6] In-Reply-To: References: Message-ID: > Hi all, > > please review this change that moves TAMS ownership and handling out of `HeapRegion` into `G1ConcurrentMark`. > > The rationale is that it's intrinsically a data structure only used by the marking, and only interesting during marking, so attaching it to the always present `HeapRegion` isn't fitting. > > This also moves some code that is about marking decisions out of `HeapRegion`. > > In this change the TAMSes are still maintained and kept up to date as long as the `HeapRegion` exists (basically always) as the snapshotting of the marking does not really snapshot ("copy over the `HeapRegion`s that existed at the time of the snapshot, only ever touching them), but assumes that they exist even for uncommitted (or free) regions due to the way it advances the global `finger` pointer. > I did not want to change this here. > > This is also why `HeapRegion` still tells `G1ConcurrentMark` to update the TAMS in two places (when clearing a region and when resetting during full gc). > > Another option I considered when implementing this is clearing all marking statistics (`G1ConcurrentMark::clear_statistics`) at these places in `HeapRegion` because it decreases the amount of places where this needs to be done (maybe one or the other place isn't really necessary already). I rejected it because this would add some work that is dependent on the number of worker threads (particularly) into `HeapRegion::hr_clear()`. 
This change can be looked at by viewing at all but the last commit. > > Testing: tier1-4, gha > > Thanks, > Thomas Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: iwalulya review - restore comment position ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18150/files - new: https://git.openjdk.org/jdk/pull/18150/files/fae12ca0..22d0407c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18150&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18150&range=04-05 Stats: 2 lines in 1 file changed: 1 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18150.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18150/head:pull/18150 PR: https://git.openjdk.org/jdk/pull/18150 From tschatzl at openjdk.org Tue Mar 19 10:29:21 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 19 Mar 2024 10:29:21 GMT Subject: RFR: 8328350: G1: Remove DO_DISCOVERED_AND_DISCOVERY In-Reply-To: References: Message-ID: On Mon, 18 Mar 2024 11:45:22 GMT, Albert Mingkun Yang wrote: > Remove a special ref-processing iteration mode for G1, because non-null `_discovered` field and requiring ref-discovery should never occur at the same time. > > Test: tier1-6 Lgtm. ------------- Marked as reviewed by tschatzl (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18346#pullrequestreview-1945769088 From tschatzl at openjdk.org Tue Mar 19 10:34:27 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 19 Mar 2024 10:34:27 GMT Subject: RFR: 8289822: G1: Make concurrent mark code owner of TAMSes [v5] In-Reply-To: References: Message-ID: On Mon, 18 Mar 2024 08:59:03 GMT, Albert Mingkun Yang wrote: >> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: >> >> ayang review2 > > Marked as reviewed by ayang (Reviewer). 
Thanks @albertnetymk @walulyai for your reviews ------------- PR Comment: https://git.openjdk.org/jdk/pull/18150#issuecomment-2006743327 From tschatzl at openjdk.org Tue Mar 19 10:34:28 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 19 Mar 2024 10:34:28 GMT Subject: RFR: 8289822: G1: Make concurrent mark code owner of TAMSes [v5] In-Reply-To: References: Message-ID: On Mon, 18 Mar 2024 15:27:29 GMT, Ivan Walulya wrote: >> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: >> >> ayang review2 > > src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 1508: > >> 1506: // * Reference discovery will not need a barrier. >> 1507: >> 1508: // Concurrent Mark ref processor > > The comment now seems misplaced Moved back the comment to the original position. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18150#discussion_r1530107231 From tschatzl at openjdk.org Tue Mar 19 10:34:28 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 19 Mar 2024 10:34:28 GMT Subject: Integrated: 8289822: G1: Make concurrent mark code owner of TAMSes In-Reply-To: References: Message-ID: On Thu, 7 Mar 2024 10:31:27 GMT, Thomas Schatzl wrote: > Hi all, > > please review this change that moves TAMS ownership and handling out of `HeapRegion` into `G1ConcurrentMark`. > > The rationale is that it's intrinsically a data structure only used by the marking, and only interesting during marking, so attaching it to the always present `HeapRegion` isn't fitting. > > This also moves some code that is about marking decisions out of `HeapRegion`. 
> > In this change the TAMSes are still maintained and kept up to date as long as the `HeapRegion` exists (basically always) as the snapshotting of the marking does not really snapshot ("copy over the `HeapRegion`s that existed at the time of the snapshot, only ever touching them), but assumes that they exist even for uncommitted (or free) regions due to the way it advances the global `finger` pointer. > I did not want to change this here. > > This is also why `HeapRegion` still tells `G1ConcurrentMark` to update the TAMS in two places (when clearing a region and when resetting during full gc). > > Another option I considered when implementing this is clearing all marking statistics (`G1ConcurrentMark::clear_statistics`) at these places in `HeapRegion` because it decreases the amount of places where this needs to be done (maybe one or the other place isn't really necessary already). I rejected it because this would add some work that is dependent on the number of worker threads (particularly) into `HeapRegion::hr_clear()`. This change can be looked at by viewing at all but the last commit. > > Testing: tier1-4, gha > > Thanks, > Thomas This pull request has now been integrated. Changeset: f1c69cca Author: Thomas Schatzl URL: https://git.openjdk.org/jdk/commit/f1c69ccadb83306d1bb4860ff460a253af99643c Stats: 201 lines in 14 files changed: 84 ins; 68 del; 49 mod 8289822: G1: Make concurrent mark code owner of TAMSes Reviewed-by: ayang, iwalulya ------------- PR: https://git.openjdk.org/jdk/pull/18150 From johan.sjolen at oracle.com Tue Mar 19 12:05:00 2024 From: johan.sjolen at oracle.com (=?UTF-8?B?Sm9oYW4gU2rDtmzDqW4=?=) Date: Tue, 19 Mar 2024 13:05:00 +0100 Subject: Request for Comment: Add a stalling mode to asynchronous UL In-Reply-To: References: Message-ID: Hi Xin, Thank you for taking the time to perform your own tests on my code, I'm happy that the results were so encouraging. 
David explained the point of stalling mode very well: we will only stall when the buffer is full; otherwise we are effectively non-blocking. I will be preparing a PR with ticket number 8323807, however I first need to take my branch from prototype to "production" quality. All the best, Johan ------------------------------------------------------------------------ *From:* Liu, Xin [mailto:xxinliu at amazon.com] *Sent:* Tuesday, March 19, 2024 at 05:30 *To:* Thomas Stüfe; Johan Sjolen *Cc:* hotspot-dev at openjdk.org *Subject:* Request for Comment: Add a stalling mode to asynchronous UL > Hi, Johan, > > I don't quite understand the new feature "stalling". On the > other hand, I think your new design is very reasonable. > > Gil Tene dubbed the time that Java applications do nothing > "hiccups". From those services which are keen on GC pause in > high-percentile such as P99.999, we found that one source of hiccups > is unified logging at safepoints. There are many reasons unified logs > are blocking. It is possible that log writes are throttled by the > write bandwidth of disk. I think the main motivation of -Xlog:async is > non-blocking, not leveraging multi-core. When log write is blocked, it > should not block Java threads or VMThread. In your stall mode, my > understanding is that your log site is still blocking. You have to wait for > AsyncLogWriter when it is trapped in "msg.output->write_blocking", > right? If we would stall rather than drop messages, why do we still choose > async-logging in the first place? > > On the other hand, the circular buffer looks promising. I compared your > change(pull/17757) and its base(71b46c38a82) using your cmd > "slow-debug/jdk/bin/java -Xlog:async". > > For stdout, I use "slowdebug/jdk/bin/java -Xlog:async -Xlog:all=trace > |& tee stdout-$i.log". > > I ran 20 iterations and compared their differences using > T-test. I found that there are significant differences for both stdout/stderr > and regular file.
For stdout, the mean number of dropped messages falls by > 2.9%. For file, the mean number of dropped messages falls by 86.8%. > > I see that your approach is much more effective than the ping-pong buffer > when IO write is fast. Writing to a file is faster than stdout/stderr. Could you post > a PR for broader review? > > Thanks, > --lx > > > From: Thomas Stüfe > Date: Friday, February 16, 2024 at 6:56 AM > To: Johan Sjolen > Cc: "hotspot-dev at openjdk.org" , "Liu, Xin" > Subject: RE: [EXTERNAL] Request for Comment: Add a stalling mode to asynchronous UL > > CC'ing Xin as original author. > > At first glance, I like this. I also like the ringbuffer approach better. Not sure if it had been discussed originally, but the original PR went on for a while IIRC. > > Do we even need the "drop" mode? I never liked the idea of losing output, and maybe stalls are so rare that they won't matter. Arguably, if you stall you should increase the log buffer size. > > ..Thomas > > > On Fri, Feb 16, 2024 at 3:45 PM Johan Sjolen wrote: > Hello Hotspot, > > Today, asynchronous UL depends on off-loading the work of logging to output devices by having log sites copying messages onto a buffer and an output thread swapping it with an empty buffer and emptying it. > If the current buffer runs out of room, then any attempt at logging leads to the message being dropped, that is: The message is not printed at all. > I'd like to suggest that we add a stalling mode to async UL wherein no message dropping occurs and all log sites stall until there is sufficient room in the buffer for their respective message. > > My purpose with this e-mail is twofold. First, I would like to know whether there is any interest in such a feature being added to Hotspot.
Second, I'd like to discuss a change in the design of async UL which will permit this feature and other improvements. > I imagine that the user-facing change for async UL would be to equip -Xlog:async with an option, as such: > > ``` > $ java -Xlog:async:drop  ... # Drop messages > $ java -Xlog:async:stall ... # Stall log sites > $ java -Xlog:async       ... # Drop messages by default > ``` > > The rest of this e-mail will describe the design changes and motivation behind these changes, so if you're mostly interested in answering whether or not you want there to be a stalling mode in async UL then you don't need to bother with the rest. > > There's a draft PR for these changes, it runs on my Linux computer, but I have implemented the necessary changes for Mac and Windows also. The PR is available here: https://github.com/openjdk/jdk/pull/17757 > > I have described the current async protocol which is in mainline in appendix 0. There are two issues with this protocol: > > 1. During times when there is a high amount of logging in the system there is also high contention on the mutex. This can lead the output thread to suffer from starvation, as it cannot take the mutex to swap the buffers. >    This in turn leads to excessive dropping of messages and poor scaling of buffer increases. > 2. Implementing stalling (as opposed to dropping messages) under this protocol leads to unnecessary waiting on the log sites. >    Stalling would mean that a log site has to wait for the entire log buffer to be emptied before logging. >    This may mean 1MiB of data having to be output before the swap can occur, and of course the swap requires the writing thread to acquire the mutex which has its own issues.
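As an aside for readers of this thread, the difference between dropping and stalling at a log site can be sketched as below. This is only a minimal illustration under stated assumptions, not the code in the draft PR: `BoundedLogBuffer`, `OverflowPolicy`, `enqueue` and `try_dequeue` are hypothetical names, and a single mutex guarding a queue of strings stands in for whatever buffer layout HotSpot actually uses.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical names throughout; nothing here is HotSpot API.
enum class OverflowPolicy { Drop, Stall };

class BoundedLogBuffer {
 public:
  BoundedLogBuffer(std::size_t capacity_bytes, OverflowPolicy policy)
      : _capacity(capacity_bytes), _policy(policy) {}

  // Called by a log site. Returns false if the message was dropped.
  bool enqueue(const std::string& msg) {
    std::unique_lock<std::mutex> lock(_mutex);
    if (msg.size() > _capacity) {
      // Can never fit; drop even in Stall mode to avoid waiting forever.
      ++_dropped;
      return false;
    }
    if (_policy == OverflowPolicy::Drop) {
      if (_used + msg.size() > _capacity) {
        ++_dropped;  // Buffer full: the message is lost.
        return false;
      }
    } else {
      // Buffer full: block this log site until the consumer frees room.
      _has_room.wait(lock, [&] { return _used + msg.size() <= _capacity; });
    }
    _used += msg.size();
    _messages.push_back(msg);
    return true;
  }

  // Called by the output thread. Returns false if nothing is queued.
  bool try_dequeue(std::string& out) {
    std::lock_guard<std::mutex> lock(_mutex);
    if (_messages.empty()) return false;
    out = std::move(_messages.front());
    _messages.pop_front();
    _used -= out.size();
    _has_room.notify_all();  // Wake any stalled log sites.
    return true;
  }

  std::size_t dropped() const {
    std::lock_guard<std::mutex> lock(_mutex);
    return _dropped;
  }

 private:
  const std::size_t _capacity;
  const OverflowPolicy _policy;
  mutable std::mutex _mutex;
  std::condition_variable _has_room;
  std::deque<std::string> _messages;
  std::size_t _used = 0;
  std::size_t _dropped = 0;
};
```

In this sketch a stalled log site is released as soon as one message has been consumed and enough room exists, which is the property the proposal wants and which a swap of whole buffers cannot offer.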
> > Issue number 1 can most easily be observed by inducing a (perhaps artificially so) high amount of logging through setting logging to all=trace in a slow-debug build: > > ``` > $ slow-debug/jdk/bin/java -Xlog:async -Xlog:all=trace:file=asynctest::filecount=0 > $ grep 'dropped due to' asynctest > [0.147s][warning][ ] 1338 messages dropped due to async logging > [0.171s][warning][ ] 496 messages dropped due to async logging > [0.511s][warning][ ] 98 messages dropped due to async logging > [0.542s][warning][ ] 13 messages dropped due to async logging > [0.762s][warning][ ] 5620 messages dropped due to async logging > [1.286s][warning][ ] 12440 messages dropped due to async logging > [1.316s][warning][ ] 2610 messages dropped due to async logging > [1.352s][warning][ ] 2497 messages dropped due to async logging > ``` > > I'd like to suggest that we re-implement async logging using a producer/consumer protocol backed by circular buffers instead. This means that the output thread no longer suffers from starvation and can continuously make progress. > It also means that we don't have to split up the buffer memory in two: both the log threads and the output thread can share the whole buffer at the same time. > > My solution does seem to improve the dropping situation, as per the same test: > > ``` > $ ul/jdk/bin/java -Xlog:async:drop -Xlog:all=trace:file=asyncdrop::filecount=0 > $ grep 'dropped due to' asyncdrop > [0.295s][warning][ ] 195 messages dropped due to async logging > [0.949s][warning][ ] 3012 messages dropped due to async logging > [1.315s][warning][ ] 7992 messages dropped due to async logging > ``` > That is a bit over 10 000 dropped messages, as opposed to over 20 000 in the old solution.
I also did some throughput benchmarking, which shows that stalling has minimal impact on throughput, see appendix 2 for the throughput timings. > > > The new protocol is a basic producer/consumer pattern with two mutexes and a circular buffer with two variables: head and tail. The consumer reads tail and sets head, the producer reads head and writes to tail. In this case the producers are the log sites and the consumer is the output thread. See appendix 1 for a pseudo-code description of the protocol. There are some details omitted, such as how stalling is implemented (hint: wait/notify on the producer lock), please read the code for details. > > TL;DR: > > 1. Add a 'stall' strategy for the async logging so that we can guarantee 0 dropped messages at the expense of delays at log sites, this is exposed by changing the command line option. Before: -Xlog:async, after: -Xlog:async[:[drop|stall]]. Default is still drop. > 2. Ping-pong buffers are not a good idea when also supporting stalling, because a log site can be stalled for far longer than in the synchronous case. > 3. Therefore switch out ping-pong buffers and single mutex with a circular buffer and separate producer and consumer mutexes. This also has the benefit of reducing dropped messages in my testing. > > Thank you for reading and I appreciate any input on this. > Johan Sjölén > > > Appendix 0, the old async protocol. > > Nota bene: Any absence or presence of bugs may be the opposite in the actual source code. > > ``` > Variables: > Mutex m; > Buffer* cb; > Buffer a; > Buffer b; > const char* message; > > Methods: > void Buffer::write(const char* message) { >   if (has_space(strlen(message))) { >     internal_write(message); >   } > } > > LogSite: > 1. m.lock(); > 2. cb->write(message); > 3. m.unlock(); > 4. m.notify(); > > OutputThread: > 1. m.wait(); > 2. m.lock(); > 3. buffer := cb; > 4. if cb == &a then cb = &b; else cb = &a; > 5. m.unlock(); > 6. for each message in buffer: print to log site > 7.
go to 1. > ``` > > Appendix 1, the new protocol. > > Nota bene: Any absence or presence of bugs may be the opposite in the actual source code. > ``` > Mutex consumer; > Mutex producer; > CircularBuffer cb; > size_t head; > size_t tail; > const char* message; > char* out; > > LogSite: > 1. producer.lock(); > 2. t = tail; h = head; // Read the values > 3. if unused_memory(h, t) < strlen(message) then drop_or_stall(); > 4. size_t bytes_written = cb.write(message); // Writes string and envelope > 5. tail = (t + bytes_written) % cb.size(); // Message ready to consume, move tail. > 6. producer.unlock(); > > OutputThread: > 1. consumer.lock(); > 2. t = tail; h = head; > 3. if h == t return; // No message available > 4. size_t bytes_read = cb.read_bytes(out, h) // Read the next message into out > 5. head = (h + bytes_read) % cb.size() // More memory available for producer > 6. consumer.unlock(); > ``` > > Appendix 2, the throughput tests. > > ``` > Testing requires the tool pv, example of how to try it yourself: > $ ./build/slow/jdk/bin/java -Xlog:async -Xlog:all=trace | (pv --rate --timer --bytes > /dev/null) > > # Slow debug build > ## Default sync logging > ### /dev/null > 48.5MiB 0:00:02 [17.5MiB/s] > ### To file > 48.5MiB 0:00:03 [14.4MiB/s] > > ## Old async logging > ### /dev/null > 51.1MiB 0:00:02 [20.9MiB/s] > ### To file > 57.1MiB 0:00:02 [19.7MiB/s] > Dropped messages: > [0.202s][warning][ ] 350 messages dropped due to async logging > [1.302s][warning][ ] 1981 messages dropped due to async logging > [1.328s][warning][ ] 2193 messages dropped due to async logging > [1.354s][warning][ ] 2464 messages dropped due to async logging > [1.379s][warning][ ] 4158 messages dropped due to async logging > [1.405s][warning][ ] 3017 messages dropped due to async logging > [1.435s][warning][ ]
3818 messages dropped due to async logging > [1.479s][warning][ ] 2286 messages dropped due to async logging > > ## New async logging (dropping mode) > ### /dev/null > 50.6MiB 0:00:02 [20.9MiB/s] > ### To file > 50.8MiB 0:00:02 [18.1MiB/s] > Dropped messages: 0 on this run. > Did another run as there is high variability in dropped messages: > 50.2MiB 0:00:02 [18.3MiB/s] > [1.276s][warning][ ] 4942 messages dropped due to async logging > > ## New async logging (stalling mode) > ### /dev/null > 51.3MiB 0:00:02 [19.9MiB/s] > ### To file > 50.9MiB 0:00:02 [18.8MiB/s] > > # Release build Linux-x64 (to a tmpfs mounted file) > ## New async drop > 20,1MiB 0:00:00 [43,3MiB/s] > 0 dropped messages > ## New async stall > 33,7MiB 0:00:00 [41,5MiB/s] > N/A dropped messages > 0 dropped messages > ## Old async > 17,7MiB 0:00:00 [42,2MiB/s] > [0.059s][warning][ ] 6860 messages dropped due to async logging > [0.082s][warning][ ] 20601 messages dropped due to async logging > [0,107s][warning][ ] 9147 messages dropped due to async logging > [0,131s][warning][ ] 11113 messages dropped due to async logging > [0,154s][warning][ ] 11845 messages dropped due to async logging > [0,178s][warning][ ] 10383 messages dropped due to async logging > [0,197s][warning][ ] 9376 messages dropped due to async logging > [0,216s][warning][ ] 10997 messages dropped due to async logging > [0,234s][warning][ ] 5978 messages dropped due to async logging > [0,252s][warning][ ] 10439 messages dropped due to async logging > [0,270s][warning][ ] 6562 messages dropped due to async logging > [0,325s][warning][ ] 391 messages dropped due to async logging > [0,343s][warning][
] 3532 messages dropped due to async logging > [0,362s][warning][ ] 3268 messages dropped due to async logging > [0,395s][warning][ ] 4182 messages dropped due to async logging > [0,411s][warning][ ] 2490 messages dropped due to async logging > ``` > From ayang at openjdk.org Tue Mar 19 12:11:40 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 19 Mar 2024 12:11:40 GMT Subject: RFR: 8328101: Parallel: Obsolete ParallelOldDeadWoodLimiterMean and ParallelOldDeadWoodLimiterStdDev [v2] In-Reply-To: References: Message-ID: > Simple refactoring Parallel full-gc dead word calculation and removing two jvm flags. > > Test: tier1-3 Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains one commit: pgc-dead-ratio ------------- Changes: https://git.openjdk.org/jdk/pull/18278/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18278&range=01 Stats: 244 lines in 7 files changed: 8 ins; 219 del; 17 mod Patch: https://git.openjdk.org/jdk/pull/18278.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18278/head:pull/18278 PR: https://git.openjdk.org/jdk/pull/18278 From dholmes at openjdk.org Tue Mar 19 12:20:21 2024 From: dholmes at openjdk.org (David Holmes) Date: Tue, 19 Mar 2024 12:20:21 GMT Subject: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3] In-Reply-To: References: <23RHDqAAB2cUnhC2j-H7g2X2KibSmk2IwKh4EisMfYo=.a6d727da-76c9-42d6-b2df-d2fd192bc229@github.com> Message-ID: On Tue, 19 Mar 2024 09:30:52 GMT, Andrew Haley wrote: > That's very odd. The example there doesn't even involve MAP_JIT memory, so what does it have to do with WX? @theRealAph that is the mystery we hope will be resolved once we know the nature of the underlying OS bug. Somehow switching to exec mode fixes/works-around the issue.
I can imagine a missing conditional to check if the region is MAP_JIT. > Changing WX at VM state transitions is a form of temporal coupling, a classic design smell that has caused problems for decades. The original introducers of WXEnable made the decision that the VM should be in WRITE mode unless it needs EXEC. That is the state we are presently trying to achieve with this change. If that original design choice is wrong then ... > Instead, could we tag code that needs one or the other, keep track of the current WX state in thread-local memory, and flip WX only when we know we need to? And I've asked about this every time a missing WXEnable has had to be added. We seem to be generically able to describe what kind of code needs which mode, but we seem to struggle to pin it down. Though that is what https://bugs.openjdk.org/browse/JDK-8307817 is looking at doing. > That'd (by definition) reduce the number of transitions to the minimum if we were through. Not necessarily. It may well remove some transitions from paths that don't need it, but if you move the state change too low down the call chain you could end up transitioning much more often in code that does need it e.g. if a transitioning method is called in a loop. We need to optimise the actual call paths as well as identify specific methods. But all this discussion suggests to me that this PR is not really worth pursuing at this time - IIUC no actual failures are observed other than those pertaining to `AssertWXAtThreadSync` and that flag will be gone if we do decide to be more fine-grained about WX management. 
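For readers following the trade-off discussed in this message, the quoted idea of keeping the current WX state in thread-local memory and flipping it only on demand can be sketched as follows. This is a hedged illustration, not HotSpot code: `WXTracker`, `wx_tracker`, `patch_code` and `run_code` are hypothetical names, and the actual mode switch (on macOS/AArch64 presumably a call such as `pthread_jit_write_protect_np`) is replaced by a counter so the number of transitions can be observed.

```cpp
#include <cassert>

// Hypothetical names throughout; nothing here is HotSpot API.
enum class WXMode { Write, Exec };

// Per-thread record of the current W^X mode.
struct WXTracker {
  WXMode current = WXMode::Write;  // Assumption: threads start in WRITE mode.
  unsigned transitions = 0;        // Number of real mode flips performed.

  // Flip only if the requested mode differs from the current one.
  void require(WXMode needed) {
    if (current == needed) return;  // Already in the right mode: no cost.
    // A real implementation would invoke the platform primitive here.
    current = needed;
    ++transitions;
  }
};

inline WXTracker& wx_tracker() {
  thread_local WXTracker tracker;
  return tracker;
}

// Stand-ins for code tagged with the mode it needs.
inline void patch_code() { wx_tracker().require(WXMode::Write); }
inline void run_code()   { wx_tracker().require(WXMode::Exec); }
```

Under these assumptions, calling `run_code()` a hundred times in a loop costs one transition, whereas a loop that alternates `patch_code()` and `run_code()` pays for two transitions per iteration. The second case is the one where placing the switch low in the call chain could become expensive.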
------------- PR Comment: https://git.openjdk.org/jdk/pull/18238#issuecomment-2007029830 From mbaesken at openjdk.org Tue Mar 19 13:22:22 2024 From: mbaesken at openjdk.org (Matthias Baesken) Date: Tue, 19 Mar 2024 13:22:22 GMT Subject: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3] In-Reply-To: References: <23RHDqAAB2cUnhC2j-H7g2X2KibSmk2IwKh4EisMfYo=.a6d727da-76c9-42d6-b2df-d2fd192bc229@github.com> Message-ID: On Tue, 19 Mar 2024 12:17:22 GMT, David Holmes wrote: > IIUC no actual failures are observed other than those pertaining to AssertWXAtThreadSync We saw sporadic crashes in our jtreg (maybe also jck?) runs; only **_later_** we enabled AssertWXAtThreadSync for more checking. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18238#issuecomment-2007159560 From sjayagond at openjdk.org Tue Mar 19 13:38:33 2024 From: sjayagond at openjdk.org (Sidraya Jayagond) Date: Tue, 19 Mar 2024 13:38:33 GMT Subject: RFR: 8327652: S390x: Implements SLP support [v4] In-Reply-To: References: Message-ID: > This PR Adds SIMD support on s390x. 
Sidraya Jayagond has updated the pull request incrementally with one additional commit since the last revision: Use Op_VecX instead of Op_RegF Signed-off-by: Sidraya ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18162/files - new: https://git.openjdk.org/jdk/pull/18162/files/487c57db..3caa470c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18162&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18162&range=02-03 Stats: 265 lines in 8 files changed: 1 ins; 155 del; 109 mod Patch: https://git.openjdk.org/jdk/pull/18162.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18162/head:pull/18162 PR: https://git.openjdk.org/jdk/pull/18162 From eosterlund at openjdk.org Tue Mar 19 13:44:28 2024 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Tue, 19 Mar 2024 13:44:28 GMT Subject: RFR: 8328507: Move StackWatermark code from safepoint cleanup Message-ID: There is some stack watermark code in safepoint cleanup that flushes out concurrent stack processing, as a defensive measure to guard safepoint operations that assume the stacks have been fixed up. However, we wish to get rid of safepoint cleanup so this operation should move out from the safepoint. This proposal moves it into CollectedHeap::safepoint_synchronize_begin instead which runs before we start shutting down Java threads. This moves the code into more obvious GC places and reduces time spent inside of non-GC safepoints. 
------------- Commit messages: - 8328507: Move StackWatermark code from safepoint cleanup Changes: https://git.openjdk.org/jdk/pull/18378/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18378&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8328507 Stats: 75 lines in 11 files changed: 23 ins; 44 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/18378.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18378/head:pull/18378 PR: https://git.openjdk.org/jdk/pull/18378 From gli at openjdk.org Tue Mar 19 13:48:23 2024 From: gli at openjdk.org (Guoxiong Li) Date: Tue, 19 Mar 2024 13:48:23 GMT Subject: RFR: 8328350: G1: Remove DO_DISCOVERED_AND_DISCOVERY In-Reply-To: References: Message-ID: On Mon, 18 Mar 2024 11:45:22 GMT, Albert Mingkun Yang wrote: > Remove a special ref-processing iteration mode for G1, because non-null `_discovered` field and requiring ref-discovery should never occur at the same time. > > Test: tier1-6 Looks good. The copyrights of these four files need to be adjusted. ------------- Marked as reviewed by gli (Committer). PR Review: https://git.openjdk.org/jdk/pull/18346#pullrequestreview-1946277985 From aboldtch at openjdk.org Tue Mar 19 13:50:20 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Tue, 19 Mar 2024 13:50:20 GMT Subject: RFR: 8328507: Move StackWatermark code from safepoint cleanup In-Reply-To: References: Message-ID: On Tue, 19 Mar 2024 13:30:39 GMT, Erik Österlund wrote: > There is some stack watermark code in safepoint cleanup that flushes out concurrent stack processing, as a defensive measure to guard safepoint operations that assume the stacks have been fixed up. However, we wish to get rid of safepoint cleanup so this operation should move out from the safepoint. This proposal moves it into CollectedHeap::safepoint_synchronize_begin instead which runs before we start shutting down Java threads. This moves the code into more obvious GC places and reduces time spent inside of non-GC safepoints. lgtm.
------------- Marked as reviewed by aboldtch (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18378#pullrequestreview-1946285372 From ayang at openjdk.org Tue Mar 19 14:05:42 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 19 Mar 2024 14:05:42 GMT Subject: RFR: 8328350: G1: Remove DO_DISCOVERED_AND_DISCOVERY [v2] In-Reply-To: References: Message-ID: <7zNLvOMdYc1vmoKzb7wSgZOaugZixZQwInKpOWDZJF0=.3da8fc5a-879a-42d3-b7e8-ccc5949679b4@github.com> > Remove a special ref-processing iteration mode for G1, because non-null `_discovered` field and requiring ref-discovery should never occur at the same time. > > Test: tier1-6 Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: - year - g1-remove-DO_DISCOVERED_AND_DISCOVERY ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18346/files - new: https://git.openjdk.org/jdk/pull/18346/files/5dfd1060..9ef446b4 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18346&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18346&range=00-01 Stats: 4520 lines in 216 files changed: 1630 ins; 1794 del; 1096 mod Patch: https://git.openjdk.org/jdk/pull/18346.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18346/head:pull/18346 PR: https://git.openjdk.org/jdk/pull/18346 From gli at openjdk.org Tue Mar 19 14:10:25 2024 From: gli at openjdk.org (Guoxiong Li) Date: Tue, 19 Mar 2024 14:10:25 GMT Subject: RFR: 8328350: G1: Remove DO_DISCOVERED_AND_DISCOVERY [v2] In-Reply-To: <7zNLvOMdYc1vmoKzb7wSgZOaugZixZQwInKpOWDZJF0=.3da8fc5a-879a-42d3-b7e8-ccc5949679b4@github.com> References: <7zNLvOMdYc1vmoKzb7wSgZOaugZixZQwInKpOWDZJF0=.3da8fc5a-879a-42d3-b7e8-ccc5949679b4@github.com> Message-ID: On Tue, 19 Mar 2024 14:05:42 GMT, Albert Mingkun Yang wrote: >> Remove a special ref-processing 
iteration mode for G1, because non-null `_discovered` field and requiring ref-discovery should never occur at the same time. >> >> Test: tier1-6 > > Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: > > - year > - g1-remove-DO_DISCOVERED_AND_DISCOVERY Marked as reviewed by gli (Committer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18346#pullrequestreview-1946347752 From eosterlund at openjdk.org Tue Mar 19 14:41:33 2024 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Tue, 19 Mar 2024 14:41:33 GMT Subject: RFR: 8328507: Move StackWatermark code from safepoint cleanup [v2] In-Reply-To: References: Message-ID: > There is some stack watermark code in safepoint cleanup that flushes out concurrent stack processing, as a defensive measure to guard safepoint operations that assume the stacks have been fixed up. However, we wish to get rid of safepoint cleanup so this operation should move out from the safepoint. This proposal moves it into CollectedHeap::safepoint_synchronize_begin instead which runs before we start shutting down Java threads. This moves the code into more obvious GC places and reduces time spent inside of non-GC safepoints. 
Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: Remove incorrect assert ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18378/files - new: https://git.openjdk.org/jdk/pull/18378/files/f18381c8..e830146a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18378&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18378&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18378.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18378/head:pull/18378 PR: https://git.openjdk.org/jdk/pull/18378 From duke at openjdk.org Tue Mar 19 15:02:32 2024 From: duke at openjdk.org (Volodymyr Paprotski) Date: Tue, 19 Mar 2024 15:02:32 GMT Subject: RFR: 8320794: Emulate rest of vblendvp[sd] on ECore [v2] In-Reply-To: <8ajDeYtrlyZUXnTl29xwLr1rwGIYzjj5wThm9yjrBVY=.c75c1992-c836-4969-aea4-e3cbf428dfad@github.com> References: <8ajDeYtrlyZUXnTl29xwLr1rwGIYzjj5wThm9yjrBVY=.c75c1992-c836-4969-aea4-e3cbf428dfad@github.com> Message-ID: > Replace vblendvp[sd] with macro assembler call and test in: > - `C2_MacroAssembler::vector_cast_float_to_int_special_cases_avx` (insufficient registers for 1 of 2 blends) > - `C2_MacroAssembler::vector_cast_double_to_int_special_cases_avx` > - `C2_MacroAssembler::vector_count_leading_zeros_int_avx` > > Functional testing with existing and new tests: > `make test TEST="test/hotspot/jtreg/compiler/vectorapi/reshape test/hotspot/jtreg/compiler/vectorization/runner/BasicIntOpTest.java"` > > Benchmarking with existing and new tests: > > make test TEST="micro:org.openjdk.bench.jdk.incubator.vector.VectorFPtoIntCastOperations.microFloat256ToInteger256" > make test TEST="micro:org.openjdk.bench.jdk.incubator.vector.VectorFPtoIntCastOperations.microDouble256ToInteger256" > make test TEST="micro:org.openjdk.bench.vm.compiler.VectorBitCount.WithSuperword.intLeadingZeroCount" > > > Performance before: > > Benchmark (SIZE)
Mode Cnt Score Error Units > VectorFPtoIntCastOperations.microDouble256ToInteger256 512 thrpt 5 17271.078 ± 184.140 ops/ms > VectorFPtoIntCastOperations.microDouble256ToInteger256 1024 thrpt 5 9310.507 ± 88.136 ops/ms > VectorFPtoIntCastOperations.microFloat256ToInteger256 512 thrpt 5 11137.594 ± 19.009 ops/ms > VectorFPtoIntCastOperations.microFloat256ToInteger256 1024 thrpt 5 5425.001 ± 3.136 ops/ms > VectorBitCount.WithSuperword.intLeadingZeroCount 1024 0 thrpt 4 0.994 ± 0.002 ops/us > > > Performance after: > > Benchmark (SIZE) Mode Cnt Score Error Units > VectorFPtoIntCastOperations.microDouble256ToInteger256 512 thrpt 5 19222.048 ± 87.622 ops/ms > VectorFPtoIntCastOperations.microDouble256ToInteger256 1024 thrpt 5 9233.245 ± 123.493 ops/ms > VectorFPtoIntCastOperations.microFloat256ToInteger256 512 thrpt 5 11672.806 ± 10.854 ops/ms > VectorFPtoIntCastOperations.microFloat256ToInteger256 1024 thrpt 5 6009.735 ± 12.173 ops/ms > VectorBitCount.WithSuperword.intLeadingZeroCount 1024 0 thrpt 4 1.039 ±
0.004 ops/us Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision: Fix double pasted test ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18310/files - new: https://git.openjdk.org/jdk/pull/18310/files/af21bfc4..9430d88e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18310&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18310&range=00-01 Stats: 19 lines in 1 file changed: 0 ins; 19 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18310.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18310/head:pull/18310 PR: https://git.openjdk.org/jdk/pull/18310 From stuefe at openjdk.org Tue Mar 19 15:20:23 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 19 Mar 2024 15:20:23 GMT Subject: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3] In-Reply-To: References: <23RHDqAAB2cUnhC2j-H7g2X2KibSmk2IwKh4EisMfYo=.a6d727da-76c9-42d6-b2df-d2fd192bc229@github.com> Message-ID: On Tue, 19 Mar 2024 12:17:22 GMT, David Holmes wrote: >> Instead, could we tag code that needs one or the other, keep track of the current WX state in thread-local memory, and flip WX only when we know we need to? The first part we already do. I wonder whether we could - at least as a workaround for when we missed a spot - do WX switching as a reaction to a SIGBUS related to a WX violation in the code cache. Switch state around, return from signal handler and retry operation. (Edit: tested it, does not seem to work. I guess when the SIGBUS is triggered in the kernel thread WX state had already been processed somehow). > > That's very odd. The example there doesn't even involve MAP_JIT memory, so what does it have to do with WX? > > @theRealAph that is the mystery we hope will be resolved once we know the nature of the underlying OS bug. Somehow switching to exec mode fixes/works-around the issue. I can imagine a missing conditional to check if the region is MAP_JIT.
> > > Changing WX at VM state transitions is a form of temporal coupling, a classic design smell that has caused problems for decades. > > The original introducers of WXEnable made the decision that the VM should be in WRITE mode unless it needs EXEC. That is the state we are presently trying to achieve with this change. If that original design choice is wrong then ... > > > Instead, could we tag code that needs one or the other, keep track of the current WX state in thread-local memory, and flip WX only when we know we need to? > > And I've asked about this every time a missing WXEnable has had to be added. We seem to be generically able to describe what kind of code needs which mode, but we seem to struggle to pin it down. Though that is what https://bugs.openjdk.org/browse/JDK-8307817 is looking at doing. > > > That'd (by definition) reduce the number of transitions to the minimum if we were through. > > Not necessarily. It may well remove some transitions from paths that don't need it, but if you move the state change too low down the call chain you could end up transitioning much more often in code that does need it e.g. if a transitioning method is called in a loop. Not if you do the switching lazily. The first iteration would switch to the needed state; subsequent iterations would not do anything since the state already matches. Unless you interleave writes and execs, but then you would need the state changes anyway. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18238#issuecomment-2007469410 From mdoerr at openjdk.org Tue Mar 19 15:20:26 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 19 Mar 2024 15:20:26 GMT Subject: RFR: 8327652: S390x: Implements SLP support [v4] In-Reply-To: References: Message-ID: On Tue, 19 Mar 2024 13:38:33 GMT, Sidraya Jayagond wrote: >> This PR Adds SIMD support on s390x. 
> > Sidraya Jayagond has updated the pull request incrementally with one additional commit since the last revision: > > Use Op_VecX instead of Op_RegF > > Signed-off-by: Sidraya This looks much better! I have some minor cleanup suggestions. I didn't review the s390 instructions and their usage. That should be done by somebody else, maybe @offamitkumar? src/hotspot/cpu/s390/assembler_s390.hpp line 2536: > 2534: inline void z_vlvgh( VectorRegister v1, Register r3, int64_t d2, Register b2=Z_R0); > 2535: inline void z_vlvgf( VectorRegister v1, Register r3, int64_t d2, Register b2=Z_R0); > 2536: inline void z_vlvgg( VectorRegister v1, Register r3, int64_t d2, Register b2=Z_R0); Common coding style: `b2 = Z_R0` with spaces. src/hotspot/cpu/s390/globals_s390.hpp line 114: > 112: product(bool, SuperwordUseVX, false, \ > 113: "Use Z15 Vector instructions for superword optimization.") \ > 114: product(bool, UseSFPV, false, DIAGNOSTIC, \ Please adapt the whitespaces before ``. src/hotspot/cpu/s390/s390.ad line 1081: > 1079: } > 1080: > 1081: // we have 32 vector register * 4 halves Seems like this comment needs an update. src/hotspot/cpu/s390/s390.ad line 10957: > 10955: ins_encode( /*empty*/ ); > 10956: ins_pipe(pipe_class_dummy); > 10957: %} Better add an empty line. src/hotspot/cpu/s390/sharedRuntime_s390.cpp line 314: > 312: > 313: // Calculate frame size. > 314: const int vregstosave_num = save_vectors ? (sizeof(RegisterSaver_LiveVRegs) / Indentation. src/hotspot/cpu/s390/sharedRuntime_s390.cpp line 487: > 485: void RegisterSaver::restore_live_registers(MacroAssembler* masm, RegisterSet reg_set, bool save_vectors) { > 486: int offset; > 487: //const int register_save_offset = live_reg_frame_size(reg_set) - live_reg_save_size(reg_set); Do we need to keep this commented code? 
------------- PR Review: https://git.openjdk.org/jdk/pull/18162#pullrequestreview-1946565035 PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1530585064 PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1530584154 PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1530587847 PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1530589851 PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1530590990 PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1530592534 From mdoerr at openjdk.org Tue Mar 19 15:24:23 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 19 Mar 2024 15:24:23 GMT Subject: RFR: 8327652: S390x: Implements SLP support [v4] In-Reply-To: References: Message-ID: On Tue, 19 Mar 2024 13:38:33 GMT, Sidraya Jayagond wrote: >> This PR Adds SIMD support on s390x. > > Sidraya Jayagond has updated the pull request incrementally with one additional commit since the last revision: > > Use Op_VecX instead of Op_RegF > > Signed-off-by: Sidraya src/hotspot/cpu/s390/s390.ad line 7473: > 7471: // Sqrt float precision > 7472: instruct sqrtF_reg(regF dst, regF src) %{ > 7473: match(Set dst (SqrtF src)); Did the old code cause problems? Required nodes not matched? Maybe this should get fixed in a separate PR for backports? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1530602580 From mli at openjdk.org Tue Mar 19 15:32:31 2024 From: mli at openjdk.org (Hamlin Li) Date: Tue, 19 Mar 2024 15:32:31 GMT Subject: RFR: 8328264: AArch64: remove UseNeon condition in CRC32 intrinsic [v2] In-Reply-To: References: Message-ID: <2DAqxD8lF30SiWupwBYddbturljsnMn7qRXio1DfsIw=.43d8dd42-8aa4-421e-b2f4-55a4436a910c@github.com> > Hi, > Can you review the simple patch? 
> Thanks
>
> FYI: Discussed https://github.com/openjdk/jdk/pull/18294#issuecomment-1997727704, this usage of UseNeon flag should be removed, as neon is working by default, and UseNeon could be false, so this means the intrinsic code of Neon can be skipped unexpectedly.
>
> ## Performance
> Tested jmh `TestCRC32` with "-XX:-UseCRC32 -XX:-UseCryptoPmullForCRC32"
> Before
>
> Benchmark                  (count)  Mode  Cnt      Score   Error  Units
> TestCRC32.testCRC32Update       64  avgt    2     53.696          ns/op
> TestCRC32.testCRC32Update      128  avgt    2    104.942          ns/op
> TestCRC32.testCRC32Update      256  avgt    2    207.147          ns/op
> TestCRC32.testCRC32Update      512  avgt    2    411.179          ns/op
> TestCRC32.testCRC32Update     2048  avgt    2   1608.388          ns/op
> TestCRC32.testCRC32Update    16384  avgt    2  12763.513          ns/op
> TestCRC32.testCRC32Update    65536  avgt    2  51024.246          ns/op
>
>
> After
>
> Benchmark                  (count)  Mode  Cnt      Score   Error  Units
> TestCRC32.testCRC32Update       64  avgt    2     40.172          ns/op
> TestCRC32.testCRC32Update      128  avgt    2     56.754          ns/op
> TestCRC32.testCRC32Update      256  avgt    2     89.743          ns/op
> TestCRC32.testCRC32Update      512  avgt    2    156.726          ns/op
> TestCRC32.testCRC32Update     2048  avgt    2    579.776          ns/op
> TestCRC32.testCRC32Update    16384  avgt    2   4624.023          ns/op
> TestCRC32.testCRC32Update    65536  avgt    2  18505.180          ns/op

Hamlin Li has updated the pull request incrementally with one additional commit since the last revision:

  revert indents

-------------
Changes:
  - all: https://git.openjdk.org/jdk/pull/18328/files
  - new: https://git.openjdk.org/jdk/pull/18328/files/a5b520ee..6eca8b64

Webrevs:
  - full: https://webrevs.openjdk.org/?repo=jdk&pr=18328&range=01
  - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18328&range=00-01

Stats: 226 lines in 1 file changed: 101 ins; 101 del; 24 mod
Patch: https://git.openjdk.org/jdk/pull/18328.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/18328/head:pull/18328
PR: https://git.openjdk.org/jdk/pull/18328

From bkilambi at openjdk.org Tue Mar 19 15:45:22 2024 From: bkilambi at openjdk.org (Bhavana Kilambi) Date: Tue, 19 Mar 2024
15:45:22 GMT Subject: RFR: 8328138: Optimize ArrayEquals on AArch64 & fix potential crash In-Reply-To: References: Message-ID: On Thu, 14 Mar 2024 06:21:56 GMT, Xiaowei Lu wrote:

> Current implementation of ArrayEquals on AArch64 is quite complex, due to the variety of checks about alignment, tail processing, bus locking and so on. However, modern Arm processors have eased such worries. Besides, we found a crash when using Lilliput. So we proposed to use a simple and straightforward flow of ArrayEquals.
> With this simplified ArrayEquals, we observed performance gains on the latest Arm platforms (Neoverse N1 and N2).
> Test case: org.openjdk.bench.java.util.ArraysEquals
>
> 1x vector length, 64-bit aligned array[0]
> | Test Case              | N1      | N2      |
> |:----------------------:|:-------:|:-------:|
> | testByteFalseBeginning | -21.42% | -13.37% |
> | testByteFalseEnd       | 25.79%  | 27.45%  |
> | testByteFalseMid       | 16.64%  | 16.46%  |
> | testByteTrue           | 12.39%  | 24.66%  |
> | testCharFalseBeginning | -5.27%  | -3.08%  |
> | testCharFalseEnd       | 29.29%  | 35.23%  |
> | testCharFalseMid       | 15.13%  | 19.34%  |
> | testCharTrue           | 21.63%  | 33.73%  |
> | Total                  | 11.77%  | 17.55%  |
>
> A key factor is to decide when we should utilize simd in array equals. An aggressive choice is to enable simd as long as array length exceeds vector length (8 words). The corresponding result is shown above, from which we can see performance regression in both testBeginning cases. To avoid such perf impact, we can set simd threshold to 3x vector length.
>
> 3x vector length, 64-bit aligned array[0]
> | Test Case              | N1      | N2      |
> |:----------------------:|:-------:|:-------:|
> | testByteFalseBeginning | 8.28%   | 8.64%   |
> | testByteFalseEnd       | 6.38%   | 12.29%  |
> | testByteFalseMid       | 6.17%   | 7.96%   |
> | testByteTrue           | -10.08% | 3.06%   |
> | testCharFalseBeginning | -1.42%  | 7.23%   |
> | testCharFalseEnd       | 4.05%   | 13.48%  |
> | testCharFalseMid       | 8.79%   | 16.96%  |
> | testCharTrue           | -5.66%  | 10.23%  |
> | Total                  | 2.06%   | 9.98%   |
>
>
> In addition to perf improvement, we propose this patch to solve alignment issues in array equals. JDK-8139457 tries to relax alignment of array elements. On the other hand, this misalignment makes it an error to read the whole last word in array equals, in case that the array doesn't occupy the whole word and Lilliput is enabled. A detailed explanation quoted from https://github.com/openjdk/jdk/pull/11044#issuecomment-1996771480
>
>> The root cause is that default behavior of MacroAss...

Hi, this patch looks interesting. All the arrays in the testcase used to test performance of this patch have the same length. Have you tested your patch with short array lengths? I just modified your testcase with a char array of length = 7 and the performance on an aarch64 machine with -XX:+UseNewCode is ~17% worse compared to the default and ~20% worse compared to the case with -XX:+UseSimpleArrayEquals turned on. I agree that it's not possible to gain better performance for each and every testcase, but a lot of real world applications do contain a considerable number of short string/array comparisons. With a main comparison loop which processes more elements (64 in your case), a short array could also incur more compare-branch instructions, leading to more instructions being executed.
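
To make the threshold trade-off above concrete, here is a minimal sketch in portable C++. It is not the AArch64 stub from the patch: `arrays_equal`, `kBlockBytes` and `kWideThreshold` are hypothetical names, and `std::memcmp` over a fixed-size block only stands in for a SIMD compare. It assumes one "vector length" is 8 words (64 bytes) and takes the wide path only at 3x that size, so short arrays go straight to the simple loop.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical tuning knobs for illustration; these are not HotSpot names.
constexpr std::size_t kBlockBytes = 64;                  // stand-in for one vector length (8 words)
constexpr std::size_t kWideThreshold = 3 * kBlockBytes;  // use the wide path only at 3x vector length

// Returns true if the first len bytes of a and b are identical.
bool arrays_equal(const std::uint8_t* a, const std::uint8_t* b, std::size_t len) {
  std::size_t i = 0;
  if (len >= kWideThreshold) {
    // Wide path: compare whole blocks; memcmp stands in for a SIMD compare.
    for (; i + kBlockBytes <= len; i += kBlockBytes) {
      if (std::memcmp(a + i, b + i, kBlockBytes) != 0) {
        return false;
      }
    }
  }
  // Short arrays and the tail: element-wise, never reading past len.
  for (; i < len; ++i) {
    if (a[i] != b[i]) {
      return false;
    }
  }
  return true;
}
```

In this sketch a 7-element array never enters the block loop, which is the case the reply above is concerned about; whether a real stub should pick 1x, 3x or some other threshold is exactly what the benchmarks have to decide.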
------------- PR Comment: https://git.openjdk.org/jdk/pull/18292#issuecomment-2007529935 From dean.long at oracle.com Tue Mar 19 16:28:25 2024 From: dean.long at oracle.com (dean.long at oracle.com) Date: Tue, 19 Mar 2024 09:28:25 -0700 Subject: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3] In-Reply-To: References: <23RHDqAAB2cUnhC2j-H7g2X2KibSmk2IwKh4EisMfYo=.a6d727da-76c9-42d6-b2df-d2fd192bc229@github.com> Message-ID: <3d4fa4c7-dd7c-4698-b75a-291c347ce50d@oracle.com> On 3/19/24 8:20 AM, Thomas Stuefe wrote: > I wonder whether we could - at least as a workaround in case we missed a > spot - do wx switching as a reaction to a SIGBUS related to WX violation > in code cache. Switch state around, return from signal handler and > retry operation. > > (Edit: tested it, does not seem to work. I guess when the SIGBUS is > triggered in the kernel thread WX state had already been processed > somehow). That makes sense if the WX state is part of the signal context saved and restored by the signal handler. dl From mli at openjdk.org Tue Mar 19 17:03:25 2024 From: mli at openjdk.org (Hamlin Li) Date: Tue, 19 Mar 2024 17:03:25 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4] In-Reply-To: <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> Message-ID: <_XOfFT147tFIxCtHn7YXCxdeA18fxWHJUPTsaIa3W5k=.936f1797-88fc-408b-91ae-1471a68b0c1f@github.com> On Fri, 15 Mar 2024 13:58:05 GMT, Hamlin Li wrote: >> Hi, >> Can you help to review this patch? >> Thanks >> >> This is a continuation of work based on [1] by @XiaohongGong, most work was done in that pr.
In this new pr, just rebased the code in [1], then added some minor changes (renaming of calling method, add libsleef as extra lib in CI cross-build on aarch64 in github workflow); I also tested the combination of following scenarios:
>> * at build time
>>   * with/without sleef
>>   * with/without sve support
>> * at runtime
>>   * with/without sleef
>>   * with/without sve support
>>
>> [1] https://github.com/openjdk/jdk/pull/16234
>>
>> ## Regression Test
>> * test/jdk/jdk/incubator/vector/
>> * test/hotspot/jtreg/compiler/vectorapi/
>>
>> ## Performance Test
>> Previously, @XiaohongGong has shared the data: https://github.com/openjdk/jdk/pull/16234#issuecomment-1767727028
>
> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision:
>
>   fix jni includes

> If there was some kind of plan, with evidence of the intention to do something to get this valuable tech into people's hands in a form they can use, sure. But as you can tell, I think this may rot because no one will be able to use it. If SLEEF were included in the JDK it'd be fine. If SLEEF were a common Linux library it'd be fine.
>
> The problem I see is that J. Random Java User has no way to know if SLEEF is making their program faster without running benchmarks. They'll put SLEEF somewhere and hope that Java uses it.

Please kindly correct me if I misunderstood your points. It seems the safest solution to address your above concerns is to integrate the sleef source into the jdk? Lack of sleef at either build time or runtime will make the user's code fall back to the java implementation.
------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-2007692746 From rrich at openjdk.org Tue Mar 19 17:11:23 2024 From: rrich at openjdk.org (Richard Reingruber) Date: Tue, 19 Mar 2024 17:11:23 GMT Subject: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3] In-Reply-To: References: <23RHDqAAB2cUnhC2j-H7g2X2KibSmk2IwKh4EisMfYo=.a6d727da-76c9-42d6-b2df-d2fd192bc229@github.com> Message-ID: On Tue, 19 Mar 2024 12:17:22 GMT, David Holmes wrote: > But all this discussion suggests to me that this PR is not really worth pursuing at this time - IIUC no actual failures are observed other than those pertaining to AssertWXAtThreadSync and that flag will be gone if we do decide to be more fine-grained about WX management. I see it differently. This PR is just a simple attempt to get clean test runs with AssertWXAtThreadSync (after fixing an actual crash https://bugs.openjdk.org/browse/JDK-8327036). While the violating locations in this PR might be unlikely to produce actual crashes I think it is worthwhile to have clean testing with AssertWXAtThreadSync because this will help prevent regressions that are more likely. Beyond the trivial fixes of this PR I'm very much in favor of further enhancements as the aforementioned https://bugs.openjdk.org/browse/JDK-8307817. My recommendation would be to remove as much non-constant data from the code cache as possible. 
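
For reference, the lazy switching idea discussed in this thread (keep the current WX state in thread-local memory and flip it only on a mismatch) could be sketched roughly as below. This is an illustration under stated assumptions, not HotSpot code: `LazyWX`, `WXMode`, `platform_set_wx` and `g_hw_switches` are hypothetical names, and the platform call is replaced by a counter so the sketch runs anywhere. The initial state is assumed to be write mode, matching the "WRITE unless it needs EXEC" convention mentioned earlier in the thread.

```cpp
// Hypothetical names throughout; none of these are HotSpot identifiers.
enum class WXMode { Write, Exec };

// Counts how often the (simulated) hardware state was actually flipped.
static thread_local int g_hw_switches = 0;

// Stand-in for the real platform call that changes the per-thread W^X state.
static void platform_set_wx(WXMode /*mode*/) { ++g_hw_switches; }

class LazyWX {
 public:
  // Ensure the calling thread is in the wanted mode; do nothing if it already is.
  static void require(WXMode wanted) {
    if (current_ != wanted) {
      platform_set_wx(wanted);
      current_ = wanted;
    }
  }

  static WXMode current() { return current_; }

 private:
  // Cached per-thread state; assumed to start in write mode.
  static thread_local WXMode current_;
};

thread_local WXMode LazyWX::current_ = WXMode::Write;
```

With this shape, calling `LazyWX::require(WXMode::Exec)` in a loop flips the state once on the first iteration and is a cheap compare afterwards; only code that genuinely interleaves writes and execution pays for repeated switches.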
------------- PR Comment: https://git.openjdk.org/jdk/pull/18238#issuecomment-2007709981 From aph at openjdk.org Tue Mar 19 17:14:22 2024 From: aph at openjdk.org (Andrew Haley) Date: Tue, 19 Mar 2024 17:14:22 GMT Subject: RFR: 8328264: AArch64: remove UseNeon condition in CRC32 intrinsic [v2] In-Reply-To: <2DAqxD8lF30SiWupwBYddbturljsnMn7qRXio1DfsIw=.43d8dd42-8aa4-421e-b2f4-55a4436a910c@github.com> References: <2DAqxD8lF30SiWupwBYddbturljsnMn7qRXio1DfsIw=.43d8dd42-8aa4-421e-b2f4-55a4436a910c@github.com> Message-ID: <8KvtBJOhasYf_Ws5w-RJXNzl2e0S_YJC_aZIz2yQ6h4=.f9a1ce0e-9b7b-497b-8462-ad90008228d1@github.com> On Tue, 19 Mar 2024 15:32:31 GMT, Hamlin Li wrote: >> Hi, >> Can you review the simple patch? >> Thanks >> >> FYI: Discussed https://github.com/openjdk/jdk/pull/18294#issuecomment-1997727704, this usage of UseNeon flag should be removed, as neon is working by default, and UseNeon could be false, so this means the intrinsic code of Neon can be skipped unexpected. >> >> ## Performance >> Tested jmh `TestCRC32` with "-XX:-UseCRC32 -XX:-UseCryptoPmullForCRC32" >> Before >> >> Benchmark (count) Mode Cnt Score Error Units >> TestCRC32.testCRC32Update 64 avgt 2 53.696 ns/op >> TestCRC32.testCRC32Update 128 avgt 2 104.942 ns/op >> TestCRC32.testCRC32Update 256 avgt 2 207.147 ns/op >> TestCRC32.testCRC32Update 512 avgt 2 411.179 ns/op >> TestCRC32.testCRC32Update 2048 avgt 2 1608.388 ns/op >> TestCRC32.testCRC32Update 16384 avgt 2 12763.513 ns/op >> TestCRC32.testCRC32Update 65536 avgt 2 51024.246 ns/op >> >> >> After >> >> Benchmark (count) Mode Cnt Score Error Units >> TestCRC32.testCRC32Update 64 avgt 2 40.172 ns/op >> TestCRC32.testCRC32Update 128 avgt 2 56.754 ns/op >> TestCRC32.testCRC32Update 256 avgt 2 89.743 ns/op >> TestCRC32.testCRC32Update 512 avgt 2 156.726 ns/op >> TestCRC32.testCRC32Update 2048 avgt 2 579.776 ns/op >> TestCRC32.testCRC32Update 16384 avgt 2 4624.023 ns/op >> TestCRC32.testCRC32Update 65536 avgt 2 18505.180 ns/op > > Hamlin Li has updated 
the pull request incrementally with one additional commit since the last revision: > > revert indents OK. Trivial and good. ------------- Marked as reviewed by aph (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18328#pullrequestreview-1946885966 From aph at openjdk.org Tue Mar 19 17:18:24 2024 From: aph at openjdk.org (Andrew Haley) Date: Tue, 19 Mar 2024 17:18:24 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4] In-Reply-To: <_XOfFT147tFIxCtHn7YXCxdeA18fxWHJUPTsaIa3W5k=.936f1797-88fc-408b-91ae-1471a68b0c1f@github.com> References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> <_XOfFT147tFIxCtHn7YXCxdeA18fxWHJUPTsaIa3W5k=.936f1797-88fc-408b-91ae-1471a68b0c1f@github.com> Message-ID: On Tue, 19 Mar 2024 17:00:48 GMT, Hamlin Li wrote: > > The problem I see is that J. Random Java User has no way to know if SLEEF is making their program faster without running benchmarks. They'll put SLEEF somewhere and hope that Java uses it. > > Please kindly correct me if I misunderstood your points. Seems the safest solution to address your above concerns is to integrate the sleef source into jdk? Lack of sleef at either build time or runtime will make the user's code fall back to java implementation. Exactly, yes. That's why we've integrated the source code of many other libraries we depend on into the JDK. It's the only way to get the reliability our users expect. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-2007722519 From tholenstein at openjdk.org Tue Mar 19 17:26:22 2024 From: tholenstein at openjdk.org (Tobias Holenstein) Date: Tue, 19 Mar 2024 17:26:22 GMT Subject: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3] In-Reply-To: References: <23RHDqAAB2cUnhC2j-H7g2X2KibSmk2IwKh4EisMfYo=.a6d727da-76c9-42d6-b2df-d2fd192bc229@github.com> Message-ID: On Tue, 19 Mar 2024 17:08:52 GMT, Richard Reingruber wrote: > I see it differently. This PR is just a simple attempt to get clean test runs with AssertWXAtThreadSync (after fixing an actual crash https://bugs.openjdk.org/browse/JDK-8327036). While the violating locations in this PR might be unlikely to produce actual crashes I think it is worthwhile to have clean testing with AssertWXAtThreadSync because this will help prevent regressions that are more likely. I agree. Fixing the current state with this PR makes sense to me. Changing the logic of W^X will take more time and discussion. So from my point of view this PR is ready and should be integrated. If no-one disagrees I will approve ------------- PR Comment: https://git.openjdk.org/jdk/pull/18238#issuecomment-2007742855 From mli at openjdk.org Tue Mar 19 17:26:27 2024 From: mli at openjdk.org (Hamlin Li) Date: Tue, 19 Mar 2024 17:26:27 GMT Subject: RFR: 8328264: AArch64: remove UseNeon condition in CRC32 intrinsic [v2] In-Reply-To: <8KvtBJOhasYf_Ws5w-RJXNzl2e0S_YJC_aZIz2yQ6h4=.f9a1ce0e-9b7b-497b-8462-ad90008228d1@github.com> References: <2DAqxD8lF30SiWupwBYddbturljsnMn7qRXio1DfsIw=.43d8dd42-8aa4-421e-b2f4-55a4436a910c@github.com> <8KvtBJOhasYf_Ws5w-RJXNzl2e0S_YJC_aZIz2yQ6h4=.f9a1ce0e-9b7b-497b-8462-ad90008228d1@github.com> Message-ID: On Tue, 19 Mar 2024 17:11:21 GMT, Andrew Haley wrote: >> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: >> >> revert indents > > OK. Trivial and good. 
Thanks @theRealAph for your reviewing! ------------- PR Comment: https://git.openjdk.org/jdk/pull/18328#issuecomment-2007739785 From mli at openjdk.org Tue Mar 19 17:26:28 2024 From: mli at openjdk.org (Hamlin Li) Date: Tue, 19 Mar 2024 17:26:28 GMT Subject: Integrated: 8328264: AArch64: remove UseNeon condition in CRC32 intrinsic In-Reply-To: References: Message-ID: On Fri, 15 Mar 2024 15:32:21 GMT, Hamlin Li wrote: > Hi, > Can you review the simple patch? > Thanks > > FYI: Discussed https://github.com/openjdk/jdk/pull/18294#issuecomment-1997727704, this usage of UseNeon flag should be removed, as neon is working by default, and UseNeon could be false, so this means the intrinsic code of Neon can be skipped unexpected. > > ## Performance > Tested jmh `TestCRC32` with "-XX:-UseCRC32 -XX:-UseCryptoPmullForCRC32" > Before > > Benchmark (count) Mode Cnt Score Error Units > TestCRC32.testCRC32Update 64 avgt 2 53.696 ns/op > TestCRC32.testCRC32Update 128 avgt 2 104.942 ns/op > TestCRC32.testCRC32Update 256 avgt 2 207.147 ns/op > TestCRC32.testCRC32Update 512 avgt 2 411.179 ns/op > TestCRC32.testCRC32Update 2048 avgt 2 1608.388 ns/op > TestCRC32.testCRC32Update 16384 avgt 2 12763.513 ns/op > TestCRC32.testCRC32Update 65536 avgt 2 51024.246 ns/op > > > After > > Benchmark (count) Mode Cnt Score Error Units > TestCRC32.testCRC32Update 64 avgt 2 40.172 ns/op > TestCRC32.testCRC32Update 128 avgt 2 56.754 ns/op > TestCRC32.testCRC32Update 256 avgt 2 89.743 ns/op > TestCRC32.testCRC32Update 512 avgt 2 156.726 ns/op > TestCRC32.testCRC32Update 2048 avgt 2 579.776 ns/op > TestCRC32.testCRC32Update 16384 avgt 2 4624.023 ns/op > TestCRC32.testCRC32Update 65536 avgt 2 18505.180 ns/op This pull request has now been integrated. 
Changeset: 9ca4ae3d Author: Hamlin Li URL: https://git.openjdk.org/jdk/commit/9ca4ae3d3b746f1d75036d189ff98f02b73b948f Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod 8328264: AArch64: remove UseNeon condition in CRC32 intrinsic Reviewed-by: aph ------------- PR: https://git.openjdk.org/jdk/pull/18328 From tholenstein at openjdk.org Tue Mar 19 17:49:21 2024 From: tholenstein at openjdk.org (Tobias Holenstein) Date: Tue, 19 Mar 2024 17:49:21 GMT Subject: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3] In-Reply-To: References: Message-ID: <6SKAa4aj0VpZeAKntqNZ_Nr4pQsCp0o3zXsph-0Wa98=.b7be4d1f-465c-4edc-b9f1-1b6589271442@github.com> On Wed, 13 Mar 2024 13:53:46 GMT, Richard Reingruber wrote: >> src/hotspot/share/jfr/instrumentation/jfrJvmtiAgent.cpp line 160: >> >>> 158: ResourceMark rm(THREAD); >>> 159: // WXWrite is needed before entering the vm below and in callee methods. >>> 160: MACOS_AARCH64_ONLY(ThreadWXEnable __wx(WXWrite, THREAD)); >> >> I understand you placed this here to cover the transition inside `create_classes_array` and the immediate one at line 170, but doesn't this risk having the wrong WX state for code in between? > > I've asked this myself (after making the change). > Being in `WXWrite` mode would be wrong if the thread would execute dynamically generated code. There's not too much happening outside the scope of the `ThreadInVMfromNative` instances. I see jni calls (`GetObjectArrayElement`, `ExceptionOccurred`) and a jvmti call (`RetransformClasses`) but these are safe because the callees enter the vm right away. We even avoid switching to `WXWrite` and back there. > So I thought it would be ok to coarsen the WXMode switching. > But maybe it's still better to avoid any risk especially since there's likely no performance effect. Or could the `ThreadInVMfromNative tvmfn(THREAD);` in `check_exception_and_log` be move out to `JfrJvmtiAgent::retransform_classes`? 
And then only use one `ThreadInVMfromNative` in `JfrJvmtiAgent::retransform_classes` ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18238#discussion_r1530831165 From stuefe at openjdk.org Tue Mar 19 18:34:24 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 19 Mar 2024 18:34:24 GMT Subject: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3] In-Reply-To: <23RHDqAAB2cUnhC2j-H7g2X2KibSmk2IwKh4EisMfYo=.a6d727da-76c9-42d6-b2df-d2fd192bc229@github.com> References: <23RHDqAAB2cUnhC2j-H7g2X2KibSmk2IwKh4EisMfYo=.a6d727da-76c9-42d6-b2df-d2fd192bc229@github.com> Message-ID: On Mon, 18 Mar 2024 13:14:41 GMT, Richard Reingruber wrote: >> This pr changes `JfrJvmtiAgent::retransform_classes()` and `jfr_set_enabled` to switch to `WXWrite` before transitioning to the vm. >> >> Testing: >> make test TEST=jdk/jfr/event/runtime/TestClassLoadEvent.java TEST_VM_OPTS=-XX:+AssertWXAtThreadSync >> make test TEST=compiler/intrinsics/klass/CastNullCheckDroppingsTest.java TEST_VM_OPTS=-XX:+AssertWXAtThreadSync >> >> More tests are pending. > > Richard Reingruber has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Merge branch 'master' into 8327990_jfr_enters_vm_without_wxwrite > - Set WXWrite at more locations > - Switch to WXWrite before entering the vm I think this patch makes sense, and does not compete with a long-term solution. ------------- Marked as reviewed by stuefe (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18238#pullrequestreview-1947069985 From ayang at openjdk.org Tue Mar 19 18:51:26 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 19 Mar 2024 18:51:26 GMT Subject: RFR: 8328350: G1: Remove DO_DISCOVERED_AND_DISCOVERY [v2] In-Reply-To: <7zNLvOMdYc1vmoKzb7wSgZOaugZixZQwInKpOWDZJF0=.3da8fc5a-879a-42d3-b7e8-ccc5949679b4@github.com> References: <7zNLvOMdYc1vmoKzb7wSgZOaugZixZQwInKpOWDZJF0=.3da8fc5a-879a-42d3-b7e8-ccc5949679b4@github.com> Message-ID: On Tue, 19 Mar 2024 14:05:42 GMT, Albert Mingkun Yang wrote: >> Remove a special ref-processing iteration mode for G1, because non-null `_discovered` field and requiring ref-discovery should never occur at the same time. >> >> Test: tier1-6 > > Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains two commits: > > - year > - g1-remove-DO_DISCOVERED_AND_DISCOVERY Thanks for review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18346#issuecomment-2007896581 From ayang at openjdk.org Tue Mar 19 18:51:27 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 19 Mar 2024 18:51:27 GMT Subject: Integrated: 8328350: G1: Remove DO_DISCOVERED_AND_DISCOVERY In-Reply-To: References: Message-ID: On Mon, 18 Mar 2024 11:45:22 GMT, Albert Mingkun Yang wrote: > Remove a special ref-processing iteration mode for G1, because non-null `_discovered` field and requiring ref-discovery should never occur at the same time. > > Test: tier1-6 This pull request has now been integrated. 
Changeset: 7231fd78 Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/7231fd78aada80946b680e10227a356714d5c34b Stats: 26 lines in 4 files changed: 0 ins; 22 del; 4 mod 8328350: G1: Remove DO_DISCOVERED_AND_DISCOVERY Reviewed-by: tschatzl, gli ------------- PR: https://git.openjdk.org/jdk/pull/18346 From mdoerr at openjdk.org Tue Mar 19 19:19:22 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 19 Mar 2024 19:19:22 GMT Subject: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3] In-Reply-To: <23RHDqAAB2cUnhC2j-H7g2X2KibSmk2IwKh4EisMfYo=.a6d727da-76c9-42d6-b2df-d2fd192bc229@github.com> References: <23RHDqAAB2cUnhC2j-H7g2X2KibSmk2IwKh4EisMfYo=.a6d727da-76c9-42d6-b2df-d2fd192bc229@github.com> Message-ID: On Mon, 18 Mar 2024 13:14:41 GMT, Richard Reingruber wrote: >> This pr changes `JfrJvmtiAgent::retransform_classes()` and `jfr_set_enabled` to switch to `WXWrite` before transitioning to the vm. >> >> Testing: >> make test TEST=jdk/jfr/event/runtime/TestClassLoadEvent.java TEST_VM_OPTS=-XX:+AssertWXAtThreadSync >> make test TEST=compiler/intrinsics/klass/CastNullCheckDroppingsTest.java TEST_VM_OPTS=-XX:+AssertWXAtThreadSync >> >> More tests are pending. > > Richard Reingruber has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Merge branch 'master' into 8327990_jfr_enters_vm_without_wxwrite > - Set WXWrite at more locations > - Switch to WXWrite before entering the vm +1 ------------- Marked as reviewed by mdoerr (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18238#pullrequestreview-1947160377 From tholenstein at openjdk.org Tue Mar 19 19:52:24 2024 From: tholenstein at openjdk.org (Tobias Holenstein) Date: Tue, 19 Mar 2024 19:52:24 GMT Subject: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3] In-Reply-To: <23RHDqAAB2cUnhC2j-H7g2X2KibSmk2IwKh4EisMfYo=.a6d727da-76c9-42d6-b2df-d2fd192bc229@github.com> References: <23RHDqAAB2cUnhC2j-H7g2X2KibSmk2IwKh4EisMfYo=.a6d727da-76c9-42d6-b2df-d2fd192bc229@github.com> Message-ID: On Mon, 18 Mar 2024 13:14:41 GMT, Richard Reingruber wrote: >> This pr changes `JfrJvmtiAgent::retransform_classes()` and `jfr_set_enabled` to switch to `WXWrite` before transitioning to the vm. >> >> Testing: >> make test TEST=jdk/jfr/event/runtime/TestClassLoadEvent.java TEST_VM_OPTS=-XX:+AssertWXAtThreadSync >> make test TEST=compiler/intrinsics/klass/CastNullCheckDroppingsTest.java TEST_VM_OPTS=-XX:+AssertWXAtThreadSync >> >> More tests are pending. > > Richard Reingruber has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Merge branch 'master' into 8327990_jfr_enters_vm_without_wxwrite > - Set WXWrite at more locations > - Switch to WXWrite before entering the vm Marked as reviewed by tholenstein (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/18238#pullrequestreview-1947241693 From aph at openjdk.org Tue Mar 19 20:30:26 2024 From: aph at openjdk.org (Andrew Haley) Date: Tue, 19 Mar 2024 20:30:26 GMT Subject: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3] In-Reply-To: <23RHDqAAB2cUnhC2j-H7g2X2KibSmk2IwKh4EisMfYo=.a6d727da-76c9-42d6-b2df-d2fd192bc229@github.com> References: <23RHDqAAB2cUnhC2j-H7g2X2KibSmk2IwKh4EisMfYo=.a6d727da-76c9-42d6-b2df-d2fd192bc229@github.com> Message-ID: On Mon, 18 Mar 2024 13:14:41 GMT, Richard Reingruber wrote: >> This pr changes `JfrJvmtiAgent::retransform_classes()` and `jfr_set_enabled` to switch to `WXWrite` before transitioning to the vm. >> >> Testing: >> make test TEST=jdk/jfr/event/runtime/TestClassLoadEvent.java TEST_VM_OPTS=-XX:+AssertWXAtThreadSync >> make test TEST=compiler/intrinsics/klass/CastNullCheckDroppingsTest.java TEST_VM_OPTS=-XX:+AssertWXAtThreadSync >> >> More tests are pending. > > Richard Reingruber has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Merge branch 'master' into 8327990_jfr_enters_vm_without_wxwrite > - Set WXWrite at more locations > - Switch to WXWrite before entering the vm Marked as reviewed by aph (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18238#pullrequestreview-1947320660 From aph at openjdk.org Tue Mar 19 20:30:27 2024 From: aph at openjdk.org (Andrew Haley) Date: Tue, 19 Mar 2024 20:30:27 GMT Subject: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3] In-Reply-To: References: <23RHDqAAB2cUnhC2j-H7g2X2KibSmk2IwKh4EisMfYo=.a6d727da-76c9-42d6-b2df-d2fd192bc229@github.com> Message-ID: On Tue, 19 Mar 2024 15:17:50 GMT, Thomas Stuefe wrote: > > Not necessarily. 
It may well remove some transitions from paths that don't need it, but if you move the state change too low down the call chain you could end up transitioning much more often in code that does need it e.g. if a transitioning method is called in a loop. > > Not if you do the switching lazily. The first iteration would switch to the needed state; subsequent iterations would not do anything since the state already matches. Unless you interleave writes and execs, but then you would need the state changes anyway. Exactly. You do the transition when it's needed. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18238#issuecomment-2008070652 From dholmes at openjdk.org Tue Mar 19 22:53:28 2024 From: dholmes at openjdk.org (David Holmes) Date: Tue, 19 Mar 2024 22:53:28 GMT Subject: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3] In-Reply-To: <23RHDqAAB2cUnhC2j-H7g2X2KibSmk2IwKh4EisMfYo=.a6d727da-76c9-42d6-b2df-d2fd192bc229@github.com> References: <23RHDqAAB2cUnhC2j-H7g2X2KibSmk2IwKh4EisMfYo=.a6d727da-76c9-42d6-b2df-d2fd192bc229@github.com> Message-ID: On Mon, 18 Mar 2024 13:14:41 GMT, Richard Reingruber wrote: >> This pr changes `JfrJvmtiAgent::retransform_classes()` and `jfr_set_enabled` to switch to `WXWrite` before transitioning to the vm. >> >> Testing: >> make test TEST=jdk/jfr/event/runtime/TestClassLoadEvent.java TEST_VM_OPTS=-XX:+AssertWXAtThreadSync >> make test TEST=compiler/intrinsics/klass/CastNullCheckDroppingsTest.java TEST_VM_OPTS=-XX:+AssertWXAtThreadSync >> >> More tests are pending. > > Richard Reingruber has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains three additional commits since the last revision: > > - Merge branch 'master' into 8327990_jfr_enters_vm_without_wxwrite > - Set WXWrite at more locations > - Switch to WXWrite before entering the vm Not ideal but it fixes a real problem. ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18238#pullrequestreview-1947550278 From cslucas at openjdk.org Tue Mar 19 23:18:55 2024 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Tue, 19 Mar 2024 23:18:55 GMT Subject: RFR: JDK-8316991: Reduce nullable allocation merges [v8] In-Reply-To: References: Message-ID: <_DCkU0ejXoVIyAy1q_anbEoXkSWtMHuczrTGitIi1X8=.3c39f56e-d4bf-4673-a325-c6ac02620e79@github.com> > ### Description > > Many, if not most, allocation merges (Phis) are nullable because they join object allocations with "NULL", or objects returned from method calls, etc. Please review this Pull Request that improves the Reduce Allocation Merge implementation so that it can reduce at least some of these allocation merges. > > Overall, the improvements are related to 1) making rematerialization of merges able to represent "NULL" objects, and 2) being able to reduce merges used by CmpP/N and CastPP. > > The approach to reducing CmpP/N and CastPP is pretty similar to that used in the `MemNode::split_through_phi` method: a clone of the node being split is added on each input of the Phi. I make use of `optimize_ptr_compare` and some type information to remove redundant CmpP and CastPP nodes. I added a bunch of ASCII diagrams illustrating what some of the more important methods are doing. > > ### Benchmarking > > **Note:** In some of these tests no reduction happens. I left them in to validate that no perf. regression happens in that case. > **Note 2:** Margin of error was negligible.
>
> | Benchmark                      | No RAM (ms/op) | Yes RAM (ms/op) |
> |--------------------------------|----------------|-----------------|
> | TestTrapAfterMerge             | 19.515         | 13.386          |
> | TestArgEscape                  | 33.165         | 33.254          |
> | TestCallTwoSide                | 70.547         | 69.427          |
> | TestCmpAfterMerge              | 16.400         | 2.984           |
> | TestCmpMergeWithNull_Second    | 27.204         | 27.293          |
> | TestCmpMergeWithNull           | 8.248          | 4.920           |
> | TestCondAfterMergeWithAllocate | 12.890         | 5.252           |
> | TestCondAfterMergeWithNull     | 6.265          | 5.078           |
> | TestCondLoadAfterMerge         | 12.713         | 5.163           |
> | TestConsecutiveSimpleMerge     | 30.863         | 4.068           |
> | TestDoubleIfElseMerge          | 16.069         | 2.444           |
> | TestEscapeInCallAfterMerge     | 23.111         | 22.924          |
> | TestGlobalEscape               | 14.459         | 14.425          |
> | TestIfElseInLoop               | 246.061        | 42.786          |
> | TestLoadAfterLoopAlias         | 45.808         | 45.812          |
> | TestLoadAfterTrap              | 28.370         | ...

Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 12 commits:

 - Catching up with master
 - Fix broken build.
 - Merge with origin/master
 - Update test/micro/org/openjdk/bench/vm/compiler/AllocationMerges.java Co-authored-by: Andrey Turbanov
 - Ammend previous fix & add repro tests.
 - Fix to prevent reducing already reduced Phi
 - Fix to prevent creating NULL ConNKlass constants.
 - Refrain from RAM of arrays and Phis controlled by Loop nodes.
 - Fix typo in test.
 - Fix build after merge.
 - ...
and 2 more: https://git.openjdk.org/jdk/compare/7231fd78...620f9da8 ------------- Changes: https://git.openjdk.org/jdk/pull/15825/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=15825&range=07 Stats: 2407 lines in 13 files changed: 2152 ins; 92 del; 163 mod Patch: https://git.openjdk.org/jdk/pull/15825.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15825/head:pull/15825 PR: https://git.openjdk.org/jdk/pull/15825 From pchilanomate at openjdk.org Wed Mar 20 02:33:22 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 20 Mar 2024 02:33:22 GMT Subject: RFR: 8328285: GetOwnedMonitorInfo functions should use JvmtiHandshake [v2] In-Reply-To: References: Message-ID: <6pArQVrMthR9Lwv2HkGzFhDz3pU28aIb0Vo-PDe6s5w=.89d12945-7151-4554-8897-94af619aec7d@github.com> On Tue, 19 Mar 2024 00:10:32 GMT, Serguei Spitsyn wrote: >> The `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. This enhancement is to refactor JVM TI functions `GetOwnedMonitorInfo` and `GetOwnedMonitorStackDepthInfo` on the base of `JvmtiHandshake and `JvmtiUnitedHandshakeClosure` classes. >> >> Testing: >> - Ran mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: correct one comment Looks good to me. src/hotspot/share/prims/jvmtiEnv.cpp line 1368: > 1366: if (java_thread != nullptr) { > 1367: Handle thread_handle(calling_thread, thread_oop); > 1368: EscapeBarrier eb(true, calling_thread, java_thread); I see that now we will execute the EscapeBarrier code for the vthread case too. We actually had to disable that for virtual threads since it doesn't work (8285739 and 8305625). But we only addressed the case when targeting all threads and not the single thread one. 
We would need to add the same check in EscapeBarrier::deoptimize_objects(int d1, int d2) to skip a thread with mounted continuation.

-------------

PR Review: https://git.openjdk.org/jdk/pull/18332#pullrequestreview-1947846211
PR Review Comment: https://git.openjdk.org/jdk/pull/18332#discussion_r1531444319

From jzhu at openjdk.org Wed Mar 20 03:55:33 2024 From: jzhu at openjdk.org (Joshua Zhu) Date: Wed, 20 Mar 2024 03:55:33 GMT Subject: RFR: 8326541: [AArch64] ZGC C2 load barrier stub considers the length of live registers when spilling registers [v4] In-Reply-To: References: Message-ID:

> Currently ZGC C2 load barrier stub saves the whole live register regardless of what size of register is live on aarch64.
> Considering the size of SVE register is an implementation-defined multiple of 128 bits, up to 2048 bits,
> even the use of a floating point may cause the maximum 2048 bits stack occupied.
> Hence I would like to introduce this change on aarch64: take the length of live registers into consideration in ZGC C2 load barrier stub.
>
> In a floating point case on 2048 bits SVE machine, the following ZLoadBarrierStubC2
>
>
> ......
> 0x0000ffff684cfad8: stp x15, x18, [sp, #80]
> 0x0000ffff684cfadc: sub sp, sp, #0x100
> 0x0000ffff684cfae0: str z16, [sp]
> 0x0000ffff684cfae4: add x1, x13, #0x10
> 0x0000ffff684cfae8: mov x0, x16
> ;; 0xFFFF803F5414
> 0x0000ffff684cfaec: mov x8, #0x5414 // #21524
> 0x0000ffff684cfaf0: movk x8, #0x803f, lsl #16
> 0x0000ffff684cfaf4: movk x8, #0xffff, lsl #32
> 0x0000ffff684cfaf8: blr x8
> 0x0000ffff684cfafc: mov x16, x0
> 0x0000ffff684cfb00: ldr z16, [sp]
> 0x0000ffff684cfb04: add sp, sp, #0x100
> 0x0000ffff684cfb08: ptrue p7.b
> 0x0000ffff684cfb0c: ldp x4, x5, [sp, #16]
> ......
>
>
> could be optimized into:
>
>
> ......
> 0x0000ffff684cfa50: stp x15, x18, [sp, #80]
> 0x0000ffff684cfa54: str d16, [sp, #-16]! // extra 8 bytes to align 16 bytes in push_fp()
> 0x0000ffff684cfa58: add x1, x13, #0x10
> 0x0000ffff684cfa5c: mov x0, x16
> ;; 0xFFFF7FA942A8
> 0x0000ffff684cfa60: mov x8, #0x42a8 // #17064
> 0x0000ffff684cfa64: movk x8, #0x7fa9, lsl #16
> 0x0000ffff684cfa68: movk x8, #0xffff, lsl #32
> 0x0000ffff684cfa6c: blr x8
> 0x0000ffff684cfa70: mov x16, x0
> 0x0000ffff684cfa74: ldr d16, [sp], #16
> 0x0000ffff684cfa78: ptrue p7.b
> 0x0000ffff684cfa7c: ldp x4, x5, [sp, #16]
> ......
>
>
> Besides the above benefit, when we know what size of register is live,
> we could remove the unnecessary caller save in ZGC C2 load barrier stub when we meet C-ABI SOE fp registers.
>
> Passed jtreg with option "-XX:+UseZGC -XX:+ZGenerational" with no failures introduced.

Joshua Zhu has updated the pull request incrementally with one additional commit since the last revision:

  Add more output for easy debugging once the jtreg test case fails

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/17977/files
  - new: https://git.openjdk.org/jdk/pull/17977/files/382866f7..f2960eb1

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=17977&range=03
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17977&range=02-03

  Stats: 21 lines in 1 file changed: 17 ins; 0 del; 4 mod
  Patch: https://git.openjdk.org/jdk/pull/17977.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/17977/head:pull/17977

PR: https://git.openjdk.org/jdk/pull/17977

From jzhu at openjdk.org Wed Mar 20 03:55:34 2024 From: jzhu at openjdk.org (Joshua Zhu) Date: Wed, 20 Mar 2024 03:55:34 GMT Subject: RFR: 8326541: [AArch64] ZGC C2 load barrier stub considers the length of live registers when spilling registers [v3] In-Reply-To: References: Message-ID:

On Mon, 18 Mar 2024 14:37:38 GMT, Stuart Monteith wrote:

> In the event of failure, would it be possible to print the erroneous output? The output from the subprocesses, being directly piped in, doesn't lend itself to easy debugging.
At first I thought there might be an option that could alter OutputAnalyzers output, but sadly not. Done. Thanks for your comments. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17977#discussion_r1531483040 From jwaters at openjdk.org Wed Mar 20 05:44:48 2024 From: jwaters at openjdk.org (Julian Waters) Date: Wed, 20 Mar 2024 05:44:48 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v7] In-Reply-To: References: Message-ID: > Compile the JDK as C++17, enabling the use of all C++17 language features Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 11 commits: - Merge branch 'master' into patch-7 - Require clang 13 in toolchain.m4 - Remove unnecessary -std=c++17 option in Lib.gmk - Merge branch 'openjdk:master' into patch-7 - Compiler versions in toolchain.m4 - Merge branch 'openjdk:master' into patch-7 - Merge branch 'openjdk:master' into patch-7 - Revert vm_version_linux_riscv.cpp - vm_version_linux_riscv.cpp - allocation.cpp - ... 
and 1 more: https://git.openjdk.org/jdk/compare/269163d5...9286a964 ------------- Changes: https://git.openjdk.org/jdk/pull/14988/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14988&range=06 Stats: 4 lines in 2 files changed: 0 ins; 1 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/14988.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14988/head:pull/14988 PR: https://git.openjdk.org/jdk/pull/14988 From sspitsyn at openjdk.org Wed Mar 20 08:14:22 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 20 Mar 2024 08:14:22 GMT Subject: RFR: 8328285: GetOwnedMonitorInfo functions should use JvmtiHandshake [v2] In-Reply-To: <6pArQVrMthR9Lwv2HkGzFhDz3pU28aIb0Vo-PDe6s5w=.89d12945-7151-4554-8897-94af619aec7d@github.com> References: <6pArQVrMthR9Lwv2HkGzFhDz3pU28aIb0Vo-PDe6s5w=.89d12945-7151-4554-8897-94af619aec7d@github.com> Message-ID: On Wed, 20 Mar 2024 02:10:28 GMT, Patricio Chilano Mateo wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> review: correct one comment > > src/hotspot/share/prims/jvmtiEnv.cpp line 1368: > >> 1366: if (java_thread != nullptr) { >> 1367: Handle thread_handle(calling_thread, thread_oop); >> 1368: EscapeBarrier eb(true, calling_thread, java_thread); > > I see that now we will execute the EscapeBarrier code for the vthread case too. We actually had to disable that for virtual threads since it doesn't work (8285739 and 8305625). But we only addressed the case when targeting all threads and not the single thread one. We would need to add the same check in EscapeBarrier::deoptimize_objects(int d1, int d2) to skip a thread with mounted continuation. Good suggestion, thanks! Would the following fix work ? 
:

git diff
diff --git a/src/hotspot/share/runtime/escapeBarrier.cpp b/src/hotspot/share/runtime/escapeBarrier.cpp
index bc01d900285..1b6d57644dc 100644
--- a/src/hotspot/share/runtime/escapeBarrier.cpp
+++ b/src/hotspot/share/runtime/escapeBarrier.cpp
@@ -75,7 +75,7 @@ bool EscapeBarrier::deoptimize_objects(int d1, int d2) {
     // These frames are about to be removed. We must not interfere with that and signal failure.
     return false;
   }
-  if (deoptee_thread()->has_last_Java_frame()) {
+  if (deoptee_thread()->has_last_Java_frame() && deoptee_thread()->last_continuation() == nullptr) {
     assert(calling_thread() == Thread::current(), "should be");
     KeepStackGCProcessedMark ksgcpm(deoptee_thread());
     ResourceMark rm(calling_thread());

BTW, it seems the `PopFrame` and the `ForceEarlyReturn` are broken the same way and, I hope, the fix above should solve the issue.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18332#discussion_r1531653123

From sspitsyn at openjdk.org Wed Mar 20 08:53:49 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 20 Mar 2024 08:53:49 GMT Subject: RFR: 8328285: GetOwnedMonitorInfo functions should use JvmtiHandshake [v3] In-Reply-To: References: Message-ID:

> The `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. This enhancement is to refactor JVM TI functions `GetOwnedMonitorInfo` and `GetOwnedMonitorStackDepthInfo` on the base of `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes.
>
> Testing:
> - Ran mach5 tiers 1-6

Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains four additional commits since the last revision: - Merge - review: correct one comment - added a couple of more fields to the JvmtiUnifiedHaandshakeClosure - 8328285: GetOwnedMonitorInfo functions should use JvmtiHandshake ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18332/files - new: https://git.openjdk.org/jdk/pull/18332/files/78b7c894..ebbf0a0f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18332&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18332&range=01-02 Stats: 294199 lines in 471 files changed: 6301 ins; 5813 del; 282085 mod Patch: https://git.openjdk.org/jdk/pull/18332.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18332/head:pull/18332 PR: https://git.openjdk.org/jdk/pull/18332 From sjayagond at openjdk.org Wed Mar 20 09:13:21 2024 From: sjayagond at openjdk.org (Sidraya Jayagond) Date: Wed, 20 Mar 2024 09:13:21 GMT Subject: RFR: 8327652: S390x: Implements SLP support [v4] In-Reply-To: References: Message-ID: On Tue, 19 Mar 2024 15:15:39 GMT, Martin Doerr wrote: >> Sidraya Jayagond has updated the pull request incrementally with one additional commit since the last revision: >> >> Use Op_VecX instead of Op_RegF >> >> Signed-off-by: Sidraya > > src/hotspot/cpu/s390/sharedRuntime_s390.cpp line 487: > >> 485: void RegisterSaver::restore_live_registers(MacroAssembler* masm, RegisterSet reg_set, bool save_vectors) { >> 486: int offset; >> 487: //const int register_save_offset = live_reg_frame_size(reg_set) - live_reg_save_size(reg_set); > > Do we need to keep this commented code? No, I delete it. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1531721478 From sjayagond at openjdk.org Wed Mar 20 09:27:23 2024 From: sjayagond at openjdk.org (Sidraya Jayagond) Date: Wed, 20 Mar 2024 09:27:23 GMT Subject: RFR: 8327652: S390x: Implements SLP support [v4] In-Reply-To: References: Message-ID: On Tue, 19 Mar 2024 15:21:37 GMT, Martin Doerr wrote: >> Sidraya Jayagond has updated the pull request incrementally with one additional commit since the last revision: >> >> Use Op_VecX instead of Op_RegF >> >> Signed-off-by: Sidraya > > src/hotspot/cpu/s390/s390.ad line 7473: > >> 7471: // Sqrt float precision >> 7472: instruct sqrtF_reg(regF dst, regF src) %{ >> 7473: match(Set dst (SqrtF src)); > > Did the old code cause problems? Required nodes not matched? Maybe this should get fixed in a separate PR for backports? [JDK-8308277](https://bugs.openjdk.org/browse/JDK-8308277) This one was done for the RISCV platform. If I do not make this change, vectorization of float square root (Math.sqrt()) does not happen on s390x. Unlike RISCV, I did not see the ConvF2D -> ConvD2F IR node sequence on s390x. Shall I raise a separate PR for this? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1531741200 From erik.osterlund at oracle.com Wed Mar 20 09:35:11 2024 From: erik.osterlund at oracle.com (Erik Osterlund) Date: Wed, 20 Mar 2024 09:35:11 +0000 Subject: RFC: JEP: CDS Archived Object Streaming Message-ID: <7602C9F9-5381-4E05-BB99-EB0F9D12FCCA@oracle.com> Hi, I have written a draft JEP for a new GC-agnostic CDS object archiving mechanism that we intend to use at least in ZGC. JEP description is available here: https://bugs.openjdk.org/browse/JDK-8326035 Comments and feedback are welcome. Thanks, /Erik
From mdoerr at openjdk.org Wed Mar 20 09:40:22 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 20 Mar 2024 09:40:22 GMT Subject: RFR: 8327652: S390x: Implements SLP support [v4] In-Reply-To: References: Message-ID: <1-JV6-zzCzVdCPfB02iYyYsouYehl0dAg4f2vzUIRpM=.f6825758-8e65-4cfb-bb04-8b4d698088da@github.com> On Wed, 20 Mar 2024 09:25:12 GMT, Sidraya Jayagond wrote: >> src/hotspot/cpu/s390/s390.ad line 7473: >> >>> 7471: // Sqrt float precision >>> 7472: instruct sqrtF_reg(regF dst, regF src) %{ >>> 7473: match(Set dst (SqrtF src)); >> >> Did the old code cause problems? Required nodes not matched? Maybe this should get fixed in a separate PR for backports? > > [JDK-8308277](https://bugs.openjdk.org/browse/JDK-8308277) This one was done for the RISCV platform. If I do not make this change, vectorization of float square root (Math.sqrt()) does not happen on s390x. Unlike RISCV, I did not see the ConvF2D -> ConvD2F IR node sequence on s390x. > Shall I raise a separate PR for this? I think doing this in a separate PR makes sense. You may want to backport it to 21 and/or 17. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1531759589 From sjayagond at openjdk.org Wed Mar 20 09:45:21 2024 From: sjayagond at openjdk.org (Sidraya Jayagond) Date: Wed, 20 Mar 2024 09:45:21 GMT Subject: RFR: 8327652: S390x: Implements SLP support [v4] In-Reply-To: <1-JV6-zzCzVdCPfB02iYyYsouYehl0dAg4f2vzUIRpM=.f6825758-8e65-4cfb-bb04-8b4d698088da@github.com> References: <1-JV6-zzCzVdCPfB02iYyYsouYehl0dAg4f2vzUIRpM=.f6825758-8e65-4cfb-bb04-8b4d698088da@github.com> Message-ID: On Wed, 20 Mar 2024 09:37:34 GMT, Martin Doerr wrote: >> [JDK-8308277](https://bugs.openjdk.org/browse/JDK-8308277) This one was done for the RISCV platform. If I do not make this change, vectorization of float square root (Math.sqrt()) does not happen on s390x. Unlike RISCV, I did not see the ConvF2D -> ConvD2F IR node sequence on s390x. >> Shall I raise a separate PR for this?
> > I think doing this in a separate PR makes sense. You may want to backport it to 21 and/or 17. I raise separate PR. Thank you ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1531767102 From tschatzl at openjdk.org Wed Mar 20 09:52:20 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 20 Mar 2024 09:52:20 GMT Subject: RFR: 8328101: Parallel: Obsolete ParallelOldDeadWoodLimiterMean and ParallelOldDeadWoodLimiterStdDev [v2] In-Reply-To: References: Message-ID: On Tue, 19 Mar 2024 12:11:40 GMT, Albert Mingkun Yang wrote: >> Simple refactoring Parallel full-gc dead word calculation and removing two jvm flags. >> >> Test: tier1-3 > > Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains one commit: > > pgc-dead-ratio Changes requested by tschatzl (Reviewer). src/hotspot/share/gc/shared/gc_globals.hpp line 623: > 621: "value. " \ > 622: "Parallel full gc treats this as maximum value, i.e. a non-fully" \ > 623: "compact full gc cycle wastes at most this value of space. " \ Suggestion: "Parallel full gc treats this as maximum value, i.e. when " \ "allowing dead wood, Parallel full gc wastes at most this amount "\ "of space." \ I admit I did not fully understand the original description, so this is my best guess what had been meant. First, "non-fully compact full gc cycle" sounds very awkward as "non-fully" isn't defined anywhere, and what is a full gc "cycle" wrt Parallel GC? Also s/value/amount. Maybe also change the term "mark sweep" for Serial to "full gc" above as well because it seems to be unnecessary to refer to the algorithm here. 
------------- PR Review: https://git.openjdk.org/jdk/pull/18278#pullrequestreview-1948378420 PR Review Comment: https://git.openjdk.org/jdk/pull/18278#discussion_r1531769243 From ayang at openjdk.org Wed Mar 20 10:05:33 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 20 Mar 2024 10:05:33 GMT Subject: RFR: 8328101: Parallel: Obsolete ParallelOldDeadWoodLimiterMean and ParallelOldDeadWoodLimiterStdDev [v3] In-Reply-To: References: Message-ID: > Simple refactoring Parallel full-gc dead word calculation and removing two jvm flags. > > Test: tier1-3 Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: review ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18278/files - new: https://git.openjdk.org/jdk/pull/18278/files/d60f12e5..013f3c05 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18278&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18278&range=01-02 Stats: 4 lines in 1 file changed: 1 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/18278.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18278/head:pull/18278 PR: https://git.openjdk.org/jdk/pull/18278 From ayang at openjdk.org Wed Mar 20 10:05:34 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 20 Mar 2024 10:05:34 GMT Subject: RFR: 8328101: Parallel: Obsolete ParallelOldDeadWoodLimiterMean and ParallelOldDeadWoodLimiterStdDev [v2] In-Reply-To: References: Message-ID: On Wed, 20 Mar 2024 09:43:52 GMT, Thomas Schatzl wrote: >> Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains one additional commit since the last revision: >> >> pgc-dead-ratio > > src/hotspot/share/gc/shared/gc_globals.hpp line 623: > >> 621: "value. " \ >> 622: "Parallel full gc treats this as maximum value, i.e. 
a non-fully" \ >> 623: "compact full gc cycle wastes at most this value of space. " \ > > Suggestion: > > "Parallel full gc treats this as maximum value, i.e. when " \ > "allowing dead wood, Parallel full gc wastes at most this amount "\ > "of space." \ > > I admit I did not fully understand the original description, so this is my best guess what had been meant. First, "non-fully compact full gc cycle" sounds very awkward as "non-fully" isn't defined anywhere, and what is a full gc "cycle" wrt Parallel GC? Also s/value/amount. > > Maybe also change the term "mark sweep" for Serial to "full gc" above as well because it seems to be unnecessary to refer to the algorithm here. I meant non-"MaximumCompaction". ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18278#discussion_r1531795667 From shade at openjdk.org Wed Mar 20 10:43:34 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 20 Mar 2024 10:43:34 GMT Subject: RFR: 8316180: Thread-local backoff for secondary_super_cache updates [v14] In-Reply-To: References: Message-ID: On Mon, 8 Jan 2024 19:32:47 GMT, Aleksey Shipilev wrote: >> See more details in the bug and related issues. >> >> This is the attempt to mitigate [JDK-8180450](https://bugs.openjdk.org/browse/JDK-8180450), while the more complex fix that would obviate the need for `secondary_super_cache` is being worked out. The goal for this fix is to improve performance in pathological cases, while keeping non-pathological cases out of extra risk, *and* staying simple enough and reliable for backports to currently supported JDK releases. >> >> This implements mitigation on most current architectures: >> - ? x86_64: implemented >> - ? x86_32: considered, abandoned; cannot be easily done without blowing up code size >> - ? AArch64: implemented >> - ? ARM32: considered, abandoned; needs cleanups and testing; see [JDK-8318414](https://bugs.openjdk.org/browse/JDK-8318414) >> - ? PPC64: implemented, thanks @TheRealMDoerr >> - ? 
S390: implemented, thanks @offamitkumar >> - ? RISC-V: implemented, thanks @RealFYang >> - ? Zero: does not need implementation >> >> Note that the code is supposed to be rather compact, because it is inlined in generated code. That is why, for example, we cannot easily do x86_32 version: we need a thread, so the easiest way would be to call into VM. But we cannot that easily: the code blowout would make some forward branches in external code non-short. I think we we cannot implement this mitigation on some architectures, so be it, it would be a sensible tradeoff for simplicity. >> >> Setting backoff at `0` effectively disables the mitigation, and gives us safety hatch if something goes wrong. >> >> I believe we can go in with `1000` as the default, given the experimental results mentioned in this PR. >> >> Additional testing: >> - [x] Linux x86_64 fastdebug, `tier1 tier2 tier3` >> - [x] Linux AArch64 fastdebug, `tier1 tier2 tier3` > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 31 commits: > > - Merge branch 'master' into JDK-8316180-backoff-secondary-super > - AArch64: Trying to backoff only for actual back-to-back updates > - Merge branch 'master' into JDK-8316180-backoff-secondary-super > - Improve benchmarks > - Merge branch 'master' into JDK-8316180-backoff-secondary-super > - Editorial cleanups > - RISC-V implementation > - Mention ARM32 bug > - Make sure benchmark runs with C1 > - Merge branch 'master' into JDK-8316180-backoff-secondary-super > - ... and 21 more: https://git.openjdk.org/jdk/compare/387828a3...d4254b67 Now that [JDK-8180450](https://bugs.openjdk.org/browse/JDK-8180450) has a viable path forward, I see no point in continuing with this one. It has observable regressions on some workloads that are hard to overcome in a simple manner. Closing as WNF. Will reopen if JDK-8180450 is not in. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/15718#issuecomment-2009247371 From tschatzl at openjdk.org Wed Mar 20 11:44:21 2024 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 20 Mar 2024 11:44:21 GMT Subject: RFR: 8328101: Parallel: Obsolete ParallelOldDeadWoodLimiterMean and ParallelOldDeadWoodLimiterStdDev [v3] In-Reply-To: References: Message-ID: <2IY_A3IpWv1QN2LVWcrcALIbAts7GRdnWckz8ybKtxo=.5fb7edac-dac7-459d-859f-0d94026adcad@github.com> On Wed, 20 Mar 2024 10:05:33 GMT, Albert Mingkun Yang wrote: >> Simple refactoring Parallel full-gc dead word calculation and removing two jvm flags. >> >> Test: tier1-3 > > Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: > > review Marked as reviewed by tschatzl (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18278#pullrequestreview-1948647911 From mli at openjdk.org Wed Mar 20 12:00:22 2024 From: mli at openjdk.org (Hamlin Li) Date: Wed, 20 Mar 2024 12:00:22 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v2] In-Reply-To: References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> Message-ID: On Fri, 15 Mar 2024 11:25:37 GMT, Hamlin Li wrote: > > * There is no way to stop the VM from trying to load vmath ? > > No official way, but deleting libvmath.so will have a same effect. I'm not sure if avoiding loading vmath is necessary for typical users, if it turns out to be true, we can add it later. In previous implementation of introducing SVML into jdk on x86, it introduced a option `UseVectorStubs` which can be used to avoid using libvmath(either svml or sleef), i.e. user can fall back to java implementation if they want. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-2009399949 From luhenry at openjdk.org Wed Mar 20 13:13:22 2024 From: luhenry at openjdk.org (Ludovic Henry) Date: Wed, 20 Mar 2024 13:13:22 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4] In-Reply-To: <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> Message-ID: On Fri, 15 Mar 2024 13:58:05 GMT, Hamlin Li wrote: >> Hi, >> Can you help to review this patch? >> Thanks >> >> This is a continuation of work based on [1] by @XiaohongGong, most work was done in that pr. In this new pr, just rebased the code in [1], then added some minor changes (renaming of calling method, add libsleef as extra lib in CI cross-build on aarch64 in github workflow); I aslo tested the combination of following scenarios: >> * at build time >> * with/without sleef >> * with/without sve support >> * at runtime >> * with/without sleef >> * with/without sve support >> >> [1] https://github.com/openjdk/jdk/pull/16234 >> >> ## Regression Test >> * test/jdk/jdk/incubator/vector/ >> * test/hotspot/jtreg/compiler/vectorapi/ >> >> ## Performance Test >> Previously, @XiaohongGong has shared the data: https://github.com/openjdk/jdk/pull/16234#issuecomment-1767727028 > > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > fix jni includes The licensing [[1]](https://github.com/shibatch/sleef/blob/master/LICENSE.txt) seems to allow us [[2]](https://www.tldrlegal.com/license/boost-software-license-1-0-explained) to copy/paste the source code into the JDK. We would only need to include the copyright and license, which are both no-brainers. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-2009534637 From rrich at openjdk.org Wed Mar 20 13:55:22 2024 From: rrich at openjdk.org (Richard Reingruber) Date: Wed, 20 Mar 2024 13:55:22 GMT Subject: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3] In-Reply-To: <6SKAa4aj0VpZeAKntqNZ_Nr4pQsCp0o3zXsph-0Wa98=.b7be4d1f-465c-4edc-b9f1-1b6589271442@github.com> References: <6SKAa4aj0VpZeAKntqNZ_Nr4pQsCp0o3zXsph-0Wa98=.b7be4d1f-465c-4edc-b9f1-1b6589271442@github.com> Message-ID: <36GRb42wLD9-BacUH8qpyJHrKsuz5aN6WUdg6KwynnI=.c30ca337-98fd-43f4-9cb5-75968a3b265f@github.com> On Tue, 19 Mar 2024 17:46:31 GMT, Tobias Holenstein wrote: >> I've asked this myself (after making the change). >> Being in `WXWrite` mode would be wrong if the thread would execute dynamically generated code. There's not too much happening outside the scope of the `ThreadInVMfromNative` instances. I see jni calls (`GetObjectArrayElement`, `ExceptionOccurred`) and a jvmti call (`RetransformClasses`) but these are safe because the callees enter the vm right away. We even avoid switching to `WXWrite` and back there. >> So I thought it would be ok to coarsen the WXMode switching. >> But maybe it's still better to avoid any risk especially since there's likely no performance effect. > > Or could the `ThreadInVMfromNative tvmfn(THREAD);` in `check_exception_and_log` be move out to `JfrJvmtiAgent::retransform_classes`? And then only use one `ThreadInVMfromNative` in `JfrJvmtiAgent::retransform_classes` I think this would require hoisting the `ThreadInVMfromNative` out of the loop with the `check_exception_and_log` call but then the thread would be in `_thread_in_vm` when doing the `GetObjectArrayElement` jni call which would be wrong. 
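
The trade-off debated in this thread for 8327990 (where to place the W^X mode switch, and whether doing it lazily avoids repeated transitions in a loop) can be sketched roughly as follows. This is only an illustration under stated assumptions: `WXState`, `LazyWX`, and the two counting functions are hypothetical names invented here, not HotSpot's actual WXMode handling, and the "switch" is just a counter rather than a real page-protection change.

```cpp
// Hypothetical model of lazy W^X switching; not HotSpot code.
enum class WXState { Write, Exec };

struct LazyWX {
  WXState current = WXState::Exec;  // assume threads start in Exec mode
  int switches = 0;                 // number of real (costly) transitions

  // Switch only if the current state does not already match.
  void require(WXState wanted) {
    if (current != wanted) {
      current = wanted;
      ++switches;
    }
  }
};

// A loop that needs Write on every iteration: only the first
// iteration pays for a transition.
int count_switches_in_loop(int iterations) {
  LazyWX wx;
  for (int i = 0; i < iterations; ++i) {
    wx.require(WXState::Write);
  }
  return wx.switches;
}

// A loop that interleaves writes and execs: every request is a
// real transition, which would be needed in any scheme.
int count_switches_interleaved(int iterations) {
  LazyWX wx;
  for (int i = 0; i < iterations; ++i) {
    wx.require(WXState::Write);
    wx.require(WXState::Exec);
  }
  return wx.switches;
}
```

Under this model the loop case costs one transition regardless of the iteration count, while the interleaved case costs two per iteration, which matches the argument that lazy switching only pays where the state changes are unavoidable.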
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18238#discussion_r1532121487 From rrich at openjdk.org Wed Mar 20 14:06:04 2024 From: rrich at openjdk.org (Richard Reingruber) Date: Wed, 20 Mar 2024 14:06:04 GMT Subject: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v4] In-Reply-To: References: Message-ID: > This pr changes `JfrJvmtiAgent::retransform_classes()` and `jfr_set_enabled` to switch to `WXWrite` before transitioning to the vm. > > Testing: > make test TEST=jdk/jfr/event/runtime/TestClassLoadEvent.java TEST_VM_OPTS=-XX:+AssertWXAtThreadSync > make test TEST=compiler/intrinsics/klass/CastNullCheckDroppingsTest.java TEST_VM_OPTS=-XX:+AssertWXAtThreadSync > > More tests are pending. Richard Reingruber has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: - Merge branch 'master' into 8327990_jfr_enters_vm_without_wxwrite - 2 more locations - Merge branch 'master' into 8327990_jfr_enters_vm_without_wxwrite - Set WXWrite at more locations - Switch to WXWrite before entering the vm ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18238/files - new: https://git.openjdk.org/jdk/pull/18238/files/1b4d7606..8239638b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18238&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18238&range=02-03 Stats: 6136 lines in 243 files changed: 2379 ins; 2538 del; 1219 mod Patch: https://git.openjdk.org/jdk/pull/18238.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18238/head:pull/18238 PR: https://git.openjdk.org/jdk/pull/18238 From mdoerr at openjdk.org Wed Mar 20 14:06:04 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 20 Mar 2024 14:06:04 GMT Subject: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v4] 
In-Reply-To: References: Message-ID: On Wed, 20 Mar 2024 14:02:23 GMT, Richard Reingruber wrote: >> This pr changes `JfrJvmtiAgent::retransform_classes()` and `jfr_set_enabled` to switch to `WXWrite` before transitioning to the vm. >> >> Testing: >> make test TEST=jdk/jfr/event/runtime/TestClassLoadEvent.java TEST_VM_OPTS=-XX:+AssertWXAtThreadSync >> make test TEST=compiler/intrinsics/klass/CastNullCheckDroppingsTest.java TEST_VM_OPTS=-XX:+AssertWXAtThreadSync >> >> More tests are pending. > > Richard Reingruber has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: > > - Merge branch 'master' into 8327990_jfr_enters_vm_without_wxwrite > - 2 more locations > - Merge branch 'master' into 8327990_jfr_enters_vm_without_wxwrite > - Set WXWrite at more locations > - Switch to WXWrite before entering the vm Good catch! ------------- Marked as reviewed by mdoerr (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18238#pullrequestreview-1948978584 From jsjolen at openjdk.org Wed Mar 20 15:05:09 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Wed, 20 Mar 2024 15:05:09 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT Message-ID: Hi, This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix. 
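The situation described above, memory that lives in a file and becomes addressable only once it is mapped, can be sketched as follows. This is a minimal illustration assuming a POSIX system; `tmpfile()` stands in for a memory-backed file (the PR text does not say how ZGC creates its backing file), and the function name is hypothetical. The same file offset is mapped at two virtual addresses, so the committed bytes belong to the file and not to either address range.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>
#include <sys/mman.h>
#include <unistd.h>

// Maps offset 0 of one file at two virtual addresses and returns true if a
// write through the first view is visible through the second view.
bool two_views_share_backing_memory() {
  const size_t size = static_cast<size_t>(sysconf(_SC_PAGESIZE));

  FILE* file = tmpfile();  // assumption: stands in for a memory-backed file
  if (file == nullptr) {
    return false;
  }
  const int fd = fileno(file);
  if (ftruncate(fd, static_cast<off_t>(size)) != 0) {  // "commit" one page in the file
    fclose(file);
    return false;
  }

  // Two independent mappings of the same file range.
  void* view_a = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  void* view_b = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

  bool shared = false;
  if (view_a != MAP_FAILED && view_b != MAP_FAILED) {
    strcpy(static_cast<char*>(view_a), "hello");
    shared = (view_a != view_b) &&
             strcmp(static_cast<char*>(view_b), "hello") == 0;
  }

  if (view_a != MAP_FAILED) munmap(view_a, size);
  if (view_b != MAP_FAILED) munmap(view_b, size);
  fclose(file);
  return shared;
}
```

A tracker keyed only on virtual addresses would have to pick one of the two views or count the page twice, which is presumably why the patch tracks commits by file offset.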
## `MemoryFileTracker`

The `MemoryFileTracker` adds the ability to add new virtual memory address spaces to NMT and to commit memory to them. The basic API is:

```c++
static MemoryFile* make_device(const char* descriptive_name);
static void free_device(MemoryFile* device);

static void allocate_memory(MemoryFile* device, size_t offset, size_t size,
                            MEMFLAGS flag, const NativeCallStack& stack);
static void free_memory(MemoryFile* device, size_t offset, size_t size);
```

It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does:

```c++
void ZNMT::reserve(zaddress_unsafe start, size_t size) {
  MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap);
}
void ZNMT::commit(zoffset offset, size_t size) {
  MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC);
}
void ZNMT::uncommit(zoffset offset, size_t size) {
  MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size);
}

void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) {
  // NMT doesn't track mappings at the moment.
}
void ZNMT::unmap(zaddress_unsafe addr, size_t size) {
  // NMT doesn't track mappings at the moment.
}
```

As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory; the device-allocated memory is reported separately. When performing summary reporting, any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory.

This patch also acts as a base on which we deploy multiple new backend ideas to NMT. These ideas are:

1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, and we'd like to move away from it. Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this is due to Thomas Stüfe, kudos to him.
2. Separate storage of the `NativeCallStack`s (`NCS`) in a closed addressing hashtable. `NCS`s are large, and relatively few distinct ones occur many times, so caching them makes sense. `NCS`s are never deleted and the bucket array is never resized.

# Example detailed output

We add a section `Memory file details` when producing a detailed report.

    Memory file details

    Memory map of ZGC heap backing file

    [0x0000000000000000 - 0x0000000065200000] allocated 1696595968 for Java Heap
        [0x00007f3ecbdb8aea]ZPhysicalMemoryManager::commit(ZPhysicalMemory&)+0x9a
        [0x00007f3ecbdb211b]ZPageAllocator::commit_page(ZPage*)+0x33
        [0x00007f3ecbdb2ad5]ZPageAllocator::alloc_page_finalize(ZPageAllocation*)+0x7b
        [0x00007f3ecbdb2c1b]ZPageAllocator::alloc_page(ZPageType, unsigned long, ZAllocationFlags, ZPageAge)+0xc7

# Missing features

There is no detailed diffing of this data.

-------------

Commit messages:
 - Use file instead of device
 - Add in atomic.hpp
 - Make some simplifications to TestZNMT
 - Use own PlatformMutex instead
 - Introduce a separate Mutex for MemoryFileTracker
 - Remove comment
 - Delete some code
 - Make addr_cmp belong to VMATree
 - Fix import ordering and adjust printing
 - Revert changes here
 - ...
and 19 more: https://git.openjdk.org/jdk/compare/ae5e3fdd...08446455 Changes: https://git.openjdk.org/jdk/pull/18289/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8312132 Stats: 1473 lines in 19 files changed: 1370 ins; 84 del; 19 mod Patch: https://git.openjdk.org/jdk/pull/18289.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289 PR: https://git.openjdk.org/jdk/pull/18289 From jsjolen at openjdk.org Wed Mar 20 15:17:52 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Wed, 20 Mar 2024 15:17:52 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v2] In-Reply-To: References: Message-ID: > Hi, > > This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix. 
> > ## `MemoryFileTracker` > > The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: > > ```c++ > static MemoryFile* make_device(const char* descriptive_name); > static void free_device(MemoryFile* device); > > static void allocate_memory(MemoryFile* device, size_t offset, size_t size, > MEMFLAGS flag, const NativeCallStack& stack); > static void free_memory(MemoryFile* device, size_t offset, size_t size); > > > It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: > > ```c++ > void ZNMT::reserve(zaddress_unsafe start, size_t size) { > MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); > } > void ZNMT::commit(zoffset offset, size_t size) { > MemTracker::allocate_memory_in(ZNMT::_device, static_cast(offset), size, mtJavaHeap, CALLER_PC); > } > void ZNMT::uncommit(zoffset offset, size_t size) { > MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); > } > > void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { > // NMT doesn't track mappings at the moment. > } > void ZNMT::unmap(zaddress_unsafe addr, size_t size) { > // NMT doesn't track mappings at the moment. > } > > > As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. > > This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: > > 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. 
Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this... Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: Include the mutex header ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18289/files - new: https://git.openjdk.org/jdk/pull/18289/files/08446455..2751ba89 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=00-01 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18289.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289 PR: https://git.openjdk.org/jdk/pull/18289 From jsjolen at openjdk.org Wed Mar 20 15:51:07 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Wed, 20 Mar 2024 15:51:07 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v3] In-Reply-To: References: Message-ID: > Hi, > > This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix.
> > ## `MemoryFileTracker` > > The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: > > ```c++ > static MemoryFile* make_device(const char* descriptive_name); > static void free_device(MemoryFile* device); > > static void allocate_memory(MemoryFile* device, size_t offset, size_t size, > MEMFLAGS flag, const NativeCallStack& stack); > static void free_memory(MemoryFile* device, size_t offset, size_t size); > > > It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: > > ```c++ > void ZNMT::reserve(zaddress_unsafe start, size_t size) { > MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); > } > void ZNMT::commit(zoffset offset, size_t size) { > MemTracker::allocate_memory_in(ZNMT::_device, static_cast(offset), size, mtJavaHeap, CALLER_PC); > } > void ZNMT::uncommit(zoffset offset, size_t size) { > MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); > } > > void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { > // NMT doesn't track mappings at the moment. > } > void ZNMT::unmap(zaddress_unsafe addr, size_t size) { > // NMT doesn't track mappings at the moment. > } > > > As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. > > This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: > > 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. 
Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this... Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: ??? ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18289/files - new: https://git.openjdk.org/jdk/pull/18289/files/2751ba89..4f761352 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=01-02 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18289.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289 PR: https://git.openjdk.org/jdk/pull/18289 From stefank at openjdk.org Wed Mar 20 15:51:08 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 20 Mar 2024 15:51:08 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v2] In-Reply-To: References: Message-ID: On Wed, 20 Mar 2024 15:17:52 GMT, Johan Sjölen wrote: >> Hi, >> >> This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix.
>> >> ## `MemoryFileTracker` >> >> The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: >> >> ```c++ >> static MemoryFile* make_device(const char* descriptive_name); >> static void free_device(MemoryFile* device); >> >> static void allocate_memory(MemoryFile* device, size_t offset, size_t size, >> MEMFLAGS flag, const NativeCallStack& stack); >> static void free_memory(MemoryFile* device, size_t offset, size_t size); >> >> >> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: >> >> ```c++ >> void ZNMT::reserve(zaddress_unsafe start, size_t size) { >> MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); >> } >> void ZNMT::commit(zoffset offset, size_t size) { >> MemTracker::allocate_memory_in(ZNMT::_device, static_cast(offset), size, mtJavaHeap, CALLER_PC); >> } >> void ZNMT::uncommit(zoffset offset, size_t size) { >> MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); >> } >> >> void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { >> // NMT doesn't track mappings at the moment. >> } >> void ZNMT::unmap(zaddress_unsafe addr, size_t size) { >> // NMT doesn't track mappings at the moment. >> } >> >> >> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. >> >> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: >> >> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. 
Our Treap-based approach in this patch gives a performance bo... > > Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: > > Include the mutex header Nice! I went over the ZGC files and requested a few style updates. src/hotspot/share/gc/z/zInitialize.cpp line 36: > 34: #include "gc/z/zMarkStackAllocator.hpp" > 35: #include "gc/z/zNUMA.hpp" > 36: #include "gc/z/zNMT.hpp" Sort order src/hotspot/share/gc/z/zNMT.cpp line 37: > 35: > 36: void ZNMT::reserve(zaddress_unsafe start, size_t size) { > 37: MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); I'd prefer if we scale away the zaddress_unsafe type before casting to a pointer: Suggestion: MemTracker::record_virtual_memory_reserve((address)untype(start), size, CALLER_PC, mtJavaHeap); src/hotspot/share/gc/z/zNMT.cpp line 40: > 38: } > 39: void ZNMT::commit(zoffset offset, size_t size) { > 40: MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC); Suggestion: MemTracker::allocate_memory_in(ZNMT::_device, untype(offset), size, mtJavaHeap, CALLER_PC); src/hotspot/share/gc/z/zNMT.cpp line 42: > 40: MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC); > 41: } > 42: void ZNMT::uncommit(zoffset offset, size_t size) { Please, add a blank line between the functions. src/hotspot/share/gc/z/zNMT.cpp line 43: > 41: } > 42: void ZNMT::uncommit(zoffset offset, size_t size) { > 43: MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); Suggestion: MemTracker::free_memory_in(ZNMT::_device, untype(offset), size); src/hotspot/share/gc/z/zNMT.cpp line 49: > 47: // NMT doesn't track mappings at the moment.
> 48: } > 49: void ZNMT::unmap(zaddress_unsafe addr, size_t size) { blankline src/hotspot/share/gc/z/zNMT.hpp line 34: > 32: #include "utilities/globalDefinitions.hpp" > 33: #include "utilities/nativeCallStack.hpp" > 34: #include "nmt/memTracker.hpp" Sort order src/hotspot/share/gc/z/zNMT.hpp line 39: > 37: class ZNMT : public AllStatic { > 38: private: > 39: static MemoryFileTracker::MemoryFile* _device; Blankline after this line src/hotspot/share/gc/z/zNMT.hpp line 43: > 41: static void init(); > 42: static void map(zaddress_unsafe addr, size_t size, zoffset offset); > 43: static void unmap(zaddress_unsafe addr, size_t size); ZGC tries to keep the definitions and declarations in the same order in the .hpp and .cpp files. Please, move this accordingly. ------------- Changes requested by stefank (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18289#pullrequestreview-1949288929 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1532326193 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1532329588 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1532330918 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1532332161 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1532332658 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1532333077 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1532334535 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1532334938 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1532336445 From pchilanomate at openjdk.org Wed Mar 20 16:02:20 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 20 Mar 2024 16:02:20 GMT Subject: RFR: 8328285: GetOwnedMonitorInfo functions should use JvmtiHandshake [v2] In-Reply-To: References: <6pArQVrMthR9Lwv2HkGzFhDz3pU28aIb0Vo-PDe6s5w=.89d12945-7151-4554-8897-94af619aec7d@github.com> Message-ID: On 
Wed, 20 Mar 2024 08:12:06 GMT, Serguei Spitsyn wrote: >> src/hotspot/share/prims/jvmtiEnv.cpp line 1368: >> >>> 1366: if (java_thread != nullptr) { >>> 1367: Handle thread_handle(calling_thread, thread_oop); >>> 1368: EscapeBarrier eb(true, calling_thread, java_thread); >> >> I see that now we will execute the EscapeBarrier code for the vthread case too. We actually had to disable that for virtual threads since it doesn't work (8285739 and 8305625). But we only addressed the case when targeting all threads and not the single thread one. We would need to add the same check in EscapeBarrier::deoptimize_objects(int d1, int d2) to skip a thread with mounted continuation. > > Good suggestion, thanks! > Would the following fix work ? : > > > git diff > diff --git a/src/hotspot/share/runtime/escapeBarrier.cpp b/src/hotspot/share/runtime/escapeBarrier.cpp > index bc01d900285..1b6d57644dc 100644 > --- a/src/hotspot/share/runtime/escapeBarrier.cpp > +++ b/src/hotspot/share/runtime/escapeBarrier.cpp > @@ -75,7 +75,7 @@ bool EscapeBarrier::deoptimize_objects(int d1, int d2) { > // These frames are about to be removed. We must not interfere with that and signal failure. > return false; > } > - if (deoptee_thread()->has_last_Java_frame()) { > + if (deoptee_thread()->has_last_Java_frame() && deoptee_thread()->last_continuation() == nullptr) { > assert(calling_thread() == Thread::current(), "should be"); > KeepStackGCProcessedMark ksgcpm(deoptee_thread()); > ResourceMark rm(calling_thread()); > > > BTW, it seems the `PopFrame` and the `ForceEarlyReturn` are broken the same way and, I hope, the fix above should solve the issue. Fix looks good. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18332#discussion_r1532363608 From jsjolen at openjdk.org Wed Mar 20 16:25:57 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Wed, 20 Mar 2024 16:25:57 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v4] In-Reply-To: References: Message-ID: > Hi, > > This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix. > > ## `MemoryFileTracker` > > The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: > > ```c++ > static MemoryFile* make_device(const char* descriptive_name); > static void free_device(MemoryFile* device); > > static void allocate_memory(MemoryFile* device, size_t offset, size_t size, > MEMFLAGS flag, const NativeCallStack& stack); > static void free_memory(MemoryFile* device, size_t offset, size_t size); > > > It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: > > ```c++ > void ZNMT::reserve(zaddress_unsafe start, size_t size) { > MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); > } > void ZNMT::commit(zoffset offset, size_t size) { > MemTracker::allocate_memory_in(ZNMT::_device, static_cast(offset), size, mtJavaHeap, CALLER_PC); > } > void ZNMT::uncommit(zoffset offset, size_t size) { > MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); > } > > void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { > // NMT doesn't track mappings at the moment. 
> } > void ZNMT::unmap(zaddress_unsafe addr, size_t size) { > // NMT doesn't track mappings at the moment. > } > > > As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. > > This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: > > 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this... Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: Move into cpp file, fix linking? ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18289/files - new: https://git.openjdk.org/jdk/pull/18289/files/4f761352..599f2ce8 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=02-03 Stats: 14 lines in 2 files changed: 8 ins; 4 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/18289.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289 PR: https://git.openjdk.org/jdk/pull/18289 From sjayagond at openjdk.org Wed Mar 20 16:32:50 2024 From: sjayagond at openjdk.org (Sidraya Jayagond) Date: Wed, 20 Mar 2024 16:32:50 GMT Subject: RFR: 8327652: S390x: Implements SLP support [v5] In-Reply-To: References: Message-ID: <9csEuN4OqSaYAoZir8OCvudRKhirezZ9qxNlgXs7tyU=.83707471-1d5e-4a1a-aaf9-cb62169e9c2d@github.com> > This PR Adds SIMD support on s390x.
Sidraya Jayagond has updated the pull request incrementally with one additional commit since the last revision: Address code cleanup review comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18162/files - new: https://git.openjdk.org/jdk/pull/18162/files/3caa470c..25c06768 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18162&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18162&range=03-04 Stats: 30 lines in 5 files changed: 1 ins; 1 del; 28 mod Patch: https://git.openjdk.org/jdk/pull/18162.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18162/head:pull/18162 PR: https://git.openjdk.org/jdk/pull/18162 From mdoerr at openjdk.org Wed Mar 20 16:57:20 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 20 Mar 2024 16:57:20 GMT Subject: RFR: 8327652: S390x: Implements SLP support [v4] In-Reply-To: References: <1-JV6-zzCzVdCPfB02iYyYsouYehl0dAg4f2vzUIRpM=.f6825758-8e65-4cfb-bb04-8b4d698088da@github.com> Message-ID: On Wed, 20 Mar 2024 09:42:19 GMT, Sidraya Jayagond wrote: >> I think doing this in a separate PR makes sense. You may want to backport it to 21 and/or 17. > > I raise separate PR. Thank you I think we should integrate that one first and then merge with this PR. Your updates look good. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1532464493 From jsjolen at openjdk.org Wed Mar 20 17:06:06 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Wed, 20 Mar 2024 17:06:06 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v5] In-Reply-To: References: Message-ID: <-XAziSwGMo20pUAnbdRW1JUk_0ZB-80RVfAHr0iuewE=.bff8f2f7-01e2-46eb-bd4b-1b16fccc6aa1@github.com> > Hi, > > This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. 
This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix. > > ## `MemoryFileTracker` > > The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: > > ```c++ > static MemoryFile* make_device(const char* descriptive_name); > static void free_device(MemoryFile* device); > > static void allocate_memory(MemoryFile* device, size_t offset, size_t size, > MEMFLAGS flag, const NativeCallStack& stack); > static void free_memory(MemoryFile* device, size_t offset, size_t size); > > > It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: > > ```c++ > void ZNMT::reserve(zaddress_unsafe start, size_t size) { > MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); > } > void ZNMT::commit(zoffset offset, size_t size) { > MemTracker::allocate_memory_in(ZNMT::_device, static_cast(offset), size, mtJavaHeap, CALLER_PC); > } > void ZNMT::uncommit(zoffset offset, size_t size) { > MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); > } > > void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { > // NMT doesn't track mappings at the moment. > } > void ZNMT::unmap(zaddress_unsafe addr, size_t size) { > // NMT doesn't track mappings at the moment. > } > > > As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. 
> > This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: > > 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this... Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: Include os.inline.hpp ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18289/files - new: https://git.openjdk.org/jdk/pull/18289/files/599f2ce8..1928b9da Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=03-04 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18289.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289 PR: https://git.openjdk.org/jdk/pull/18289 From sjayagond at openjdk.org Wed Mar 20 17:47:20 2024 From: sjayagond at openjdk.org (Sidraya Jayagond) Date: Wed, 20 Mar 2024 17:47:20 GMT Subject: RFR: 8327652: S390x: Implements SLP support [v4] In-Reply-To: References: <1-JV6-zzCzVdCPfB02iYyYsouYehl0dAg4f2vzUIRpM=.f6825758-8e65-4cfb-bb04-8b4d698088da@github.com> Message-ID: On Wed, 20 Mar 2024 16:54:42 GMT, Martin Doerr wrote: >> I raise separate PR. Thank you > > I think we should integrate that one first and then merge with this PR. Your updates look good. @TheRealMDoerr [jdk-8328633](https://github.com/openjdk/jdk/pull/18406) Here is the PR I have raised for this. Could you review and approve it?
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1532538300 From ccheung at openjdk.org Wed Mar 20 18:05:31 2024 From: ccheung at openjdk.org (Calvin Cheung) Date: Wed, 20 Mar 2024 18:05:31 GMT Subject: RFR: 8325536: JVM crash during CDS archive creation with -XX:+AllowArchivingWithJavaAgent Message-ID: <7mQTU1Oxceh3TP1RJIJCiAlA-Ba6_vsVzcnQJeyAcZU=.8f94f4d2-6bb9-498d-a806-ee4db370477f@github.com> The bug only reproduces with dynamic dump. A fix is to call `assign_class_loader_type()` in `KlassFactory::check_shared_class_file_load_hook()` instead of in `ArchiveBuilder::make_klasses_shareable()`. Passed tiers 1 - 4 testing. ------------- Commit messages: - call assign_class_loader_type() in KlassFactory - Revert "factor out assigning of class loader type out of make_klasses_shareable" - factor out assigning of class loader type out of make_klasses_shareable - 8325536: JVM crash during CDS archive creation with -XX:+AllowArchivingWithJavaAgent Changes: https://git.openjdk.org/jdk/pull/18254/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18254&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8325536 Stats: 68 lines in 5 files changed: 60 ins; 4 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/18254.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18254/head:pull/18254 PR: https://git.openjdk.org/jdk/pull/18254 From iklam at openjdk.org Wed Mar 20 18:12:23 2024 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 20 Mar 2024 18:12:23 GMT Subject: RFR: 8325536: JVM crash during CDS archive creation with -XX:+AllowArchivingWithJavaAgent In-Reply-To: <7mQTU1Oxceh3TP1RJIJCiAlA-Ba6_vsVzcnQJeyAcZU=.8f94f4d2-6bb9-498d-a806-ee4db370477f@github.com> References: <7mQTU1Oxceh3TP1RJIJCiAlA-Ba6_vsVzcnQJeyAcZU=.8f94f4d2-6bb9-498d-a806-ee4db370477f@github.com> Message-ID: On Tue, 12 Mar 2024 22:10:16 GMT, Calvin Cheung wrote: > The bug only reproduces with dynamic dump. 
> A fix is to call `assign_class_loader_type()` in `KlassFactory::check_shared_class_file_load_hook()` instead of in `ArchiveBuilder::make_klasses_shareable()`. > > Passed tiers 1 - 4 testing. LGTM. ------------- Marked as reviewed by iklam (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18254#pullrequestreview-1949695319 From aph at openjdk.org Wed Mar 20 18:18:33 2024 From: aph at openjdk.org (Andrew Haley) Date: Wed, 20 Mar 2024 18:18:33 GMT Subject: RFR: 8316180: Thread-local backoff for secondary_super_cache updates [v14] In-Reply-To: References: Message-ID: <1zwzeGgbx_omfZ3Nutjnt3FckW5dytgDUSwoTZq9NYI=.d0a254cb-856c-4287-a0d6-2eeb178c2861@github.com> On Wed, 20 Mar 2024 10:41:08 GMT, Aleksey Shipilev wrote: > Now that [JDK-8180450](https://bugs.openjdk.org/browse/JDK-8180450) has a viable path forward, I see no point in continuing with this one. It has observable regressions on some workloads that are hard to overcome in a simple manner. Closing as WNF. Will reopen if JDK-8180450 is not in. May I grab the test cases from this PR and add them into mine? Thanks. ------------- PR Comment: https://git.openjdk.org/jdk/pull/15718#issuecomment-2010298790 From shade at openjdk.org Wed Mar 20 18:29:30 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 20 Mar 2024 18:29:30 GMT Subject: RFR: 8316180: Thread-local backoff for secondary_super_cache updates [v14] In-Reply-To: <1zwzeGgbx_omfZ3Nutjnt3FckW5dytgDUSwoTZq9NYI=.d0a254cb-856c-4287-a0d6-2eeb178c2861@github.com> References: <1zwzeGgbx_omfZ3Nutjnt3FckW5dytgDUSwoTZq9NYI=.d0a254cb-856c-4287-a0d6-2eeb178c2861@github.com> Message-ID: <-tBC5-b7Vh-ylXCltq3CJD6NA1cNuwHCgloQqiDK7UY=.9f9bd528-225b-4a8d-89e1-fea068359c6f@github.com> On Wed, 20 Mar 2024 18:15:38 GMT, Andrew Haley wrote: > May I grab the test cases from this PR and add them into mine? Thanks. Sure, feel free. Not sure how useful are they. 
I pushed some test updates directly into my branch: https://github.com/openjdk/jdk/compare/master...shipilev:jdk:JDK-8316180-backoff-secondary-super

-------------

PR Comment: https://git.openjdk.org/jdk/pull/15718#issuecomment-2010318724

From sspitsyn at openjdk.org Wed Mar 20 21:59:22 2024
From: sspitsyn at openjdk.org (Serguei Spitsyn)
Date: Wed, 20 Mar 2024 21:59:22 GMT
Subject: RFR: 8328285: GetOwnedMonitorInfo functions should use JvmtiHandshake [v2]
In-Reply-To:
References: <6pArQVrMthR9Lwv2HkGzFhDz3pU28aIb0Vo-PDe6s5w=.89d12945-7151-4554-8897-94af619aec7d@github.com>
Message-ID:

On Wed, 20 Mar 2024 15:59:34 GMT, Patricio Chilano Mateo wrote:

>> Good suggestion, thanks!
>> Would the following fix work ? :
>>
>>
>> git diff
>> diff --git a/src/hotspot/share/runtime/escapeBarrier.cpp b/src/hotspot/share/runtime/escapeBarrier.cpp
>> index bc01d900285..1b6d57644dc 100644
>> --- a/src/hotspot/share/runtime/escapeBarrier.cpp
>> +++ b/src/hotspot/share/runtime/escapeBarrier.cpp
>> @@ -75,7 +75,7 @@ bool EscapeBarrier::deoptimize_objects(int d1, int d2) {
>>      // These frames are about to be removed. We must not interfere with that and signal failure.
>>      return false;
>>    }
>> -  if (deoptee_thread()->has_last_Java_frame()) {
>> +  if (deoptee_thread()->has_last_Java_frame() && deoptee_thread()->last_continuation() == nullptr) {
>>      assert(calling_thread() == Thread::current(), "should be");
>>      KeepStackGCProcessedMark ksgcpm(deoptee_thread());
>>      ResourceMark rm(calling_thread());
>>
>>
>> BTW, it seems the `PopFrame` and the `ForceEarlyReturn` are broken the same way and, I hope, the fix above should solve the issue.
>
> Fix looks good.

Thanks!
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18332#discussion_r1532932393 From sspitsyn at openjdk.org Wed Mar 20 22:03:30 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 20 Mar 2024 22:03:30 GMT Subject: RFR: 8328285: GetOwnedMonitorInfo functions should use JvmtiHandshake [v4] In-Reply-To: References: Message-ID: <2cbNK6BXIQuVDhocGrTYCLdHtsrHIs82LVW1oWGXBKg=.0de86729-cc55-425c-b3e1-44152ba29f02@github.com> > The `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. This enhancement is to refactor JVM TI functions `GetOwnedMonitorInfo` and `GetOwnedMonitorStackDepthInfo` on the base of `JvmtiHandshake and `JvmtiUnitedHandshakeClosure` classes. > > Testing: > - Ran mach5 tiers 1-6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: work around problem for vthreads in EscapeBarrier::deoptimize_objects ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18332/files - new: https://git.openjdk.org/jdk/pull/18332/files/ebbf0a0f..b7380b19 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18332&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18332&range=02-03 Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18332.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18332/head:pull/18332 PR: https://git.openjdk.org/jdk/pull/18332 From lmesnik at openjdk.org Wed Mar 20 22:44:22 2024 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Wed, 20 Mar 2024 22:44:22 GMT Subject: RFR: 8328285: GetOwnedMonitorInfo functions should use JvmtiHandshake [v4] In-Reply-To: <2cbNK6BXIQuVDhocGrTYCLdHtsrHIs82LVW1oWGXBKg=.0de86729-cc55-425c-b3e1-44152ba29f02@github.com> References: <2cbNK6BXIQuVDhocGrTYCLdHtsrHIs82LVW1oWGXBKg=.0de86729-cc55-425c-b3e1-44152ba29f02@github.com> Message-ID: On 
Wed, 20 Mar 2024 22:03:30 GMT, Serguei Spitsyn wrote:

>> The `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. This enhancement is to refactor JVM TI functions `GetOwnedMonitorInfo` and `GetOwnedMonitorStackDepthInfo` on the base of `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes.
>>
>> Testing:
>> - Ran mach5 tiers 1-6
>
> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision:
>
>   review: work around problem for vthreads in EscapeBarrier::deoptimize_objects

src/hotspot/share/prims/jvmtiEnv.cpp line 1366:

> 1364: }
> 1365:
> 1366: if (java_thread != nullptr) {

I am not sure why this check is needed at this level. It is a duplication. The GetOwnedMonitorInfoClosure::do_vthread checks if thread is null (no-op for unmounted threads).
Currently, the unmounted threads can't own a monitor. But if that changes, we need to update this check as well as the implementation of handshakes.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18332#discussion_r1532988358

From sspitsyn at openjdk.org Thu Mar 21 01:04:22 2024
From: sspitsyn at openjdk.org (Serguei Spitsyn)
Date: Thu, 21 Mar 2024 01:04:22 GMT
Subject: RFR: 8328285: GetOwnedMonitorInfo functions should use JvmtiHandshake [v4]
In-Reply-To:
References: <2cbNK6BXIQuVDhocGrTYCLdHtsrHIs82LVW1oWGXBKg=.0de86729-cc55-425c-b3e1-44152ba29f02@github.com>
Message-ID:

On Wed, 20 Mar 2024 22:41:05 GMT, Leonid Mesnik wrote:

>> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   review: work around problem for vthreads in EscapeBarrier::deoptimize_objects
>
> src/hotspot/share/prims/jvmtiEnv.cpp line 1366:
>
>> 1364: }
>> 1365:
>> 1366: if (java_thread != nullptr) {
>
> I am not sure why this check is needed at this level.
> It is a duplication. The GetOwnedMonitorInfoClosure::do_vthread checks if thread is null (no-op for unmounted threads).
> Currently, the unmounted threads can't own a monitor. But if that changes, we need to update this check as well as the implementation of handshakes.

This is needed to avoid using the `EscapeBarrier` for unmounted virtual threads.
But you are right there is a duplication here - thanks.
The following check can be removed:

+GetOwnedMonitorInfoClosure::do_vthread(Handle target_h) {
+  if (_target_jt == nullptr) {
+    _result = JVMTI_ERROR_NONE;
+    return;
+  }

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18332#discussion_r1533109911

From sspitsyn at openjdk.org Thu Mar 21 01:09:52 2024
From: sspitsyn at openjdk.org (Serguei Spitsyn)
Date: Thu, 21 Mar 2024 01:09:52 GMT
Subject: RFR: 8328285: GetOwnedMonitorInfo functions should use JvmtiHandshake [v5]
In-Reply-To:
References:
Message-ID:

> The `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. This enhancement is to refactor JVM TI functions `GetOwnedMonitorInfo` and `GetOwnedMonitorStackDepthInfo` on the base of `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes.
> > Testing: > - Ran mach5 tiers 1-6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: replace redundand check with an assert ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18332/files - new: https://git.openjdk.org/jdk/pull/18332/files/b7380b19..429e696a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18332&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18332&range=03-04 Stats: 4 lines in 1 file changed: 0 ins; 3 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18332.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18332/head:pull/18332 PR: https://git.openjdk.org/jdk/pull/18332 From lmesnik at openjdk.org Thu Mar 21 01:13:21 2024 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Thu, 21 Mar 2024 01:13:21 GMT Subject: RFR: 8328285: GetOwnedMonitorInfo functions should use JvmtiHandshake [v5] In-Reply-To: References: Message-ID: <_NGc1wEEnXExWS5H29dLtyE1fbFsS7-D0ViSuFOyZfg=.b3ca08c5-c926-4a32-b6a3-74203324ed41@github.com> On Thu, 21 Mar 2024 01:09:52 GMT, Serguei Spitsyn wrote: >> The `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. This enhancement is to refactor JVM TI functions `GetOwnedMonitorInfo` and `GetOwnedMonitorStackDepthInfo` on the base of `JvmtiHandshake and `JvmtiUnitedHandshakeClosure` classes. >> >> Testing: >> - Ran mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: replace redundand check with an assert Marked as reviewed by lmesnik (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/18332#pullrequestreview-1950561720 From lmesnik at openjdk.org Thu Mar 21 01:13:23 2024 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Thu, 21 Mar 2024 01:13:23 GMT Subject: RFR: 8328285: GetOwnedMonitorInfo functions should use JvmtiHandshake [v4] In-Reply-To: References: <2cbNK6BXIQuVDhocGrTYCLdHtsrHIs82LVW1oWGXBKg=.0de86729-cc55-425c-b3e1-44152ba29f02@github.com> Message-ID: On Thu, 21 Mar 2024 01:01:22 GMT, Serguei Spitsyn wrote: >> src/hotspot/share/prims/jvmtiEnv.cpp line 1366: >> >>> 1364: } >>> 1365: >>> 1366: if (java_thread != nullptr) { >> >> I am not sure why this check is needed at this level. It is duplication.The GetOwnedMonitorInfoClosure::do_vthread checks if thread is null (no-op for unmounted threads). >> Currently, the unmounted threads can't own monitor. But if it's changed we need to update this check as well as the implementation of handshakes. > > This is needed to avoid using the `EscapeBarrier` for unmounted virtual threads. > But you are right there is a duplication here - thanks. > The following check can be removed: > > +GetOwnedMonitorInfoClosure::do_vthread(Handle target_h) { > + if (_target_jt == nullptr) { > + _result = JVMTI_ERROR_NONE; > + return; > + } > > I've replaced it with an assert: > > assert(_target_jt != nullptr, "sanity check"); Thanks, that should work! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18332#discussion_r1533118435 From iwalulya at openjdk.org Thu Mar 21 10:05:22 2024 From: iwalulya at openjdk.org (Ivan Walulya) Date: Thu, 21 Mar 2024 10:05:22 GMT Subject: RFR: 8328101: Parallel: Obsolete ParallelOldDeadWoodLimiterMean and ParallelOldDeadWoodLimiterStdDev [v3] In-Reply-To: References: Message-ID: On Wed, 20 Mar 2024 10:05:33 GMT, Albert Mingkun Yang wrote: >> Simple refactoring Parallel full-gc dead word calculation and removing two jvm flags. 
>>
>> Test: tier1-3
>
> Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision:
>
>   review

Marked as reviewed by iwalulya (Reviewer).

-------------

PR Review: https://git.openjdk.org/jdk/pull/18278#pullrequestreview-1951550637

From sjohanss at openjdk.org Thu Mar 21 10:05:22 2024
From: sjohanss at openjdk.org (Stefan Johansson)
Date: Thu, 21 Mar 2024 10:05:22 GMT
Subject: RFR: 8328101: Parallel: Obsolete ParallelOldDeadWoodLimiterMean and ParallelOldDeadWoodLimiterStdDev [v3]
In-Reply-To:
References:
Message-ID:

On Wed, 20 Mar 2024 10:05:33 GMT, Albert Mingkun Yang wrote:

>> Simple refactoring Parallel full-gc dead word calculation and removing two jvm flags.
>>
>> Test: tier1-3
>
> Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision:
>
>   review

I think the code change looks good. The new solution is easier to follow and better matches the other GCs. But I have one question: have you done any testing to see how this changes the behavior of the Parallel Full GC for different scenarios? To me this isn't a simple refactoring as we actually change the behavior, so I would feel much more confident approving this knowing any potential impact it would have.

-------------

PR Review: https://git.openjdk.org/jdk/pull/18278#pullrequestreview-1951558533

From mcimadamore at openjdk.org Thu Mar 21 11:09:26 2024
From: mcimadamore at openjdk.org (Maurizio Cimadamore)
Date: Thu, 21 Mar 2024 11:09:26 GMT
Subject: RFR: 8328679: Improve comment for UNSAFE_ENTRY_SCOPED in unsafe.cpp
Message-ID:

This small PR is an attempt at improving the comment in the UNSAFE_ENTRY_SCOPED macro. I found that the comment contained some duplicated claims, did not properly introduce the concepts it referred to, and some of its claims were not explained in full.
As this is a delicate part of how FFM ensures the "no use after free" guarantee, I think it would be a good idea for this comment to better reflect what the expectations are. ------------- Commit messages: - Tweak UNSAFE_ENTRY_SCOPED comment Changes: https://git.openjdk.org/jdk/pull/18429/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18429&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8328679 Stats: 20 lines in 1 file changed: 3 ins; 2 del; 15 mod Patch: https://git.openjdk.org/jdk/pull/18429.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18429/head:pull/18429 PR: https://git.openjdk.org/jdk/pull/18429 From ayang at openjdk.org Thu Mar 21 11:30:20 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 21 Mar 2024 11:30:20 GMT Subject: RFR: 8328101: Parallel: Obsolete ParallelOldDeadWoodLimiterMean and ParallelOldDeadWoodLimiterStdDev [v3] In-Reply-To: References: Message-ID: <75mOIRdr2gYWOhMOYNLi_59RMHp3iGAjw-yfRrP3KgA=.b5c47bcf-cdef-49a2-ba00-b0e541b0b816@github.com> On Wed, 20 Mar 2024 10:05:33 GMT, Albert Mingkun Yang wrote: >> Simple refactoring Parallel full-gc dead word calculation and removing two jvm flags. >> >> Test: tier1-3 > > Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: > > review I tested Dacapo/CacheStresser/specjbb15-fixed-injection-rate and there is no observable difference. I also tried a synthetic bm to stress full-gc and the difference in pause-time is within the noise. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18278#issuecomment-2012014900 From pchilanomate at openjdk.org Thu Mar 21 13:56:20 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Thu, 21 Mar 2024 13:56:20 GMT Subject: RFR: 8328285: GetOwnedMonitorInfo functions should use JvmtiHandshake [v5] In-Reply-To: References: Message-ID: <6ziCbagwDerlKMmIfd-_uNk2Jtu2cHbE4ZXmrkNQnh0=.1bea663e-4997-4fa3-a1af-17eefa52db3a@github.com> On Thu, 21 Mar 2024 01:09:52 GMT, Serguei Spitsyn wrote: >> The `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. This enhancement is to refactor JVM TI functions `GetOwnedMonitorInfo` and `GetOwnedMonitorStackDepthInfo` on the base of `JvmtiHandshake and `JvmtiUnitedHandshakeClosure` classes. >> >> Testing: >> - Ran mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: replace redundand check with an assert Looks good to me, thanks Serguei. ------------- Marked as reviewed by pchilanomate (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18332#pullrequestreview-1952294929 From rrich at openjdk.org Thu Mar 21 14:12:26 2024 From: rrich at openjdk.org (Richard Reingruber) Date: Thu, 21 Mar 2024 14:12:26 GMT Subject: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v4] In-Reply-To: References: Message-ID: On Wed, 20 Mar 2024 14:06:04 GMT, Richard Reingruber wrote: >> Updated (2024-03-20): >> >> This PR adds switching to `WXWrite` mode before entering the vm where it is missing. >> >> With the changes the following jtreg tests succeed with AssertWXAtThreadSync enabled. >> >> * hotspot tier 1-4 >> * jdk tier 1-4 >> * langtools >> * jaxp > > Richard Reingruber has updated the pull request with a new target base due to a merge or a rebase. 
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: > > - Merge branch 'master' into 8327990_jfr_enters_vm_without_wxwrite > - 2 more locations > - Merge branch 'master' into 8327990_jfr_enters_vm_without_wxwrite > - Set WXWrite at more locations > - Switch to WXWrite before entering the vm Tests with AssertWXAtThreadSync are clean now. Thanks! ------------- PR Comment: https://git.openjdk.org/jdk/pull/18238#issuecomment-2012399791 From rrich at openjdk.org Thu Mar 21 14:12:27 2024 From: rrich at openjdk.org (Richard Reingruber) Date: Thu, 21 Mar 2024 14:12:27 GMT Subject: Integrated: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync In-Reply-To: References: Message-ID: On Tue, 12 Mar 2024 15:05:00 GMT, Richard Reingruber wrote: > Updated (2024-03-20): > > This PR adds switching to `WXWrite` mode before entering the vm where it is missing. > > With the changes the following jtreg tests succeed with AssertWXAtThreadSync enabled. > > * hotspot tier 1-4 > * jdk tier 1-4 > * langtools > * jaxp This pull request has now been integrated. Changeset: e41bc42d Author: Richard Reingruber URL: https://git.openjdk.org/jdk/commit/e41bc42deb22615c9b93ee639d04e9ed2bd57f64 Stats: 13 lines in 8 files changed: 11 ins; 0 del; 2 mod 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync Reviewed-by: dholmes, stuefe, mdoerr, tholenstein, aph ------------- PR: https://git.openjdk.org/jdk/pull/18238 From sjohanss at openjdk.org Thu Mar 21 14:24:21 2024 From: sjohanss at openjdk.org (Stefan Johansson) Date: Thu, 21 Mar 2024 14:24:21 GMT Subject: RFR: 8328101: Parallel: Obsolete ParallelOldDeadWoodLimiterMean and ParallelOldDeadWoodLimiterStdDev [v3] In-Reply-To: References: Message-ID: On Wed, 20 Mar 2024 10:05:33 GMT, Albert Mingkun Yang wrote: >> Simple refactoring Parallel full-gc dead word calculation and removing two jvm flags. 
>> >> Test: tier1-3 > > Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: > > review No observable difference sounds good. ------------- Marked as reviewed by sjohanss (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18278#pullrequestreview-1952377685 From aph at openjdk.org Thu Mar 21 14:42:55 2024 From: aph at openjdk.org (Andrew Haley) Date: Thu, 21 Mar 2024 14:42:55 GMT Subject: RFR: JDK-8180450: secondary_super_cache does not scale well [v5] In-Reply-To: References: Message-ID: > This PR is a redesign of subtype checking. > > The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. > > So what's changed, so that the old design should be replaced? > > Firstly, the computers of today aren't the computers of twenty years ago. It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. > > The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces). Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. > > Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch. > > However, the computers of today can help us. 
> The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions.
>
> The solution
> ------------
>
> We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered.
>
> We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bitmap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. If the bit is set, we have to do a little arithmetic and then consult the hash table.
>
> It works like this:
>
>
> mov sub_klass, [& sub_klass-...
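
The bitmap-plus-packed-table lookup described in the quoted text above can be sketched roughly as follows. This is only an illustrative C++ sketch under stated assumptions, not the HotSpot code from the PR: `PackedSupers`, `home_slot`, `popcount64`, `build` and `contains` are hypothetical names, the multiplicative hash is an arbitrary choice, and plain integer ids stand in for `Klass` pointers. It assumes at most 64 distinct entries.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the two per-class fields the description mentions.
struct PackedSupers {
  std::uint64_t bitmap = 0;           // bit i set <=> slot i of the 64-way table is occupied
  std::vector<std::uint32_t> packed;  // occupied slots only, in slot order (nulls removed)
};

// Portable population count: number of set bits in x.
inline unsigned popcount64(std::uint64_t x) {
  unsigned n = 0;
  for (; x != 0; x &= x - 1) ++n;  // clears the lowest set bit each iteration
  return n;
}

// Hypothetical hash: the top 6 bits of a 64-bit product give a home slot in [0, 63].
inline unsigned home_slot(std::uint32_t id) {
  return static_cast<unsigned>((id * 0x9E3779B97F4A7C15ull) >> 58);
}

// Build the 64-way table with linear probing, then compress it by dropping empty slots.
// Assumption: at most 64 distinct ids; anything beyond a full table is silently dropped.
inline PackedSupers build(const std::vector<std::uint32_t>& ids) {
  std::array<std::uint32_t, 64> table{};
  PackedSupers out;
  for (std::uint32_t id : ids) {
    unsigned slot = home_slot(id);
    for (unsigned probes = 0; probes < 64; ++probes, slot = (slot + 1) & 63) {
      const std::uint64_t bit = 1ull << slot;
      if ((out.bitmap & bit) == 0) {  // free slot: claim it
        table[slot] = id;
        out.bitmap |= bit;
        break;
      }
      if (table[slot] == id) break;   // already present
    }
  }
  for (unsigned slot = 0; slot < 64; ++slot) {
    if (out.bitmap & (1ull << slot)) out.packed.push_back(table[slot]);
  }
  return out;
}

// Lookup: a clear bit proves absence; a set bit is turned into an index into the
// packed array by counting the occupied slots below it.
inline bool contains(const PackedSupers& s, std::uint32_t id) {
  unsigned slot = home_slot(id);
  for (unsigned probes = 0; probes < 64; ++probes, slot = (slot + 1) & 63) {
    const std::uint64_t bit = 1ull << slot;
    if ((s.bitmap & bit) == 0) return false;
    const unsigned index = popcount64(s.bitmap & (bit - 1));
    assert(index < s.packed.size());
    if (s.packed[index] == id) return true;
  }
  return false;  // table full and id not found
}
```

In this sketch a lookup whose home slot has a clear bit returns false without reading the packed array at all, which is the Bloom-filter-like behaviour the description refers to. Because the packed array is just the occupied slots in order, code that scans it linearly would still see every entry.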
Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: JDK-8180450: secondary_super_cache does not scale well ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18309/files - new: https://git.openjdk.org/jdk/pull/18309/files/bd23d194..6c596d70 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18309&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18309&range=03-04 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18309.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18309/head:pull/18309 PR: https://git.openjdk.org/jdk/pull/18309 From ihse at openjdk.org Thu Mar 21 15:47:21 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Thu, 21 Mar 2024 15:47:21 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4] In-Reply-To: References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> <_XOfFT147tFIxCtHn7YXCxdeA18fxWHJUPTsaIa3W5k=.936f1797-88fc-408b-91ae-1471a68b0c1f@github.com> Message-ID: On Tue, 19 Mar 2024 17:15:29 GMT, Andrew Haley wrote: >>> If there was some kind of plan, with evidence of the intention to do something to get this valuable tech into people's hands in a form they can use, sure. But as you can tell, I think this may rot because no one will be able use it. If SLEEF were included in the JDK it'd be fine. if SLEEF were a common Linux library it'd be fine. >>> >>> The problem I see is that J. Random Java User has no way to know if SLEEF is making their program faster without running benchmarks. They'll put SLEEF somewhere and hope that Java uses it. >> >> Please kindly correct me if I misunderstood your points. >> Seems the safest solution to address your above concerns is to integrate the sleef source into jdk? 
>> Lack of sleef at either build time or runtime will make the user's code fall back to java implementation.
>
>> > The problem I see is that J. Random Java User has no way to know if SLEEF is making their program faster without running benchmarks. They'll put SLEEF somewhere and hope that Java uses it.
>>
>> Please kindly correct me if I misunderstood your points. Seems the safest solution to address your above concerns is to integrate the sleef source into jdk? Lack of sleef at either build time or runtime will make the user's code fall back to java implementation.
>
> Exactly, yes. That's why we've integrated the source code of many other libraries we depend on into the JDK. It's the only way to get the reliability our users expect.

@theRealAph Are you saying that bundling the source code of libsleef is a hard requirement from your side to accept this code into the JDK?

I'm not against it, I just want to understand what we're talking about here.

In general, adding new libraries to OpenJDK will require a legal process in Oracle, which may (or may not) take some amount of time T, where T is larger than you'd wish for.

So I guess we can either:

1) wait for libsleef source code to become a part of OpenJDK, and then integrate this PR.

2) integrate this PR optimistically, and in the background start a process of trying to get libsleef into OpenJDK. (Which, of course, can not be 100% guaranteed to happen.)

3) integrate this PR as is, and give up any idea of bundling libsleef.

I also believe there is a fourth option, but that too seems like it has legal implications that need to be checked:

4) if the libsleef dynamic library is found on the system during build time, bundle a copy of the dll with the built JDK. (Similar to how it was done with freetype on Windows before.) And, optionally, provide an option for configure to require libsleef to be present, so the build fails if no libsleef can be found and bundled. (Leaving open as to if this should be default or not.)
------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-2012707195 From ayang at openjdk.org Thu Mar 21 15:50:25 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 21 Mar 2024 15:50:25 GMT Subject: RFR: 8328101: Parallel: Obsolete ParallelOldDeadWoodLimiterMean and ParallelOldDeadWoodLimiterStdDev [v3] In-Reply-To: References: Message-ID: On Wed, 20 Mar 2024 10:05:33 GMT, Albert Mingkun Yang wrote: >> Simple refactoring Parallel full-gc dead word calculation and removing two jvm flags. >> >> Test: tier1-3 > > Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: > > review Thanks for review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18278#issuecomment-2012717783 From ayang at openjdk.org Thu Mar 21 15:50:26 2024 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 21 Mar 2024 15:50:26 GMT Subject: Integrated: 8328101: Parallel: Obsolete ParallelOldDeadWoodLimiterMean and ParallelOldDeadWoodLimiterStdDev In-Reply-To: References: Message-ID: On Wed, 13 Mar 2024 15:22:53 GMT, Albert Mingkun Yang wrote: > Simple refactoring Parallel full-gc dead word calculation and removing two jvm flags. > > Test: tier1-3 This pull request has now been integrated. 
Changeset: 16ed1913 Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/16ed191329b517766313761c8f3450b4a8285658 Stats: 245 lines in 7 files changed: 8 ins; 218 del; 19 mod 8328101: Parallel: Obsolete ParallelOldDeadWoodLimiterMean and ParallelOldDeadWoodLimiterStdDev Reviewed-by: tschatzl, iwalulya, sjohanss ------------- PR: https://git.openjdk.org/jdk/pull/18278 From aph at openjdk.org Thu Mar 21 16:07:24 2024 From: aph at openjdk.org (Andrew Haley) Date: Thu, 21 Mar 2024 16:07:24 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4] In-Reply-To: References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> <_XOfFT147tFIxCtHn7YXCxdeA18fxWHJUPTsaIa3W5k=.936f1797-88fc-408b-91ae-1471a68b0c1f@github.com> Message-ID: On Tue, 19 Mar 2024 17:15:29 GMT, Andrew Haley wrote: >>> If there was some kind of plan, with evidence of the intention to do something to get this valuable tech into people's hands in a form they can use, sure. But as you can tell, I think this may rot because no one will be able use it. If SLEEF were included in the JDK it'd be fine. if SLEEF were a common Linux library it'd be fine. >>> >>> The problem I see is that J. Random Java User has no way to know if SLEEF is making their program faster without running benchmarks. They'll put SLEEF somewhere and hope that Java uses it. >> >> Please kindly correct me if I misunderstood your points. >> Seems the safest solution to address your above concerns is to integrate the sleef source into jdk? Lack of sleef at either build time or runtime will make the user's code fall back to java implementation. > >> > The problem I see is that J. Random Java User has no way to know if SLEEF is making their program faster without running benchmarks. They'll put SLEEF somewhere and hope that Java uses it. 
>>
>> Please kindly correct me if I misunderstood your points. Seems the safest solution to address your above concerns is to integrate the sleef source into jdk? Lack of sleef at either build time or runtime will make the user's code fall back to java implementation.
>
> Exactly, yes. That's why we've integrated the source code of many other libraries we depend on into the JDK. It's the only way to get the reliability our users expect.

> @theRealAph Are you saying that bundling the source code of libsleef is a hard requirement from your side to accept this code into the JDK?
>
> I'm not against it, I just want to understand what we're talking about here.
>
> In general, adding new libraries to OpenJDK will require a legal process in Oracle, which may (or may not) take some amount of time T, where T is larger than you'd wish for.
>
> So I guess we can either:
>
> 1. wait for libsleef source code to become a part of OpenJDK, and then integrate this PR.
>
> 2. integrate this PR optimistically, and in the background start a process of trying to get libsleef into OpenJDK. (Which, of course, can not be 100% guaranteed to happen.)

I think that's probably the best thing to do today. I will approve it.

I'd prefer, long term, to integrate SLEEF into OpenJDK. That would make AArch64 support similar to Intel. We are competing with Intel.

In the short term, I'd build a shim library during the default standard JDK build that does not need SLEEF at build time. Unless we do that, because SLEEF isn't on anyone's builders, it won't be used.

> 3. integrate this PR as is, and give up any idea of bundling libsleef.
>
>
> I also believe there is a fourth option, but that too seems like it has legal implications that need to be checked:
>
> 4. if the libsleef dynamic library is found on the system during build time, bundle a copy of the dll with the built JDK. (Similar to how it was done with freetype on Windows before.)
And, optionally, provide an option for configure to require libsleef to be present, so the build fails if no libsleef can be found and bundled. (Leaving open as to if this should be default or not.) Ewww. No. :-) ------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-2012798235 From sspitsyn at openjdk.org Thu Mar 21 16:33:24 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 21 Mar 2024 16:33:24 GMT Subject: RFR: 8328285: GetOwnedMonitorInfo functions should use JvmtiHandshake [v5] In-Reply-To: References: Message-ID: On Thu, 21 Mar 2024 01:09:52 GMT, Serguei Spitsyn wrote: >> The `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. This enhancement is to refactor JVM TI functions `GetOwnedMonitorInfo` and `GetOwnedMonitorStackDepthInfo` on the base of `JvmtiHandshake and `JvmtiUnitedHandshakeClosure` classes. >> >> Testing: >> - Ran mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: replace redundand check with an assert Patricio and Leonid, thank you for prompt review! ------------- PR Comment: https://git.openjdk.org/jdk/pull/18332#issuecomment-2012905000 From cjplummer at openjdk.org Thu Mar 21 16:35:23 2024 From: cjplummer at openjdk.org (Chris Plummer) Date: Thu, 21 Mar 2024 16:35:23 GMT Subject: RFR: JDK-8180450: secondary_super_cache does not scale well [v5] In-Reply-To: References: Message-ID: <3l4gFuzQsROoNsiekZMTZOuyOGn-BS-F0B5URTrAJEs=.d132a4f7-f7f5-4c41-a3de-9b0d97151304@github.com> On Thu, 21 Mar 2024 14:42:55 GMT, Andrew Haley wrote: >> This PR is a redesign of subtype checking. >> >> The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. 
>> >> So what's changed, so that the old design should be replaced? >> >> Firstly, the computers of today aren't the computers of twenty years ago. It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. >> >> The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces). Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. >> >> Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch. >> >> However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions. >> >> The solution >> ------------ >> >> We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. >> >> We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bimap is used as a simple kind of Bloom Filter. 
To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. If the bit is set, we have to do a little arithmetic and then consult the hash table. >> >> It works like th... > > Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: > > JDK-8180450: secondary_super_cache does not scale well Windows builds are failing: d:\a\jdk\jdk\src\hotspot\share\oops/klass.hpp(251): error C2220: the following warning is treated as an error d:\a\jdk\jdk\src\hotspot\share\oops/klass.hpp(251): warning C4267: 'return': conversion from 'size_t' to 'int', possible loss of data ------------- PR Comment: https://git.openjdk.org/jdk/pull/18309#issuecomment-2012911928 From cjplummer at openjdk.org Thu Mar 21 16:58:25 2024 From: cjplummer at openjdk.org (Chris Plummer) Date: Thu, 21 Mar 2024 16:58:25 GMT Subject: RFR: JDK-8180450: secondary_super_cache does not scale well [v5] In-Reply-To: References: Message-ID: On Thu, 21 Mar 2024 14:42:55 GMT, Andrew Haley wrote: >> This PR is a redesign of subtype checking. >> >> The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. >> >> So what's changed, so that the old design should be replaced? >> >> Firstly, the computers of today aren't the computers of twenty years ago. It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. >> >> The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces). 
Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. >> >> Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch. >> >> However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions. >> >> The solution >> ------------ >> >> We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. >> >> We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bimap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. If the bit is set, we have to do a little arithmetic and then consult the hash table. >> >> It works like th... > > Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: > > JDK-8180450: secondary_super_cache does not scale well Changes requested by cjplummer (Reviewer). 
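For readers following the quoted design, the bitmap-plus-compressed-table lookup can be sketched roughly as below. The quoted description is cut off ("It works like th..."), so this is only one plausible reading and not the code in the PR: the 6-bit hash `slot_of`, the use of small integer ids in place of `Klass*`, and the struct and function names are all hypothetical, and the "little arithmetic" is assumed to be a population count of the occupancy bits below the probed slot.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical 6-bit hash of a type id; the real hash is not given in this thread.
static unsigned slot_of(uint32_t id) {
  return static_cast<unsigned>((id * 0x9E3779B97F4A7C15ULL) >> 58);
}

// Portable population count (number of set bits).
static unsigned popcount64(uint64_t x) {
  unsigned n = 0;
  for (; x != 0; x &= x - 1) n++;
  return n;
}

struct SecondarySupers {
  uint64_t bitmap = 0;           // bit s set <=> slot s of the 64-way table is occupied
  std::vector<uint32_t> packed;  // occupied slots only, kept in slot order
};

// Build the 64-way table with linear probing, then drop the empty slots.
// Assumes at most 64 distinct ids.
static SecondarySupers build(const std::vector<uint32_t>& ids) {
  uint32_t sparse[64] = {0};
  SecondarySupers t;
  for (uint32_t id : ids) {
    unsigned s = slot_of(id);
    while (t.bitmap & (uint64_t{1} << s)) s = (s + 1) & 63;  // probe next slot
    sparse[s] = id;
    t.bitmap |= uint64_t{1} << s;
  }
  for (unsigned s = 0; s < 64; s++) {
    if (t.bitmap & (uint64_t{1} << s)) t.packed.push_back(sparse[s]);
  }
  return t;
}

// Lookup: a clear bit proves absence; a set bit means "consult the table".
static bool contains(const SecondarySupers& t, uint32_t id) {
  unsigned s = slot_of(id);
  for (int probes = 0; probes < 64; probes++) {
    uint64_t bit = uint64_t{1} << s;
    if ((t.bitmap & bit) == 0) return false;            // fast negative answer
    unsigned index = popcount64(t.bitmap & (bit - 1));  // position in packed array
    if (t.packed[index] == id) return true;
    s = (s + 1) & 63;                                   // collision: keep probing
  }
  return false;
}
```

In this sketch, `build({3, 17, 42, 1000})` yields four packed entries, and `contains` answers a miss either from a single clear bit or after walking a short run of occupied slots. Code that ignores `bitmap` can still scan `packed` linearly, which matches the compatibility point made in the description.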
src/hotspot/share/runtime/globals.hpp line 2002:

> 2000: \
> 2001: product(bool, HashSecondarySupers, false, DIAGNOSTIC, \
> 2002: "Use hash table to scan secondary supers.") \

Fix indent.

src/hotspot/share/runtime/globals.hpp line 2005:

> 2003: \
> 2004: product(bool, VerifySecondarySupers, false, DIAGNOSTIC, \
> 2005: "Use secondary super cache during subtype checks") \

Did you mean "Verify secondary super cache..."?

-------------

PR Review: https://git.openjdk.org/jdk/pull/18309#pullrequestreview-1952823901
PR Review Comment: https://git.openjdk.org/jdk/pull/18309#discussion_r1534288014
PR Review Comment: https://git.openjdk.org/jdk/pull/18309#discussion_r1534288738

From cjplummer at openjdk.org Thu Mar 21 17:03:25 2024
From: cjplummer at openjdk.org (Chris Plummer)
Date: Thu, 21 Mar 2024 17:03:25 GMT
Subject: RFR: JDK-8180450: secondary_super_cache does not scale well [v4]
In-Reply-To:
References:
Message-ID: <3QxVeYQ-AWyIQBukcKa9P0JnAPtsdoulNwXoSV8scP8=.71575650-4769-412e-9b81-21840af93923@github.com>

On Tue, 19 Mar 2024 10:01:49 GMT, Andrew Haley wrote:

>>> Hi Andrew, we'd most likely want to port this to Graal. Just for my understanding, the current implementation of [secondary super cache lookup in Graal](https://github.com/oracle/graal/blob/0a471fc329ea2c8c18c05f9d83347dc1199c3ad9/compiler/src/jdk.graal.compiler/src/jdk/graal/compiler/hotspot/replacements/InstanceOfSnippets.java#L169) will continue to work until we get around to doing the port right?
>>
>> Absolutely, yes. This is purely an add-on for now.
>
>> > Hi Andrew, we'd most likely want to port this to Graal. Just for my understanding, the current implementation of [secondary super cache lookup in Graal](https://github.com/oracle/graal/blob/0a471fc329ea2c8c18c05f9d83347dc1199c3ad9/compiler/src/jdk.graal.compiler/src/jdk/graal/compiler/hotspot/replacements/InstanceOfSnippets.java#L169) will continue to work until we get around to doing the port right?
>>
>> Absolutely, yes.
This is purely an add-on for now.
>
> To clarify: I can't see anything in there that is sensitive to the order in which the secondary supers appear, and unless code looks at the hash bitmap, the only change it sees is the order of the secondaries.

@theRealAph I applied your patch and ran SA tests. I did not see the failures you mentioned to me. Is there anything else special I need to do?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18309#issuecomment-2013022762

From mli at openjdk.org Thu Mar 21 17:28:22 2024
From: mli at openjdk.org (Hamlin Li)
Date: Thu, 21 Mar 2024 17:28:22 GMT
Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4]
In-Reply-To:
References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> <_XOfFT147tFIxCtHn7YXCxdeA18fxWHJUPTsaIa3W5k=.936f1797-88fc-408b-91ae-1471a68b0c1f@github.com>
Message-ID:

On Thu, 21 Mar 2024 15:43:28 GMT, Magnus Ihse Bursie wrote:

>>> > The problem I see is that J. Random Java User has no way to know if SLEEF is making their program faster without running benchmarks. They'll put SLEEF somewhere and hope that Java uses it.
>>>
>>> Please kindly correct me if I misunderstood your points. Seems the safest solution to address your above concerns is to integrate the sleef source into jdk? Lack of sleef at either build time or runtime will make the user's code fall back to java implementation.
>>
>> Exactly, yes. That's why we've integrated the source code of many other libraries we depend on into the JDK. It's the only way to get the reliability our users expect.
>
> @theRealAph Are you saying that bundling the source code of libsleef is a hard requirement from your side to accept this code into the JDK?
>
> I'm not against it, I just want to understand what we're talking about here.
>
> In general, adding new libraries to OpenJDK will require a legal process in Oracle, which may (or may not) take some amount of time T, where T is larger than you'd wish for.
>
> So I guess we can either:
> 1) wait for libsleef source code to become a part of OpenJDK, and then integrate this PR.
> 2) integrate this PR optimistically, and in the background start a process of trying to get libsleef into OpenJDK. (Which, of course, can not be 100% guaranteed to happen.)
> 3) integrate this PR as is, and give up any idea of bundling libsleef.
>
> I also believe there is a fourth option, but that too seems like it has legal implications that needs to be checked:
>
> 4) if the libsleef dynamic library is found on the system during build time, bundle a copy of the dll with the built JDK. (Similar to how was done with freetype on Windows before.). And, optionally, provide an option for configure to require libsleef to be present, so the build fails if no libsleef can be found and bundled. (Leaving open as to if this should be default or not.)

@magicus @theRealAph Thanks for the discussion.

> I think that's probably the best thing to do today. I will approve it.

Great, thanks.

> I'd prefer, long term, to integrate SLEEF into OpenJDK. That would make AArch64 support similar to Intel. We are competing with Intel.

I'm working on it, running some tests. But I think, as @magicus pointed out, one of the most important tasks in achieving this target is to get the legal process done.

> In the short term, I'd build a shim library during the default standard JDK build that does not need SLEEF at build time. Unless we do that, because SLEEF isn't on anyone's builders, it won't be used.

I agree it gives users more chance to leverage sleef at runtime, but I wonder whether it's necessary. We have a long-term solution, and the chance that an end user lacks the sleef library in their environment is much higher than the chance that a release engineer lacks it in theirs. But there's no harm in doing so.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-2013123992

From jvernee at openjdk.org Thu Mar 21 19:02:21 2024
From: jvernee at openjdk.org (Jorn Vernee)
Date: Thu, 21 Mar 2024 19:02:21 GMT
Subject: RFR: 8328679: Improve comment for UNSAFE_ENTRY_SCOPED in unsafe.cpp
In-Reply-To:
References:
Message-ID:

On Thu, 21 Mar 2024 11:04:41 GMT, Maurizio Cimadamore wrote:

> This small PR is an attempt at improving the comment in the UNSAFE_ENTRY_SCOPED macro.
> I found that the comment contained some duplicated claims, did not properly introduced the concepts it referred to, and some of its claims were not explained in full.
>
> As this is a delicate part of how FFM ensures the "no use after free" guarantee, I think it would be a good idea for this comment to better reflect what the expectations are.

src/hotspot/share/prims/unsafe.cpp line 85:

> 83: //
> 84: // Closing a scope object (cf. scopedMemoryAccess.cpp) can install
> 85: // an async handshake on the entry to scoped method. When that happens,

Suggestion:

// an async exception during a safepoint. When that happens,

src/hotspot/share/prims/unsafe.cpp line 94:

> 92: // If an async exception handshake were installed in such a safepoint,
> 93: // memory access might still occur before the handshake is honored by
> 94: // the accessing thread.

It's not so much that we can't have safepoints, it's more that we avoid them, since they allow an async exception to be installed, which in turn allows memory to be freed.

Maybe you could put something like: `As a rule, we disallow further safepoints within a scoped method` at the end of the previous (2nd) paragraph, and then drop the 3rd paragraph.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18429#discussion_r1534482542 PR Review Comment: https://git.openjdk.org/jdk/pull/18429#discussion_r1534487539 From mcimadamore at openjdk.org Thu Mar 21 19:18:34 2024 From: mcimadamore at openjdk.org (Maurizio Cimadamore) Date: Thu, 21 Mar 2024 19:18:34 GMT Subject: RFR: 8328679: Improve comment for UNSAFE_ENTRY_SCOPED in unsafe.cpp [v2] In-Reply-To: References: Message-ID: > This small PR is an attempt at improving the comment in the UNSAFE_ENTRY_SCOPED macro. > I found that the comment contained some duplicated claims, did not properly introduced the concepts it referred to, and some of its claims were not explained in full. > > As this is a delicate part of how FFM ensures the "no use after free" guarantee, I think it would be a good idea for this comment to better reflect what the expectations are. Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: Update src/hotspot/share/prims/unsafe.cpp Co-authored-by: Jorn Vernee ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18429/files - new: https://git.openjdk.org/jdk/pull/18429/files/42e8c416..1f1c0684 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18429&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18429&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18429.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18429/head:pull/18429 PR: https://git.openjdk.org/jdk/pull/18429 From mcimadamore at openjdk.org Thu Mar 21 19:26:33 2024 From: mcimadamore at openjdk.org (Maurizio Cimadamore) Date: Thu, 21 Mar 2024 19:26:33 GMT Subject: RFR: 8328679: Improve comment for UNSAFE_ENTRY_SCOPED in unsafe.cpp [v3] In-Reply-To: References: Message-ID: > This small PR is an attempt at improving the comment in the UNSAFE_ENTRY_SCOPED macro. 
> I found that the comment contained some duplicated claims, did not properly introduced the concepts it referred to, and some of its claims were not explained in full. > > As this is a delicate part of how FFM ensures the "no use after free" guarantee, I think it would be a good idea for this comment to better reflect what the expectations are. Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: Address review comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18429/files - new: https://git.openjdk.org/jdk/pull/18429/files/1f1c0684..35246ded Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18429&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18429&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18429.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18429/head:pull/18429 PR: https://git.openjdk.org/jdk/pull/18429 From mcimadamore at openjdk.org Thu Mar 21 19:26:33 2024 From: mcimadamore at openjdk.org (Maurizio Cimadamore) Date: Thu, 21 Mar 2024 19:26:33 GMT Subject: RFR: 8328679: Improve comment for UNSAFE_ENTRY_SCOPED in unsafe.cpp [v3] In-Reply-To: References: Message-ID: On Thu, 21 Mar 2024 18:57:23 GMT, Jorn Vernee wrote: >> Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: >> >> Address review comments > > src/hotspot/share/prims/unsafe.cpp line 94: > >> 92: // If an async exception handshake were installed in such a safepoint, >> 93: // memory access might still occur before the handshake is honored by >> 94: // the accessing thread. > > It's not so much that we can't have safepoints, it's more that we avoid them, since they allow async exception to be installed, which in turn allows memory to be freed. 
> > Maybe you could put something like: `As a rule, we disallow further safepoints within a scoped method` at the end of the previous (2nd) paragraph, and then drop the 3rd paragraph. I've adopted your `As a rule` suggestion, but kept the para, as IMHO the structure is important: 1. first para talks about on-entry 2. second para talks about safe point restrictions in general 3. third para adds detail on the fact that no safe point means "no native state" ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18429#discussion_r1534537318 From jvernee at openjdk.org Thu Mar 21 19:29:21 2024 From: jvernee at openjdk.org (Jorn Vernee) Date: Thu, 21 Mar 2024 19:29:21 GMT Subject: RFR: 8328679: Improve comment for UNSAFE_ENTRY_SCOPED in unsafe.cpp [v3] In-Reply-To: References: Message-ID: On Thu, 21 Mar 2024 19:26:33 GMT, Maurizio Cimadamore wrote: >> This small PR is an attempt at improving the comment in the UNSAFE_ENTRY_SCOPED macro. >> I found that the comment contained some duplicated claims, did not properly introduced the concepts it referred to, and some of its claims were not explained in full. >> >> As this is a delicate part of how FFM ensures the "no use after free" guarantee, I think it would be a good idea for this comment to better reflect what the expectations are. > > Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: > > Address review comments Marked as reviewed by jvernee (Reviewer). 
-------------

PR Review: https://git.openjdk.org/jdk/pull/18429#pullrequestreview-1953210077

From cslucas at openjdk.org Thu Mar 21 22:05:32 2024
From: cslucas at openjdk.org (Cesar Soares Lucas)
Date: Thu, 21 Mar 2024 22:05:32 GMT
Subject: RFR: JDK-8316991: Reduce nullable allocation merges [v8]
In-Reply-To: <_DCkU0ejXoVIyAy1q_anbEoXkSWtMHuczrTGitIi1X8=.3c39f56e-d4bf-4673-a325-c6ac02620e79@github.com>
References: <_DCkU0ejXoVIyAy1q_anbEoXkSWtMHuczrTGitIi1X8=.3c39f56e-d4bf-4673-a325-c6ac02620e79@github.com>
Message-ID:

On Tue, 19 Mar 2024 23:18:55 GMT, Cesar Soares Lucas wrote:

>> ### Description
>>
>> Many, if not most, allocation merges (Phis) are nullable because they join object allocations with "NULL", or objects returned from method calls, etc. Please review this Pull Request that improves Reduce Allocation Merge implementation so that it can reduce at least some of these allocation merges.
>>
>> Overall, the improvements are related to 1) making rematerialization of merges able to represent "NULL" objects, and 2) being able to reduce merges used by CmpP/N and CastPP.
>>
>> The approach to reducing CmpP/N and CastPP is pretty similar to that used in the `MemNode::split_through_phi` method: a clone of the node being split is added on each input of the Phi. I make use of `optimize_ptr_compare` and some type information to remove redundant CmpP and CastPP nodes. I added a bunch of ASCII diagrams illustrating what some of the more important methods are doing.
>>
>> ### Benchmarking
>>
>> **Note:** In some of these tests no reduction happens. I left them in to validate that no perf. regression happens in that case.
>> **Note 2:** Margin of error was negligible.
>>
>> | Benchmark                      | No RAM (ms/op) | Yes RAM (ms/op) |
>> |--------------------------------|----------------|-----------------|
>> | TestTrapAfterMerge             | 19.515         | 13.386          |
>> | TestArgEscape                  | 33.165         | 33.254          |
>> | TestCallTwoSide                | 70.547         | 69.427          |
>> | TestCmpAfterMerge              | 16.400         | 2.984           |
>> | TestCmpMergeWithNull_Second    | 27.204         | 27.293          |
>> | TestCmpMergeWithNull           | 8.248          | 4.920           |
>> | TestCondAfterMergeWithAllocate | 12.890         | 5.252           |
>> | TestCondAfterMergeWithNull     | 6.265          | 5.078           |
>> | TestCondLoadAfterMerge         | 12.713         | 5.163           |
>> | TestConsecutiveSimpleMerge     | 30.863         | 4.068           |
>> | TestDoubleIfElseMerge          | 16.069         | 2.444           |
>> | TestEscapeInCallAfterMerge     | 23.111         | 22.924          |
>> | TestGlobalEscape               | 14.459         | 14.425          |
>> | TestIfElseInLoop               | 246.061        | 42.786          |
>> | TestLoadAfterLoopAlias         | 45.808         | 45.812          |
>> ...
>
> Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 12 commits:
>
> - Catching up with master
> - Fix broken build.
> - Merge with origin/master
> - Update test/micro/org/openjdk/bench/vm/compiler/AllocationMerges.java
>
>   Co-authored-by: Andrey Turbanov
> - Ammend previous fix & add repro tests.
> - Fix to prevent reducing already reduced Phi
> - Fix to prevent creating NULL ConNKlass constants.
> - Refrain from RAM of arrays and Phis controlled by Loop nodes.
> - Fix typo in test.
> - Fix build after merge.
> - ... and 2 more: https://git.openjdk.org/jdk/compare/7231fd78...620f9da8

@vnkozlov @TobiHartmann - Can you please run your tests again on this if that's not too much to ask?

Can someone please take a look at these changes?
/cc @iwanowww ------------- PR Comment: https://git.openjdk.org/jdk/pull/15825#issuecomment-2013904877 From ioi.lam at oracle.com Thu Mar 21 22:51:22 2024 From: ioi.lam at oracle.com (ioi.lam at oracle.com) Date: Thu, 21 Mar 2024 15:51:22 -0700 Subject: RFC: JEP: CDS Archived Object Streaming In-Reply-To: <7602C9F9-5381-4E05-BB99-EB0F9D12FCCA@oracle.com> References: <7602C9F9-5381-4E05-BB99-EB0F9D12FCCA@oracle.com> Message-ID: <79590c60-5571-4edb-be00-7553e24e2493@oracle.com> Hi Erik, Thanks for posting the draft JEP. I am in support of this change, as it - enables CDS archived heap for ZGC (and possibly other collectors that do not support the CDS archived heap today) - gives us the ability to evaluate the existing mmap approach vs the new streaming approach. As mentioned in the JEP, a full evaluation may look at different areas such as scalability and maintainability. In Project Leyden [1], we are likely to expand the use of archived heap objects. I think development in Leyden will guide us in further developing these two approaches. We may eventually come to a conclusion to have a single approach (either by picking one of them, or by combining the aspects of the two approaches in some way). There are a lot of unknowns going forward, but I think this is a very exciting start. Again, thank you Erik for moving this forward. - Ioi [1] https://openjdk.org/projects/leyden/ On 3/20/24 2:35 AM, Erik Osterlund wrote: > Hi, > > I have written a draft JEP for a new GC agnostic CDS object archiving > mechanism that we intend to use at least in ZGC. > JEP description is available here: > https://bugs.openjdk.org/browse/JDK-8326035 > > Comments and feedback are welcome. > > Thanks, > /Erik -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From kvn at openjdk.org Thu Mar 21 23:00:25 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 21 Mar 2024 23:00:25 GMT Subject: RFR: JDK-8316991: Reduce nullable allocation merges [v8] In-Reply-To: References: <_DCkU0ejXoVIyAy1q_anbEoXkSWtMHuczrTGitIi1X8=.3c39f56e-d4bf-4673-a325-c6ac02620e79@github.com> Message-ID: On Thu, 21 Mar 2024 22:02:54 GMT, Cesar Soares Lucas wrote: >> Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 12 commits: >> >> - Catching up with master >> - Fix broken build. >> - Merge with origin/master >> - Update test/micro/org/openjdk/bench/vm/compiler/AllocationMerges.java >> >> Co-authored-by: Andrey Turbanov >> - Ammend previous fix & add repro tests. >> - Fix to prevent reducing already reduced Phi >> - Fix to prevent creating NULL ConNKlass constants. >> - Refrain from RAM of arrays and Phis controlled by Loop nodes. >> - Fix typo in test. >> - Fix build after merge. >> - ... and 2 more: https://git.openjdk.org/jdk/compare/7231fd78...620f9da8 > > @vnkozlov @TobiHartmann - Can you please run your tests again on this if that's not too much to ask? > > Can someone please take a look at these changes? /cc @iwanowww @JohnTortugo I thought you will also enable `ReduceAllocationMerges` by default. ------------- PR Comment: https://git.openjdk.org/jdk/pull/15825#issuecomment-2013991787 From sspitsyn at openjdk.org Fri Mar 22 00:32:33 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 22 Mar 2024 00:32:33 GMT Subject: Integrated: 8328285: GetOwnedMonitorInfo functions should use JvmtiHandshake In-Reply-To: References: Message-ID: On Fri, 15 Mar 2024 20:14:22 GMT, Serguei Spitsyn wrote: > The `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. 
This enhancement is to refactor JVM TI functions `GetOwnedMonitorInfo` and `GetOwnedMonitorStackDepthInfo` on the base of `JvmtiHandshake and `JvmtiUnitedHandshakeClosure` classes. > > Testing: > - Ran mach5 tiers 1-6 This pull request has now been integrated. Changeset: 4d36c4ad Author: Serguei Spitsyn URL: https://git.openjdk.org/jdk/commit/4d36c4adcc47630ddc7149c48c06dc8a93c1be5c Stats: 129 lines in 4 files changed: 33 ins; 74 del; 22 mod 8328285: GetOwnedMonitorInfo functions should use JvmtiHandshake Reviewed-by: pchilanomate, lmesnik ------------- PR: https://git.openjdk.org/jdk/pull/18332 From cslucas at openjdk.org Fri Mar 22 00:35:28 2024 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Fri, 22 Mar 2024 00:35:28 GMT Subject: RFR: JDK-8316991: Reduce nullable allocation merges [v8] In-Reply-To: References: <_DCkU0ejXoVIyAy1q_anbEoXkSWtMHuczrTGitIi1X8=.3c39f56e-d4bf-4673-a325-c6ac02620e79@github.com> Message-ID: On Thu, 21 Mar 2024 22:57:23 GMT, Vladimir Kozlov wrote: >> @vnkozlov @TobiHartmann - Can you please run your tests again on this if that's not too much to ask? >> >> Can someone please take a look at these changes? /cc @iwanowww > > @JohnTortugo I thought you will also enable `ReduceAllocationMerges` by default. @vnkozlov - turns out I made a confusion earlier. The optimization was disabled only in the jdk22 repo; it's currently enabled in JDK Tip. ------------- PR Comment: https://git.openjdk.org/jdk/pull/15825#issuecomment-2014105700 From cslucas at openjdk.org Fri Mar 22 00:35:28 2024 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Fri, 22 Mar 2024 00:35:28 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission [v9] In-Reply-To: References: Message-ID: > # Description > > Please review this PR with a patch to re-use the same C2_MacroAssembler object to emit all instructions in the same compilation unit. > > Overall, the change is pretty simple. 
However, due to the renaming of the variable to access C2_MacroAssembler, from `_masm.` to `masm->`, and also some method prototype changes, the patch became quite large.
>
> # Help Needed for Testing
>
> I don't have access to all platforms necessary to test this. I hope some other folks can help with testing on `S390`, `RISC-V` and `PPC`.
>
> # Tier-1 Testing status
>
> |       | Win | Mac | Linux |
> |-------|-----|-----|-------|
> | ARM64 | ?   | ?   |       |
> | ARM32 | n/a | n/a |       |
> | x86   |     |     | ?     |
> | x64   | ?   | ?   | ?     |
> | PPC64 | n/a | n/a |       |
> | S390x | n/a | n/a |       |
> | RiscV | n/a | n/a | ?     |

Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 10 commits:

- Catching up with changes in master
- Catching up with origin/master
- Catch up with origin/master
- Merge with origin/master
- Fix build, copyright dates, m4 files.
- Fix merge
- Catch up with master branch.

  Merge remote-tracking branch 'origin/master' into reuse-macroasm
- Some inst_mark fixes; Catch up with master.
- Catch up with changes on master
- Reuse same C2_MacroAssembler object to emit instructions.
------------- Changes: https://git.openjdk.org/jdk/pull/16484/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16484&range=08 Stats: 2152 lines in 60 files changed: 136 ins; 431 del; 1585 mod Patch: https://git.openjdk.org/jdk/pull/16484.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16484/head:pull/16484 PR: https://git.openjdk.org/jdk/pull/16484 From kvn at openjdk.org Fri Mar 22 00:56:32 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 22 Mar 2024 00:56:32 GMT Subject: RFR: JDK-8316991: Reduce nullable allocation merges [v8] In-Reply-To: <_DCkU0ejXoVIyAy1q_anbEoXkSWtMHuczrTGitIi1X8=.3c39f56e-d4bf-4673-a325-c6ac02620e79@github.com> References: <_DCkU0ejXoVIyAy1q_anbEoXkSWtMHuczrTGitIi1X8=.3c39f56e-d4bf-4673-a325-c6ac02620e79@github.com> Message-ID: <3lNNenuz-e6q84ZJmTxQ425Z31mNXp5e5PqhhfXSUPs=.59818092-73c1-4498-a76c-d621a9420e46@github.com> On Tue, 19 Mar 2024 23:18:55 GMT, Cesar Soares Lucas wrote: >> ### Description >> >> Many, if not most, allocation merges (Phis) are nullable because they join object allocations with "NULL", or objects returned from method calls, etc. Please review this Pull Request that improves Reduce Allocation Merge implementation so that it can reduce at least some of these allocation merges. >> >> Overall, the improvements are related to 1) making rematerialization of merges able to represent "NULL" objects, and 2) being able to reduce merges used by CmpP/N and CastPP. >> >> The approach to reducing CmpP/N and CastPP is pretty similar to that used in the `MemNode::split_through_phi` method: a clone of the node being split is added on each input of the Phi. I make use of `optimize_ptr_compare` and some type information to remove redundant CmpP and CastPP nodes. I added a bunch of ASCII diagrams illustrating what some of the more important methods are doing. >> >> ### Benchmarking >> >> **Note:** In some of these tests no reduction happens. I left them in to validate that no perf. 
regression happens in that case.
>> **Note 2:** Margin of error was negligible.
>>
>> | Benchmark                      | No RAM (ms/op) | Yes RAM (ms/op) |
>> |--------------------------------|----------------|-----------------|
>> | TestTrapAfterMerge             | 19.515         | 13.386          |
>> | TestArgEscape                  | 33.165         | 33.254          |
>> | TestCallTwoSide                | 70.547         | 69.427          |
>> | TestCmpAfterMerge              | 16.400         | 2.984           |
>> | TestCmpMergeWithNull_Second    | 27.204         | 27.293          |
>> | TestCmpMergeWithNull           | 8.248          | 4.920           |
>> | TestCondAfterMergeWithAllocate | 12.890         | 5.252           |
>> | TestCondAfterMergeWithNull     | 6.265          | 5.078           |
>> | TestCondLoadAfterMerge         | 12.713         | 5.163           |
>> | TestConsecutiveSimpleMerge     | 30.863         | 4.068           |
>> | TestDoubleIfElseMerge          | 16.069         | 2.444           |
>> | TestEscapeInCallAfterMerge     | 23.111         | 22.924          |
>> | TestGlobalEscape               | 14.459         | 14.425          |
>> | TestIfElseInLoop               | 246.061        | 42.786          |
>> | TestLoadAfterLoopAlias         | 45.808         | 45.812          |
>> ...
>
> Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 12 commits:
>
> - Catching up with master
> - Fix broken build.
> - Merge with origin/master
> - Update test/micro/org/openjdk/bench/vm/compiler/AllocationMerges.java
>
>   Co-authored-by: Andrey Turbanov
> - Ammend previous fix & add repro tests.
> - Fix to prevent reducing already reduced Phi
> - Fix to prevent creating NULL ConNKlass constants.
> - Refrain from RAM of arrays and Phis controlled by Loop nodes.
> - Fix typo in test.
> - Fix build after merge.
> - ... and 2 more: https://git.openjdk.org/jdk/compare/7231fd78...620f9da8

Okay. I submitted our testing.
------------- PR Comment: https://git.openjdk.org/jdk/pull/15825#issuecomment-2014123562 From sspitsyn at openjdk.org Fri Mar 22 02:50:27 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 22 Mar 2024 02:50:27 GMT Subject: RFR: 8328758: GetCurrentContendedMonitor function should use JvmtiHandshake Message-ID: The internal JVM TI `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. This enhancement is to refactor JVM TI `GetCurrentContendedMonitor` function on the base of `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes. Testing: - Ran mach5 tiers 1-6 ------------- Commit messages: - 8328758: GetCurrentContendedMonitor function should use JvmtiHandshake Changes: https://git.openjdk.org/jdk/pull/18444/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18444&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8328758 Stats: 57 lines in 3 files changed: 13 ins; 32 del; 12 mod Patch: https://git.openjdk.org/jdk/pull/18444.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18444/head:pull/18444 PR: https://git.openjdk.org/jdk/pull/18444 From dholmes at openjdk.org Fri Mar 22 03:26:21 2024 From: dholmes at openjdk.org (David Holmes) Date: Fri, 22 Mar 2024 03:26:21 GMT Subject: RFR: 8328679: Improve comment for UNSAFE_ENTRY_SCOPED in unsafe.cpp [v3] In-Reply-To: References: Message-ID: On Thu, 21 Mar 2024 19:26:33 GMT, Maurizio Cimadamore wrote: >> This small PR is an attempt at improving the comment in the UNSAFE_ENTRY_SCOPED macro. >> I found that the comment contained some duplicated claims, did not properly introduced the concepts it referred to, and some of its claims were not explained in full. >> >> As this is a delicate part of how FFM ensures the "no use after free" guarantee, I think it would be a good idea for this comment to better reflect what the expectations are. 
> > Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision: > > Address review comments That all reads fine to me too. Thanks ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18429#pullrequestreview-1953872252 From aph at openjdk.org Fri Mar 22 08:54:23 2024 From: aph at openjdk.org (Andrew Haley) Date: Fri, 22 Mar 2024 08:54:23 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4] In-Reply-To: References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> <_XOfFT147tFIxCtHn7YXCxdeA18fxWHJUPTsaIa3W5k=.936f1797-88fc-408b-91ae-1471a68b0c1f@github.com> Message-ID: On Thu, 21 Mar 2024 15:43:28 GMT, Magnus Ihse Bursie wrote: >>> > The problem I see is that J. Random Java User has no way to know if SLEEF is making their program faster without running benchmarks. They'll put SLEEF somewhere and hope that Java uses it. >>> >>> Please kindly correct me if I misunderstood your points. Seems the safest solution to address your above concerns is to integrate the sleef source into jdk? Lack of sleef at either build time or runtime will make the user's code fall back to java implementation. >> >> Exactly, yes. That's why we've integrated the source code of many other libraries we depend on into the JDK. It's the only way to get the reliability our users expect. > > @theRealAph Are you saying that bundling the source code of libsleef is a hard requirement from your side to accept this code into the JDK? > > I'm not against it, I just want to understand what we're talking about here. > > In general, adding new libraries to OpenJDK will require a legal process in Oracle, which may (or may not) take some amount of time T, where T is larger than you'd wish for. 
>
> So I guess we can either:
> 1) wait for libsleef source code to become a part of OpenJDK, and then integrate this PR.
> 2) integrate this PR optimistically, and in the background start a process of trying to get libsleef into OpenJDK. (Which, of course, can not be 100% guaranteed to happen.)
> 3) integrate this PR as is, and give up any idea of bundling libsleef.
>
> I also believe there is a fourth option, but that too seems like it has legal implications that needs to be checked:
>
> 4) if the libsleef dynamic library is found on the system during build time, bundle a copy of the dll with the built JDK. (Similar to how was done with freetype on Windows before.). And, optionally, provide an option for configure to require libsleef to be present, so the build fails if no libsleef can be found and bundled. (Leaving open as to if this should be default or not.)

> @magicus @theRealAph Thanks for discussion.
> > In the short term, I'd build a shim library during the default standard JDK build that does not need SLEEF at build time. Unless we do that, because SLEEF isn't on anyone's builders, it won't be used.
>
> I agree it's give user more chance to leverage the sleef when running, but I wonder if it's necessary to do that. As we have a long term solution, and the chance that end user lacks sleef library in their environment is much higher than release engineer lacks sleef library in their environment.

That situation leads to my nightmare, in which some OpenJDK builds work with vector support, and some don't, and there's no way to find out which unless you try it, in which case there is no warning, your programs just run slowly. Even if you've installed SLEEF. That would be bad for our users, bad for OpenJDK, and bad for Java.

> But it's not harm to do so.
------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-2014634593 From ihse at openjdk.org Fri Mar 22 10:25:25 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Fri, 22 Mar 2024 10:25:25 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4] In-Reply-To: References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> Message-ID: On Fri, 15 Mar 2024 16:58:06 GMT, Andrew Haley wrote: > What is the relevance of SVE support at build time? Should it matter what the build machine is? My understanding is that we need a compiler that supports `-march=armv8-a+sve`; otherwise we can't build the redirect library properly. But maybe that is a misunderstanding? ------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-2014782528 From rehn at openjdk.org Fri Mar 22 10:33:23 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 22 Mar 2024 10:33:23 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4] In-Reply-To: References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> Message-ID: On Fri, 22 Mar 2024 10:22:54 GMT, Magnus Ihse Bursie wrote: > > What is the relevance of SVE support at build time? Should it matter what the build machine is? > > My understanding is that we need a compiler that supports `-march=armv8-a+sve`; otherwise we can't build the redirect library properly. But maybe that is a misunderstanding? Yes, it should be in gcc 8 and above, and [we seem to require gcc 10](https://openjdk.org/groups/build/doc/building.html). 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-2014793399 From ihse at openjdk.org Fri Mar 22 10:33:24 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Fri, 22 Mar 2024 10:33:24 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4] In-Reply-To: References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> <_XOfFT147tFIxCtHn7YXCxdeA18fxWHJUPTsaIa3W5k=.936f1797-88fc-408b-91ae-1471a68b0c1f@github.com> Message-ID: On Fri, 22 Mar 2024 08:51:47 GMT, Andrew Haley wrote: >> @theRealAph Are you saying that bundling the source code of libsleef is a hard requirement from your side to accept this code into the JDK? >> >> I'm not against it, I just want to understand what we're talking about here. >> >> In general, adding new libraries to OpenJDK will require a legal process in Oracle, which may (or may not) take some amount of time T, where T is larger than you'd wish for. >> >> So I guess we can either: >> 1) wait for libsleef source code to become a part of OpenJDK, and then integrate this PR. >> 2) integrate this PR optimistically, and in the background start a process of trying to get libsleef into OpenJDK. (Which, of course, can not be 100% guaranteed to happen.) >> 3) integrate this PR as is, and give up any idea of bundling libsleef. >> >> I also believe there is a fourth option, but that too seems like it has legal implications that needs to be checked: >> >> 4) if the libsleef dynamic library is found on the system during build time, bundle a copy of the dll with the built JDK. (Similar to how was done with freetype on Windows before.). And, optionally, provide an option for configure to require libsleef to be present, so the build fails if no libsleef can be found and bundled. (Leaving open as to if this should be default or not.) 
>
>> @magicus @theRealAph Thanks for discussion.
>> > In the short term, I'd build a shim library during the default standard JDK build that does not need SLEEF at build time. Unless we do that, because SLEEF isn't on anyone's builders, it won't be used.
>>
>> I agree it's give user more chance to leverage the sleef when running, but I wonder if it's necessary to do that. As we have a long term solution, and the chance that end user lacks sleef library in their environment is much higher than release engineer lacks sleef library in their environment.
>
> That situation leads to my nightmare, in which some OpenJDK builds work with vector support, and some don't, and there's no way to find out which unless you try it, in which case there is no warning, your programs just run slowly. Even if you've installed SLEEF. That would be bad for our users, bad for OpenJDK, and bad for Java.
>
>> But it's not harm to do so.

@theRealAph An alternative to building libsleef is to *require* it as a build prerequisite on linux/aarch64. That way you will at least not end up in the situation you fear, where no build environment for the popular distributions have libsleef during build time, and in practice no end user will benefit either.

Otoh, doing that will require us to send out the message for downstream builders to prepare, and waiting for them to prepare, and the question is if this is faster than bundling the source code... And if we're going the latter route anyway, then this approach will have been pointless.

So, I guess, this is a worse solution. Just wanted to point out that it exists.
------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-2014795185 From mdoerr at openjdk.org Fri Mar 22 10:39:23 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 22 Mar 2024 10:39:23 GMT Subject: RFR: 8327652: S390x: Implements SLP support [v5] In-Reply-To: <9csEuN4OqSaYAoZir8OCvudRKhirezZ9qxNlgXs7tyU=.83707471-1d5e-4a1a-aaf9-cb62169e9c2d@github.com> References: <9csEuN4OqSaYAoZir8OCvudRKhirezZ9qxNlgXs7tyU=.83707471-1d5e-4a1a-aaf9-cb62169e9c2d@github.com> Message-ID: On Wed, 20 Mar 2024 16:32:50 GMT, Sidraya Jayagond wrote: >> This PR Adds SIMD support on s390x. > > Sidraya Jayagond has updated the pull request incrementally with one additional commit since the last revision: > > Address code cleanup review comments LGTM. Please make sure to have substantial test coverage with AND without SuperwordUseVX enabled. The assembler instructions and their usage need to be reviewed by somebody else. ------------- Marked as reviewed by mdoerr (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18162#pullrequestreview-1954477673 From ihse at openjdk.org Fri Mar 22 11:24:25 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Fri, 22 Mar 2024 11:24:25 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4] In-Reply-To: <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> Message-ID: <_i8TKTamrl3rq7z7C-C7eIS-Bqv7Rf_xF2PSMU-LuvI=.b0873045-9efc-4f5c-bed8-0a1e9f7554b8@github.com> On Fri, 15 Mar 2024 13:58:05 GMT, Hamlin Li wrote: >> Hi, >> Can you help to review this patch? >> Thanks >> >> This is a continuation of work based on [1] by @XiaohongGong, most work was done in that pr. 
In this new pr, just rebased the code in [1], then added some minor changes (renaming of calling method, add libsleef as extra lib in CI cross-build on aarch64 in github workflow); I also tested the combination of following scenarios:
>> * at build time
>>   * with/without sleef
>>   * with/without sve support
>> * at runtime
>>   * with/without sleef
>>   * with/without sve support
>>
>> [1] https://github.com/openjdk/jdk/pull/16234
>>
>> ## Regression Test
>> * test/jdk/jdk/incubator/vector/
>> * test/hotspot/jtreg/compiler/vectorapi/
>>
>> ## Performance Test
>> Previously, @XiaohongGong has shared the data: https://github.com/openjdk/jdk/pull/16234#issuecomment-1767727028
>
> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision:
>
>   fix jni includes

Some updates: I had a look at the source code for libsleef, and frankly, to integrate building of libsleef into the OpenJDK build is going to be a major PITA. :-( They have a cmake based build system, which generates a ton of native build tools, which preprocess source code and generates output artifacts (like sleef.h). I guess we *could* do it, but I am terrified about trying to get that to work, especially for cross-compilation.

So after seeing this, I think the better solution is to require a pre-compiled libsleef present in the system. (What I just called the "worse solution" an hour ago...). I guess we can make the requirement of libsleef the default value, but allow the user to override this to skip libsleef. At least then it is an active choice to disable libsleef for your build.

Or, we could do something like how we used to do for freetype: have a --with-libsleef-src that points to a checked out version of libsleef, and let configure build it. In this case we could even build it as a static archive, so there'd be no need for libvectormath.so to have any external dependencies.

In any case, we are going to have to consider carefully how to proceed.
------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-2014876520 From mdoerr at openjdk.org Fri Mar 22 11:31:28 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 22 Mar 2024 11:31:28 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission [v9] In-Reply-To: References: Message-ID: <7XyaAxApFNWbmO0yXfUF8lo4v-Xgyfg90RFrwE-lNOM=.083fce75-0f71-40ec-b208-30082a0e5b91@github.com> On Fri, 22 Mar 2024 00:35:28 GMT, Cesar Soares Lucas wrote: >> # Description >> >> Please review this PR with a patch to re-use the same C2_MacroAssembler object to emit all instructions in the same compilation unit. >> >> Overall, the change is pretty simple. However, due to the renaming of the variable to access C2_MacroAssembler, from `_masm.` to `masm->`, and also some method prototype changes, the patch became quite large. >> >> # Help Needed for Testing >> >> I don't have access to all platforms necessary to test this. I hope some other folks can help with testing on `S390`, `RISC-V` and `PPC`. >> >> # Tier-1 Testing status >> >> | | Win | Mac | Linux | >> |----------|---------|---------|---------| >> | ARM64 | ? | ? | | >> | ARM32 | n/a | n/a | | >> | x86 | | | ? | >> | x64 | ? | ? | ? | >> | PPC64 | n/a | n/a | | >> | S390x | n/a | n/a | | >> | RiscV | n/a | n/a | ? | > > Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 10 commits: > > - Catching up with changes in master > - Catching up with origin/master > - Catch up with origin/master > - Merge with origin/master > - Fix build, copyright dates, m4 files. > - Fix merge > - Catch up with master branch. > > Merge remote-tracking branch 'origin/master' into reuse-macroasm > - Some inst_mark fixes; Catch up with master. > - Catch up with changes on master > - Reuse same C2_MacroAssembler object to emit instructions. Thanks for the update! The PPC64 parts look good to me. 
Other platforms need be reviewed by other people (maintainers are listed here: https://wiki.openjdk.org/display/HotSpot/Ports). ------------- Marked as reviewed by mdoerr (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16484#pullrequestreview-1954565932 From aph at openjdk.org Fri Mar 22 12:34:24 2024 From: aph at openjdk.org (Andrew Haley) Date: Fri, 22 Mar 2024 12:34:24 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4] In-Reply-To: <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> Message-ID: <1w4y8kdPpONt2iyrqFdjmpDcqwVBoQfiOgK6J5cmFV4=.9cecfe26-6586-4a53-926c-4203b52260d8@github.com> On Fri, 15 Mar 2024 13:58:05 GMT, Hamlin Li wrote: >> Hi, >> Can you help to review this patch? >> Thanks >> >> This is a continuation of work based on [1] by @XiaohongGong, most work was done in that pr. In this new pr, just rebased the code in [1], then added some minor changes (renaming of calling method, add libsleef as extra lib in CI cross-build on aarch64 in github workflow); I aslo tested the combination of following scenarios: >> * at build time >> * with/without sleef >> * with/without sve support >> * at runtime >> * with/without sleef >> * with/without sve support >> >> [1] https://github.com/openjdk/jdk/pull/16234 >> >> ## Regression Test >> * test/jdk/jdk/incubator/vector/ >> * test/hotspot/jtreg/compiler/vectorapi/ >> >> ## Performance Test >> Previously, @XiaohongGong has shared the data: https://github.com/openjdk/jdk/pull/16234#issuecomment-1767727028 > > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > fix jni includes > > > What is the relevance of SVE support at build time? 
Should it matter what the build machine is? > > > > > > My understanding is that we need a compiler that supports `-march=armv8-a+sve`; otherwise we can't build the redirect library properly. But maybe that is a misunderstanding? > > I hope not, otherwise the resulting binary won't run on most systems. I can write you a redirect library that doesn't need any magic compiler options. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-2014979694 From aph at openjdk.org Fri Mar 22 12:34:24 2024 From: aph at openjdk.org (Andrew Haley) Date: Fri, 22 Mar 2024 12:34:24 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4] In-Reply-To: <1w4y8kdPpONt2iyrqFdjmpDcqwVBoQfiOgK6J5cmFV4=.9cecfe26-6586-4a53-926c-4203b52260d8@github.com> References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> <1w4y8kdPpONt2iyrqFdjmpDcqwVBoQfiOgK6J5cmFV4=.9cecfe26-6586-4a53-926c-4203b52260d8@github.com> Message-ID: <9H3PBcqWAZevaB5KU1_Lh5ptj1f365-iLiIyaBMnNyE=.ce6f80fb-4fe8-4cf4-b58e-456238637b1f@github.com> On Fri, 22 Mar 2024 12:31:01 GMT, Andrew Haley wrote: > > > What is the relevance of SVE support at build time? Should it matter what the build machine is? > > > > > > My understanding is that we need a compiler that supports `-march=armv8-a+sve`; otherwise we can't build the redirect library properly. But maybe that is a misunderstanding? > > I hope not, otherwise the resulting binary won't run on most systems. Ah, it'll only be the redirect library that's compiled with `-march=armv8-a+sve` Forget that. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-2014982742 From ihse at openjdk.org Fri Mar 22 12:44:23 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Fri, 22 Mar 2024 12:44:23 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4] In-Reply-To: References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> Message-ID: On Fri, 22 Mar 2024 10:29:36 GMT, Robbin Ehn wrote: >>> What is the relevance of SVE support at build time? Should it matter what the build machine is? >> >> My understanding is that we need a compiler that supports `-march=armv8-a+sve`; otherwise we can't build the redirect library properly. But maybe that is a misunderstanding? > >> > What is the relevance of SVE support at build time? Should it matter what the build machine is? >> >> My understanding is that we need a compiler that supports `-march=armv8-a+sve`; otherwise we can't build the redirect library properly. But maybe that is a misunderstanding? > > Yes, it should be in gcc 8 and above, and [we seem to require gcc 10](https://openjdk.org/groups/build/doc/building.html). @robehn Great! I guess that option goes hand-in-hand with arm_sve.h? If so, we can remove the complication of the SVE check completely. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-2015010158 From ihse at openjdk.org Fri Mar 22 12:44:23 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Fri, 22 Mar 2024 12:44:23 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4] In-Reply-To: <9H3PBcqWAZevaB5KU1_Lh5ptj1f365-iLiIyaBMnNyE=.ce6f80fb-4fe8-4cf4-b58e-456238637b1f@github.com> References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> <1w4y8kdPpONt2iyrqFdjmpDcqwVBoQfiOgK6J5cmFV4=.9cecfe26-6586-4a53-926c-4203b52260d8@github.com> <9H3PBcqWAZevaB5KU1_Lh5ptj1f365-iLiIyaBMnNyE=.ce6f80fb-4fe8-4cf4-b58e-456238637b1f@github.com> Message-ID: On Fri, 22 Mar 2024 12:31:53 GMT, Andrew Haley wrote: > Ah, it'll only be the redirect library that's compiled with -march=armv8-a+sve Forget that. But that raises an interesting question. What happens if you try to load a library compiled with `-march=armv8-a+sve` on a non-SVE system? Is the ELF flagged to require SVE so it will fail to load? I'm hoping this is the case -- if so, everything will work as it should in this PR, but I honestly don't know. (After spending like 10 years working with building, I still discover things I need to learn...). 
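
[Editorial sketch, not part of the original mail.] Whatever the loader does with such a library, one way to stay safe is to ask the kernel whether SVE is available before loading or calling anything built with `-march=armv8-a+sve`. The following is a minimal sketch under stated assumptions: the helper names `hwcap_has_sve` and `cpu_supports_sve` are hypothetical and not taken from the PR, and the fallback value for `HWCAP_SVE` is the one from the Linux arm64 uapi headers, used only when the system headers do not provide it.

```cpp
#if defined(__linux__)
#include <sys/auxv.h>   // getauxval, AT_HWCAP (Linux/glibc)
#endif

#ifndef HWCAP_SVE
// Assumption: bit position of the SVE hwcap on Linux/arm64.
#define HWCAP_SVE (1UL << 22)
#endif

// Pure helper (hypothetical name): tests the SVE bit in an AT_HWCAP value.
// Kept separate so the logic can be exercised without SVE hardware.
inline bool hwcap_has_sve(unsigned long hwcap) {
  return (hwcap & HWCAP_SVE) != 0;
}

// Hypothetical name: true only on Linux/AArch64 when the kernel reports SVE.
// On every other platform the answer is simply "no".
inline bool cpu_supports_sve() {
#if defined(__linux__) && defined(__aarch64__)
  return hwcap_has_sve(getauxval(AT_HWCAP));
#else
  return false;
#endif
}
```

A caller would check `cpu_supports_sve()` first and only then load the SVE-specific library, falling back to the non-SVE path otherwise.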
------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-2015020065 From sgehwolf at openjdk.org Fri Mar 22 12:56:21 2024 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Fri, 22 Mar 2024 12:56:21 GMT Subject: RFR: 8261242: [Linux] OSContainer::is_containerized() returns true when run outside a container In-Reply-To: References: Message-ID: <4igI0UicqIV8pAhAFkxreyNnV9rZiaZZ98JI5TZWpEo=.3a2747f4-bfc2-4e09-9172-2b2f6d9bc2d6@github.com> On Mon, 11 Mar 2024 16:55:36 GMT, Severin Gehwolf wrote: > Please review this enhancement to the container detection code which allows it to figure out whether the JVM is actually running inside a container (`podman`, `docker`, `crio`), or with some other means that enforces memory/cpu limits by means of the cgroup filesystem. If neither of those conditions hold, the JVM runs in not containerized mode, addressing the issue described in the JBS tracker. For example, on my Linux system `is_containerized() == false" is being indicated with the following trace log line: > > > [0.001s][debug][os,container] OSContainer::init: is_containerized() = false because no cpu or memory limit is present > > > This state is being exposed by the Java `Metrics` API class using the new (still JDK internal) `isContainerized()` method. Example: > > > java -XshowSettings:system --version > Operating System Metrics: > Provider: cgroupv1 > System not containerized. > openjdk 23-internal 2024-09-17 > OpenJDK Runtime Environment (fastdebug build 23-internal-adhoc.sgehwolf.jdk-jdk) > OpenJDK 64-Bit Server VM (fastdebug build 23-internal-adhoc.sgehwolf.jdk-jdk, mixed mode, sharing) > > > The basic property this is being built on is the observation that the cgroup controllers typically get mounted read only into containers. Note that the current container tests assert that `OSContainer::is_containerized() == true` in various tests. Therefore, using the heuristic of "is any memory or cpu limit present" isn't sufficient. 
I had considered that in an earlier iteration, but many container tests failed.
>
> Overall, I think, with this patch we improve the current situation of claiming a containerized system being present when it's actually just a regular Linux system.
>
> Testing:
>
> - [x] GHA (risc-v failure seems infra related)
> - [x] Container tests on Linux x86_64 of cgroups v1 and cgroups v2 (including gtests)
> - [x] Some manual testing using cri-o
>
> Thoughts?

Anyone willing to review this?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18201#issuecomment-2015043712

From stuefe at openjdk.org Fri Mar 22 14:40:34 2024
From: stuefe at openjdk.org (Thomas Stuefe)
Date: Fri, 22 Mar 2024 14:40:34 GMT
Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v5]
In-Reply-To: <-XAziSwGMo20pUAnbdRW1JUk_0ZB-80RVfAHr0iuewE=.bff8f2f7-01e2-46eb-bd4b-1b16fccc6aa1@github.com>
References: <-XAziSwGMo20pUAnbdRW1JUk_0ZB-80RVfAHr0iuewE=.bff8f2f7-01e2-46eb-bd4b-1b16fccc6aa1@github.com>
Message-ID: <3al4DjsRcIX_qJZNbTGqBDIAOj4bU5l8xpYPHQE8cNM=.7cc0bdfe-c9c8-46ce-ad42-397c61b5a603@github.com>

On Wed, 20 Mar 2024 17:06:06 GMT, Johan Sjölen wrote:

>> Hi,
>>
>> This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix.
>>
>> ## `MemoryFileTracker`
>>
>> The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is:
>>
>> ```c++
>> static MemoryFile* make_device(const char* descriptive_name);
>> static void free_device(MemoryFile* device);
>>
>> static void allocate_memory(MemoryFile* device, size_t offset, size_t size,
>>                             MEMFLAGS flag, const NativeCallStack& stack);
>> static void free_memory(MemoryFile* device, size_t offset, size_t size);
>> ```
>>
>> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does:
>>
>> ```c++
>> void ZNMT::reserve(zaddress_unsafe start, size_t size) {
>>   MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap);
>> }
>> void ZNMT::commit(zoffset offset, size_t size) {
>>   MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC);
>> }
>> void ZNMT::uncommit(zoffset offset, size_t size) {
>>   MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size);
>> }
>>
>> void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) {
>>   // NMT doesn't track mappings at the moment.
>> }
>> void ZNMT::unmap(zaddress_unsafe addr, size_t size) {
>>   // NMT doesn't track mappings at the moment.
>> }
>> ```
>>
>> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory.
>>
>> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are:
>>
>> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it.
Our Treap-based approach in this patch gives a performance bo...
>
> Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision:
>
>   Include os.inline.hpp

Hi @jdksjolen,

Nice work! The general direction is good.

I've glanced over parts of the change. Unfortunately, I won't be able to provide more feedback until after Easter because of a lack of time.

Cheers, Thomas

src/hotspot/share/nmt/nmtNativeCallStackStorage.hpp line 30:

> 28: #include "utilities/growableArray.hpp"
> 29: #include "utilities/nativeCallStack.hpp"
> 30:

Here, I would love it if we had smaller than pointer-sized callstack IDs. The simplest way would be to copy callstacks into a growable area, and use their numerical indices as ID. We can get by with 16 bits, or 32 bits if we think 64K callstacks are not enough (they should be enough). Those can then be combined very efficiently with MEMFLAG and VMAState in the VMATree nodes.

The hashmap for reverse id lookup could be a standard hashmap of key=callstack, value=ID.

As a future possible improvement, that could then replace the MallocSiteTable which does almost the same thing. (only have to avoid using malloc then, because of recursivities).

src/hotspot/share/nmt/nmtNativeCallStackStorage.hpp line 84:

> 82: return *_stack;
> 83: }
> 84: };

What is the point of the StackIndex class?

src/hotspot/share/nmt/nmtTreap.hpp line 45:

> 43: // the tree which is O(log n) so we are safe from stack overflow.
> 44:
> 45: // TreapNode has LEQ nodes on the left, GT nodes on the right.

Simplification: why do we need a separate template parameter for the comparison? Why not just use K's operators < > == ?

src/hotspot/share/nmt/nmtTreap.hpp line 48:

> 46: template
> 47: class TreapNode {
> 48: template

Could we do without the following friend declarations? Also, does the TreapCHeap friend not need to be declared as template?
(At least my CDT complains about this) src/hotspot/share/nmt/nmtTreap.hpp line 54: > 52: uint64_t _priority; > 53: K _key; > 54: V _value; Should both key and value be const? After all, you don't want to modify either after the node was added to the tree. src/hotspot/share/nmt/vmatree.hpp line 36: > 34: > 35: > 36: class VMATree { If we do keep a separate comparator template argument (see my remark before, maybe we don't): do we really need this as an external function in a C++ file? I think we want this inlined, no? src/hotspot/share/nmt/vmatree.hpp line 46: > 44: > 45: // Each node has some stack and a flag associated with it. > 46: struct Metadata { Should all members be const? src/hotspot/share/nmt/vmatree.hpp line 63: > 61: } > 62: }; > 63: In my original proposal, a `MappingState` (https://github.com/tstuefe/jdk/blob/6be830cd2e90a009effb016fbda2e92e1fca8247/src/hotspot/share/nmt/vmaTree.cpp#L106C7-L114) is the combination of ( MEMFLAGS + VMAstate ). A `MappingStateChange` https://github.com/tstuefe/jdk/blob/6be830cd2e90a009effb016fbda2e92e1fca8247/src/hotspot/share/nmt/vmaTree.cpp#L38-L65 is the combination of two `MappingState`. Its "is_noop()" method compares the combination of both ( MEMFLAGS + VMAstate ) tuples. This `State` here is neither. It contains only the incoming VMAState, and carries the full information only for the outgoing state. Its `is_noop` only compares VMAState. This renders `is_noop()` pretty useless, since on its own it is not sufficient when checking areas for mergeability. Consequently, you then tree-search the adjacent node whenever is_noop is used (e.g. line 161, `if (is_noop(stA) && Metadata::equals(stA.metadata, leqA_n->val().metadata)) {`). I would suggest following my proposal here and carrying the full information for both incoming and outgoing states. Advantages would be: - it removes the need for tree searches whenever we check whether neighboring areas can be folded.
- it would also make the code easier to read and arguably more robust - it is arguably more logical - semantically, there is no difference between VMAState and the rest of the state information, all of them are part of the total region state that determines if regions are mergeable. If you are worried about data size: all the information we need can still be encoded in a 64-bit value, as suggested above. And compared with a single cmpq. 16 bits for MEMFLAGS, 2 bits for VMAState, and 32 bits are more than enough as a callstack id (in fact, 16 bits should be enough). src/hotspot/share/nmt/vmatree.hpp line 312: > 310: } > 311: > 312: SummaryDiff commit_mapping(size_t from, size_t sz, Metadata& metadata) { So, the `Merge` template parameter only tries to solve the problem that callers don't specify MEMFLAGS on commit or uncommit, right? We can - rethink that; require all callers of commit/uncommit to provide MEMFLAGS too. I know I have been arguing against it, but I like the simpler implementation of NMT. And arguably, it closer reflects `mmap` realities, where reserving and committing don't have the same roles as in hotspot. - don't make merging a topic for the tree. If we want the MEMFLAGS from prior reservations to carry over, just search the tree prior to registering the mapping. Since we need a "tell me the region this pointer points to" functionality anyway, just let's reuse that one. - or, at least, revert to a simple bool or enum that describes merging behavior. Keep it simple and stupid. src/hotspot/share/nmt/vmatree.hpp line 326: > 324: // The releasing API also has no call stack, so we inherit the callstack also. > 325: merge_into.flag = existent.flag; > 326: merge_into.stack_idx = existent.stack_idx; Okay, here we don't have to merge at all, no? I don't see any reason to preserve meta info for a released area. Therefore, I would just register the area with mtNone and empty-callstack-id. 
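Editorial sketch of the two suggestions in the review above (small numerical callstack IDs, and the full region state in one 64-bit value). This is only an illustration under stated assumptions, not code from the PR: `CallStack`, `CallStackInterner`, `PackedState`, `StateChange` and the `VMAState` enumerators are hypothetical names, and the standard containers stand in for HotSpot's own (which would have to avoid `malloc` recursion, as noted in the review). IDs stay stable because the storage is append-only and the hash map holds indices rather than elements.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for NativeCallStack: four return addresses.
using CallStack = std::array<uintptr_t, 4>;

struct CallStackHash {
  size_t operator()(const CallStack& s) const {
    size_t h = 0;
    for (uintptr_t pc : s) h = h * 31 + static_cast<size_t>(pc);
    return h;
  }
};

// Append-only storage: an ID is the index into _stacks, so it never
// changes even when the vector or the hash map reallocates.
class CallStackInterner {
  std::vector<CallStack> _stacks;
  std::unordered_map<CallStack, uint32_t, CallStackHash> _ids;
 public:
  uint32_t intern(const CallStack& s) {
    auto it = _ids.find(s);
    if (it != _ids.end()) return it->second;  // already known: same ID
    uint32_t id = static_cast<uint32_t>(_stacks.size());
    _stacks.push_back(s);
    _ids.emplace(s, id);
    return id;
  }
  const CallStack& get(uint32_t id) const {
    assert(id < _stacks.size());
    return _stacks[id];
  }
};

// Hypothetical state names; the thread only says VMAState needs 2 bits.
enum class VMAState : uint8_t { Released = 0, Reserved = 1, Committed = 2 };

// Layout (an assumption): bits 0-15 flag, 16-17 state, 18-49 callstack ID.
struct PackedState {
  uint64_t bits;
  static PackedState make(uint16_t flag, VMAState st, uint32_t stack_id) {
    return PackedState{(static_cast<uint64_t>(stack_id) << 18) |
                       (static_cast<uint64_t>(st) << 16) | flag};
  }
  uint16_t flag() const { return static_cast<uint16_t>(bits & 0xFFFF); }
  VMAState state() const { return static_cast<VMAState>((bits >> 16) & 0x3); }
  uint32_t stack_id() const { return static_cast<uint32_t>(bits >> 18); }
  bool operator==(const PackedState& o) const { return bits == o.bits; }
};

// Incoming and outgoing state of a tree node; a no-op is one 64-bit compare.
struct StateChange {
  PackedState in;
  PackedState out;
  bool is_noop() const { return in == out; }
};
```

With both sides of a node carrying the full packed state, the mergeability check needs no look at the neighbouring node: two adjacent regions fold exactly when `is_noop()` holds.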
------------- PR Review: https://git.openjdk.org/jdk/pull/18289#pullrequestreview-1954770433 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1535682200 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1535683085 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1535565061 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1535559268 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1535560678 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1535566453 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1535567099 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1535579388 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1535644027 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1535650870 From mcimadamore at openjdk.org Fri Mar 22 15:12:25 2024 From: mcimadamore at openjdk.org (Maurizio Cimadamore) Date: Fri, 22 Mar 2024 15:12:25 GMT Subject: Integrated: 8328679: Improve comment for UNSAFE_ENTRY_SCOPED in unsafe.cpp In-Reply-To: References: Message-ID: On Thu, 21 Mar 2024 11:04:41 GMT, Maurizio Cimadamore wrote: > This small PR is an attempt at improving the comment in the UNSAFE_ENTRY_SCOPED macro. > I found that the comment contained some duplicated claims, did not properly introduced the concepts it referred to, and some of its claims were not explained in full. > > As this is a delicate part of how FFM ensures the "no use after free" guarantee, I think it would be a good idea for this comment to better reflect what the expectations are. This pull request has now been integrated. 
Changeset: 709410d8 Author: Maurizio Cimadamore URL: https://git.openjdk.org/jdk/commit/709410d8a4ca176c52d9c0abf80ed59eeb6bdaf3 Stats: 20 lines in 1 file changed: 3 ins; 2 del; 15 mod 8328679: Improve comment for UNSAFE_ENTRY_SCOPED in unsafe.cpp Reviewed-by: jvernee, dholmes ------------- PR: https://git.openjdk.org/jdk/pull/18429 From mli at openjdk.org Fri Mar 22 15:25:28 2024 From: mli at openjdk.org (Hamlin Li) Date: Fri, 22 Mar 2024 15:25:28 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4] In-Reply-To: References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> Message-ID: On Fri, 22 Mar 2024 10:29:36 GMT, Robbin Ehn wrote: > > > What is the relevance of SVE support at build time? Should it matter what the build machine is? > > > > > > My understanding is that we need a compiler that supports `-march=armv8-a+sve`; otherwise we can't build the redirect library properly. But maybe that is a misunderstanding? > > Yes, it should be in gcc 8 and above, and [we seem to require gcc 10](https://openjdk.org/groups/build/doc/building.html). 
For ACLE, it's supported in [10.1](https://gcc.gnu.org/gcc-10/changes.html) ------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-2015334896 From mli at openjdk.org Fri Mar 22 15:29:24 2024 From: mli at openjdk.org (Hamlin Li) Date: Fri, 22 Mar 2024 15:29:24 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4] In-Reply-To: <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> Message-ID: On Fri, 15 Mar 2024 13:58:05 GMT, Hamlin Li wrote: >> Hi, >> Can you help to review this patch? >> Thanks >> >> This is a continuation of work based on [1] by @XiaohongGong, most work was done in that pr. In this new pr, just rebased the code in [1], then added some minor changes (renaming of calling method, add libsleef as extra lib in CI cross-build on aarch64 in github workflow); I aslo tested the combination of following scenarios: >> * at build time >> * with/without sleef >> * with/without sve support >> * at runtime >> * with/without sleef >> * with/without sve support >> >> [1] https://github.com/openjdk/jdk/pull/16234 >> >> ## Regression Test >> * test/jdk/jdk/incubator/vector/ >> * test/hotspot/jtreg/compiler/vectorapi/ >> >> ## Performance Test >> Previously, @XiaohongGong has shared the data: https://github.com/openjdk/jdk/pull/16234#issuecomment-1767727028 > > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > fix jni includes > Some updates: I had a look at the source code for libsleef, and frankly, to integrate building of libsleef into the OpenJDK build is going to be a major PITA. 
:-( They have a cmake based build system, which generates a ton of native build tools, which preprocess source code and generate output artifacts (like sleef.h). > > I guess we _could_ do it, but I am terrified about trying to get that to work, especially for cross-compilation. > > So after seeing this, I think the better solution is to require a pre-compiled libsleef present in the system. (What I just called the "worse solution" an hour ago...). I guess we can make the requirement of libsleef the default value, but allow the user to override this to skip libsleef. At least then it is an active choice to disable libsleef for your build. > > Or, we could do something like how we used to do for freetype: have a --with-libsleef-src that points to a checked out version of libsleef, and let configure build it. In this case we could even build it as a static archive, so there'd be no need for libvectormath.so to have any external dependencies. > > In any case, we are going to have to consider carefully how to proceed. No worries, it seems to work for me; I have tested a very rough draft locally. We just need to get the final inline versions of sleef that are generated by the sleef cmake install. A question about licensing: how long does it take to complete the legal process for the sleef licence?
------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-2015341870 From mli at openjdk.org Fri Mar 22 15:33:26 2024 From: mli at openjdk.org (Hamlin Li) Date: Fri, 22 Mar 2024 15:33:26 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4] In-Reply-To: References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> <1w4y8kdPpONt2iyrqFdjmpDcqwVBoQfiOgK6J5cmFV4=.9cecfe26-6586-4a53-926c-4203b52260d8@github.com> <9H3PBcqWAZevaB5KU1_Lh5ptj1f365-iLiIyaBMnNyE=.ce6f80fb-4fe8-4cf4-b58e-456238637b1f@github.com> Message-ID: <6hpEaT7qf_Mo8eyeDiu6_UvBy0J4TBkdB2aETVt0Axw=.4e2dbd36-d14a-4e53-b6ea-88e60594a139@github.com> On Fri, 22 Mar 2024 12:42:04 GMT, Magnus Ihse Bursie wrote: > > Ah, it'll only be the redirect library that's compiled with -march=armv8-a+sve Forget that. > > But that raises an interesting question. What happens if you try to load a library compiled with `-march=armv8-a+sve` on a non-SVE system? Is the ELF flagged to require SVE so it will fail to load? I'm hoping this is the case -- if so, everything will work as it should in this PR, but I honestly don't know. (After spending like 10 years working with building, I still discover things I need to learn...). I think we can handle it: when a JDK built with SVE support runs on a non-SVE machine, the SVE-related code will not be executed, because it is guarded by the `UseSVE` VM flag, which is detected at VM startup.
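Editorial sketch of the kind of runtime guard described in the reply above. It is a minimal illustration under loud assumptions: the `UseSVE` variable here is a plain global standing in for the VM flag, and `initialize_sve`, `square_scalar`, `square_sve` and `select_square` are hypothetical names, not HotSpot or SLEEF functions. The "SVE" routine is ordinary C++ so the sketch runs on any machine; it says nothing about whether a library built with `-march=armv8-a+sve` loads on a non-SVE system, which is the open question in the quoted text.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the VM's UseSVE flag; 0 means "no SVE".
static int UseSVE = 0;

// Hypothetical startup hook. A real VM would probe the CPU or OS here;
// in this sketch the caller states what the machine reports.
static void initialize_sve(bool hardware_has_sve) {
  UseSVE = hardware_has_sve ? 1 : 0;
}

// Plain C++ fallback: element-wise square.
static void square_scalar(const double* in, double* out, size_t n) {
  for (size_t i = 0; i < n; i++) out[i] = in[i] * in[i];
}

// Placeholder for an entry point that would live in a library compiled
// with SVE enabled. It must only be reached when UseSVE is set.
static void square_sve(const double* in, double* out, size_t n) {
  assert(UseSVE > 0 && "SVE entry point reached on a non-SVE machine");
  square_scalar(in, out, n);
}

using VectorFn = void (*)(const double*, double*, size_t);

// The guard: SVE code is selected only if the startup check enabled it.
static VectorFn select_square() {
  return UseSVE > 0 ? square_sve : square_scalar;
}
```

The point of the design is that the decision is taken once, from a flag set at startup, so no SVE instruction is executed on hardware that lacks the extension.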
------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-2015348458 From mli at openjdk.org Fri Mar 22 15:36:24 2024 From: mli at openjdk.org (Hamlin Li) Date: Fri, 22 Mar 2024 15:36:24 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4] In-Reply-To: References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> Message-ID: <5izByUuenndxyvTBY3qqTHQdzNYbpedF1Z5tAmX7ub8=.bc0a146c-0271-4ef0-9108-77d5f6105d14@github.com> On Fri, 22 Mar 2024 10:29:36 GMT, Robbin Ehn wrote: >>> What is the relevance of SVE support at build time? Should it matter what the build machine is? >> >> My understanding is that we need a compiler that supports `-march=armv8-a+sve`; otherwise we can't build the redirect library properly. But maybe that is a misunderstanding? > >> > What is the relevance of SVE support at build time? Should it matter what the build machine is? >> >> My understanding is that we need a compiler that supports `-march=armv8-a+sve`; otherwise we can't build the redirect library properly. But maybe that is a misunderstanding? > > Yes, it should be in gcc 8 and above, and [we seem to require gcc 10](https://openjdk.org/groups/build/doc/building.html). > @robehn Great! I guess that option goes hand-in-hand with arm_sve.h? If so, we can remove the complication of the SVE check completely. If we all agree to optimistically integrate the current pr (needs both build time + run time sleef support) first, I can check related things, and do necessary modification/improvement. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-2015353690 From sjayagond at openjdk.org Fri Mar 22 15:40:25 2024 From: sjayagond at openjdk.org (Sidraya Jayagond) Date: Fri, 22 Mar 2024 15:40:25 GMT Subject: RFR: 8327652: S390x: Implements SLP support [v5] In-Reply-To: References: <9csEuN4OqSaYAoZir8OCvudRKhirezZ9qxNlgXs7tyU=.83707471-1d5e-4a1a-aaf9-cb62169e9c2d@github.com> Message-ID: On Fri, 22 Mar 2024 10:36:52 GMT, Martin Doerr wrote: >> Sidraya Jayagond has updated the pull request incrementally with one additional commit since the last revision: >> >> Address code cleanup review comments > > LGTM. Please make sure to have substantial test coverage with AND without SuperwordUseVX enabled. The assembler instructions and their usage need to be reviewed by somebody else. @TheRealMDoerr I have tested with and without **SuperwordUseVX** enabled, assembler instructions review is being done by @offamitkumar. Thanks. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18162#issuecomment-2015362677 From jsjolen at openjdk.org Fri Mar 22 16:25:24 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Fri, 22 Mar 2024 16:25:24 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v5] In-Reply-To: <3al4DjsRcIX_qJZNbTGqBDIAOj4bU5l8xpYPHQE8cNM=.7cc0bdfe-c9c8-46ce-ad42-397c61b5a603@github.com> References: <-XAziSwGMo20pUAnbdRW1JUk_0ZB-80RVfAHr0iuewE=.bff8f2f7-01e2-46eb-bd4b-1b16fccc6aa1@github.com> <3al4DjsRcIX_qJZNbTGqBDIAOj4bU5l8xpYPHQE8cNM=.7cc0bdfe-c9c8-46ce-ad42-397c61b5a603@github.com> Message-ID: <3Qf4VdwSgFqSREMiqGs0bCQhjjrF8OXe4v-QyxMiUI0=.779cb7cd-6cc6-4adf-8b32-431e95a4e96a@github.com> On Fri, 22 Mar 2024 14:29:31 GMT, Thomas Stuefe wrote: >> Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: >> >> Include os.inline.hpp > > src/hotspot/share/nmt/nmtNativeCallStackStorage.hpp line 84: > >> 82: return *_stack; >> 83: } >> 84: }; > > 
What is the point of the StackIndex class? The `StackIndex` used to be a 4-byte encoding that could only be dereferenced through the `NativeCallStackStorage`. This method should at the very least be deleted and replaced with `NativeCallStackStorage::get`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1535848809 From jsjolen at openjdk.org Fri Mar 22 16:28:25 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Fri, 22 Mar 2024 16:28:25 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v5] In-Reply-To: <3al4DjsRcIX_qJZNbTGqBDIAOj4bU5l8xpYPHQE8cNM=.7cc0bdfe-c9c8-46ce-ad42-397c61b5a603@github.com> References: <-XAziSwGMo20pUAnbdRW1JUk_0ZB-80RVfAHr0iuewE=.bff8f2f7-01e2-46eb-bd4b-1b16fccc6aa1@github.com> <3al4DjsRcIX_qJZNbTGqBDIAOj4bU5l8xpYPHQE8cNM=.7cc0bdfe-c9c8-46ce-ad42-397c61b5a603@github.com> Message-ID: <0G_oRg-MB6aRKXpHJ4ca8lIQ72ZhsA2WBujtJ8BQaD0=.bbbc53c2-cb49-4051-998e-e9e48e4ea516@github.com> On Fri, 22 Mar 2024 13:23:22 GMT, Thomas Stuefe wrote: >> Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: >> >> Include os.inline.hpp > > src/hotspot/share/nmt/nmtTreap.hpp line 45: > >> 43: // the tree which is O(log n) so we are safe from stack overflow. >> 44: >> 45: // TreapNode has LEQ nodes on the left, GT nodes on the right. > > Simplification: why do we need a separate template parameter for the comparison? Why not just use K's operators < > == ? I tend not to use operator overloading, so I didn't think of that as a possibility.
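Editorial sketch of the two options discussed in this exchange: an explicit comparator template parameter versus relying on the key type's own operators. The names `OperatorCompare`, `AddressCompare` and `find_leq` are hypothetical and the sorted array stands in for the treap; the sketch only shows that a defaulted comparator lets both styles coexist, under the assumption that a three-way `cmp` is the comparator's interface.

```cpp
#include <cassert>
#include <cstddef>

// Default comparator derived from K's own operator<; used when the
// caller does not supply one.
template <typename K>
struct OperatorCompare {
  static int cmp(const K& a, const K& b) {
    if (a < b) return -1;
    if (b < a) return 1;
    return 0;
  }
};

// Explicit comparator, for callers who prefer not to rely on operators.
struct AddressCompare {
  static int cmp(size_t a, size_t b) {
    if (a < b) return -1;
    if (a > b) return 1;
    return 0;
  }
};

// Returns the index of the last element that is <= key in a sorted
// array, or -1 if every element is greater than key.
template <typename K, typename Cmp = OperatorCompare<K>>
int find_leq(const K* sorted, int n, const K& key) {
  int result = -1;
  int lo = 0;
  int hi = n - 1;
  while (lo <= hi) {
    int mid = lo + (hi - lo) / 2;
    if (Cmp::cmp(sorted[mid], key) <= 0) {
      result = mid;   // candidate; look for a later one
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return result;
}
```

A call such as `find_leq(addrs, n, key)` uses the key's operators, while `find_leq<size_t, AddressCompare>(addrs, n, key)` keeps the comparison explicit.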
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1535855082 From jsjolen at openjdk.org Fri Mar 22 16:32:29 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Fri, 22 Mar 2024 16:32:29 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v5] In-Reply-To: <3al4DjsRcIX_qJZNbTGqBDIAOj4bU5l8xpYPHQE8cNM=.7cc0bdfe-c9c8-46ce-ad42-397c61b5a603@github.com> References: <-XAziSwGMo20pUAnbdRW1JUk_0ZB-80RVfAHr0iuewE=.bff8f2f7-01e2-46eb-bd4b-1b16fccc6aa1@github.com> <3al4DjsRcIX_qJZNbTGqBDIAOj4bU5l8xpYPHQE8cNM=.7cc0bdfe-c9c8-46ce-ad42-397c61b5a603@github.com> Message-ID: On Fri, 22 Mar 2024 13:34:34 GMT, Thomas Stuefe wrote: >> Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: >> >> Include os.inline.hpp > > src/hotspot/share/nmt/vmatree.hpp line 63: > >> 61: } >> 62: }; >> 63: > > In my original proposal, a `MappingState` (https://github.com/tstuefe/jdk/blob/6be830cd2e90a009effb016fbda2e92e1fca8247/src/hotspot/share/nmt/vmaTree.cpp#L106C7-L114) is the combination of ( MEMFLAGS + VMAstate ). > > A `MappingStateChange` https://github.com/tstuefe/jdk/blob/6be830cd2e90a009effb016fbda2e92e1fca8247/src/hotspot/share/nmt/vmaTree.cpp#L38-L65 is the combination of two `MappingState`. Its "is_noop()" method compares the combination of both ( MEMFLAGS + VMAstate ) tuples. > > This `State` here is neither. It contains only the incoming VMAState, and carries the full information only for the outgoing state. Its ` is_noop` only compares VMAState. This renders `is_noop()` pretty useless since it alone is not sufficient when checking if areas for mergeability. Consequently, you then tree-search the adjacent node whenever is_noop is used ( e.g. line 161, `if (is_noop(stA) && Metadata::equals(stA.metadata, leqA_n->val().metadata)) { ` . > > I would suggest following my proposal here and carrying the full information for both incoming and outgoing states. 
Advantages would be: > - it removes the need for tree searches whenever we check whether neighboring areas can be folded. > - it would also make the code easier to read and arguably more robust > - it is arguably more logical - semantically, there is no difference between VMAState and the rest of the state information, all of them are part of the total region state that determines if regions are mergeable. > > If you are worried about data size: all the information we need can still be encoded in a 64-bit value, as suggested above. And compared with a single cmpq. 16 bits for MEMFLAGS, 2 bits for VMAState, and 32 bits are more than enough as a callstack id (in fact, 16 bits should be enough). Hi, just to be clear: We don't perform a tree search whenever `is_noop` is used, the `leqA_n` is also present in your draft. Still, I think you might be right that having the full state for both in and out is the best way to go, I'll try it out as a simplification. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1535860353 From jsjolen at openjdk.org Fri Mar 22 16:35:25 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Fri, 22 Mar 2024 16:35:25 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v5] In-Reply-To: <3al4DjsRcIX_qJZNbTGqBDIAOj4bU5l8xpYPHQE8cNM=.7cc0bdfe-c9c8-46ce-ad42-397c61b5a603@github.com> References: <-XAziSwGMo20pUAnbdRW1JUk_0ZB-80RVfAHr0iuewE=.bff8f2f7-01e2-46eb-bd4b-1b16fccc6aa1@github.com> <3al4DjsRcIX_qJZNbTGqBDIAOj4bU5l8xpYPHQE8cNM=.7cc0bdfe-c9c8-46ce-ad42-397c61b5a603@github.com> Message-ID: <0idUFcJmBBqCsL1MSYHYcCtD3Z9gicYXbNuASLBEx_k=.a18cfb80-75c0-4c8e-8a16-00d7f24c6e1d@github.com> On Fri, 22 Mar 2024 14:11:55 GMT, Thomas Stuefe wrote: >> Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: >> >> Include os.inline.hpp > > src/hotspot/share/nmt/vmatree.hpp line 326: > >> 324: // The releasing API also has no call 
stack, so we inherit the callstack also. >> 325: merge_into.flag = existent.flag; >> 326: merge_into.stack_idx = existent.stack_idx; > > Okay, here we don't have to merge at all, no? I don't see any reason to preserve meta info for a released area. > > Therefore, I would just register the area with mtNone and empty-callstack-id. There really was a bug associated with not doing this which caused the invariants of in/out nodes to break, there's a test for it :-). It has to do with the `is_noop() && equals` check. Let's see how this changes as I try out following your idea a bit more. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1535864052 From jsjolen at openjdk.org Fri Mar 22 16:40:28 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Fri, 22 Mar 2024 16:40:28 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v5] In-Reply-To: <3al4DjsRcIX_qJZNbTGqBDIAOj4bU5l8xpYPHQE8cNM=.7cc0bdfe-c9c8-46ce-ad42-397c61b5a603@github.com> References: <-XAziSwGMo20pUAnbdRW1JUk_0ZB-80RVfAHr0iuewE=.bff8f2f7-01e2-46eb-bd4b-1b16fccc6aa1@github.com> <3al4DjsRcIX_qJZNbTGqBDIAOj4bU5l8xpYPHQE8cNM=.7cc0bdfe-c9c8-46ce-ad42-397c61b5a603@github.com> Message-ID: <7u3imUh6-qb_wLdyZ4mn5SfnEOkxyFEQ20O0fb6WJj0=.3179edcb-0340-4d50-a674-c18128cc2e2f@github.com> On Fri, 22 Mar 2024 14:28:51 GMT, Thomas Stuefe wrote: >> Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: >> >> Include os.inline.hpp > > src/hotspot/share/nmt/nmtNativeCallStackStorage.hpp line 30: > >> 28: #include "utilities/growableArray.hpp" >> 29: #include "utilities/nativeCallStack.hpp" >> 30: > > Here, I would love it if we had smaller than pointer-sized callstack IDs. > > The simplest way would be to copy callstacks into a growable area, and use their numerical indices as ID. We can get by with 16 bits, or 32 bits if we think 64K callstacks are not enough (they should be enough). 
> > Those can then be combined very efficiently with MEMFLAG and VMAState in the VMATree nodes. > > The hashmap for reverse id lookup could be a standard hashmap of key=callstack, value=ID. > > As a future possible improvement, that could then replace the MallocSiteTable which does almost the same thing. (only have to avoid using malloc then, because of recursivities). I still don't get how a growable array is supposed to work for open addressing while keeping the indices immutable :-). How does this work:

```c++
HTable ht; // Open-addressed GrowableArray with linear probing
Index oldidx = ht.put(4);
// A bunch of puts, leading to a resize of the array
Index newidx = ht.put(4);
assert(oldidx == newidx, "how is this ensured?");
```

How does this work? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1535870002 From kvn at openjdk.org Fri Mar 22 17:10:26 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 22 Mar 2024 17:10:26 GMT Subject: RFR: JDK-8316991: Reduce nullable allocation merges [v8] In-Reply-To: <_DCkU0ejXoVIyAy1q_anbEoXkSWtMHuczrTGitIi1X8=.3c39f56e-d4bf-4673-a325-c6ac02620e79@github.com> References: <_DCkU0ejXoVIyAy1q_anbEoXkSWtMHuczrTGitIi1X8=.3c39f56e-d4bf-4673-a325-c6ac02620e79@github.com> Message-ID: On Tue, 19 Mar 2024 23:18:55 GMT, Cesar Soares Lucas wrote: >> ### Description >> >> Many, if not most, allocation merges (Phis) are nullable because they join object allocations with "NULL", or objects returned from method calls, etc. Please review this Pull Request that improves Reduce Allocation Merge implementation so that it can reduce at least some of these allocation merges. >> >> Overall, the improvements are related to 1) making rematerialization of merges able to represent "NULL" objects, and 2) being able to reduce merges used by CmpP/N and CastPP.
>>
>> The approach to reducing CmpP/N and CastPP is pretty similar to that used in the `MemNode::split_through_phi` method: a clone of the node being split is added on each input of the Phi. I make use of `optimize_ptr_compare` and some type information to remove redundant CmpP and CastPP nodes. I added a bunch of ASCII diagrams illustrating what some of the more important methods are doing.
>>
>> ### Benchmarking
>>
>> **Note:** In some of these tests no reduction happens. I left them in to validate that no perf. regression happens in that case.
>> **Note 2:** Margin of error was negligible.
>>
>> | Benchmark | No RAM (ms/op) | Yes RAM (ms/op) |
>> |--------------------------------------|------------------|-------------------|
>> | TestTrapAfterMerge | 19.515 | 13.386 |
>> | TestArgEscape | 33.165 | 33.254 |
>> | TestCallTwoSide | 70.547 | 69.427 |
>> | TestCmpAfterMerge | 16.400 | 2.984 |
>> | TestCmpMergeWithNull_Second | 27.204 | 27.293 |
>> | TestCmpMergeWithNull | 8.248 | 4.920 |
>> | TestCondAfterMergeWithAllocate | 12.890 | 5.252 |
>> | TestCondAfterMergeWithNull | 6.265 | 5.078 |
>> | TestCondLoadAfterMerge | 12.713 | 5.163 |
>> | TestConsecutiveSimpleMerge | 30.863 | 4.068 |
>> | TestDoubleIfElseMerge | 16.069 | 2.444 |
>> | TestEscapeInCallAfterMerge | 23.111 | 22.924 |
>> | TestGlobalEscape | 14.459 | 14.425 |
>> | TestIfElseInLoop | 246.061 | 42.786 |
>> | TestLoadAfterLoopAlias | 45.808 | 45.812 |
>> ...
>
> Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 12 commits:
>
> - Catching up with master
> - Fix broken build.
> - Merge with origin/master
> - Update test/micro/org/openjdk/bench/vm/compiler/AllocationMerges.java
>
> Co-authored-by: Andrey Turbanov
> - Ammend previous fix & add repro tests.
> - Fix to prevent reducing already reduced Phi
> - Fix to prevent creating NULL ConNKlass constants.
> - Refrain from RAM of arrays and Phis controlled by Loop nodes.
> - Fix typo in test. > - Fix build after merge. > - ... and 2 more: https://git.openjdk.org/jdk/compare/7231fd78...620f9da8 My tier1-7 testing passed. You need second review. ------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/15825#pullrequestreview-1955334850 From cslucas at openjdk.org Fri Mar 22 17:46:27 2024 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Fri, 22 Mar 2024 17:46:27 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission [v6] In-Reply-To: <8ZkfLag-QKsU8stjY8amUxO_0-yTr8SlXOSh6nhQ1PM=.6efb1eef-b8a9-437a-9a1f-f5aa74009f0f@github.com> References: <8ZkfLag-QKsU8stjY8amUxO_0-yTr8SlXOSh6nhQ1PM=.6efb1eef-b8a9-437a-9a1f-f5aa74009f0f@github.com> Message-ID: On Wed, 20 Dec 2023 20:01:43 GMT, Vladimir Kozlov wrote: >> Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits: >> >> - Merge with origin/master >> - Fix build, copyright dates, m4 files. >> - Fix merge >> - Catch up with master branch. >> >> Merge remote-tracking branch 'origin/master' into reuse-macroasm >> - Some inst_mark fixes; Catch up with master. >> - Catch up with changes on master >> - Reuse same C2_MacroAssembler object to emit instructions. > > src/hotspot/cpu/x86/assembler_x86.cpp line 4248: > >> 4246: void Assembler::vpermb(XMMRegister dst, XMMRegister nds, XMMRegister src, int vector_len) { >> 4247: assert(VM_Version::supports_avx512_vbmi(), ""); >> 4248: InstructionMark im(this); > > May be add short comment why you need `InstructionMark` in these instructions but not in others. I expanded the comment in declaration of InstructionMark class with more information. I'm pushing the changes in a few minutes. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16484#discussion_r1535969395 From aph at openjdk.org Fri Mar 22 17:58:24 2024 From: aph at openjdk.org (Andrew Haley) Date: Fri, 22 Mar 2024 17:58:24 GMT Subject: RFR: JDK-8180450: secondary_super_cache does not scale well [v5] In-Reply-To: References: Message-ID: On Thu, 21 Mar 2024 16:55:25 GMT, Chris Plummer wrote: >> Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: >> >> JDK-8180450: secondary_super_cache does not scale well > > src/hotspot/share/runtime/globals.hpp line 2005: > >> 2003: \ >> 2004: product(bool, VerifySecondarySupers, false, DIAGNOSTIC, \ >> 2005: "Use secondary super cache during subtype checks") \ > > Did you mean "Verify secondary super cache..."? No, this option checks that the new (hashed) algorithm matches the old one. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18309#discussion_r1535986568 From kvn at openjdk.org Fri Mar 22 18:01:35 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 22 Mar 2024 18:01:35 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission [v9] In-Reply-To: References: Message-ID: On Fri, 22 Mar 2024 00:35:28 GMT, Cesar Soares Lucas wrote: >> # Description >> >> Please review this PR with a patch to re-use the same C2_MacroAssembler object to emit all instructions in the same compilation unit. >> >> Overall, the change is pretty simple. However, due to the renaming of the variable to access C2_MacroAssembler, from `_masm.` to `masm->`, and also some method prototype changes, the patch became quite large. >> >> # Help Needed for Testing >> >> I don't have access to all platforms necessary to test this. I hope some other folks can help with testing on `S390`, `RISC-V` and `PPC`. >> >> # Tier-1 Testing status >> >> | | Win | Mac | Linux | >> |----------|---------|---------|---------| >> | ARM64 | ? | ? 
| | >> | ARM32 | n/a | n/a | | >> | x86 | | | ? | >> | x64 | ? | ? | ? | >> | PPC64 | n/a | n/a | | >> | S390x | n/a | n/a | | >> | RiscV | n/a | n/a | ? | > > Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 10 commits: > > - Catching up with changes in master > - Catching up with origin/master > - Catch up with origin/master > - Merge with origin/master > - Fix build, copyright dates, m4 files. > - Fix merge > - Catch up with master branch. > > Merge remote-tracking branch 'origin/master' into reuse-macroasm > - Some inst_mark fixes; Catch up with master. > - Catch up with changes on master > - Reuse same C2_MacroAssembler object to emit instructions. GHA shows issue with Aarch64 builds on MacOS and Windows: d:\a\jdk\jdk\build\windows-aarch64\hotspot\variant-server\gensrc\adfiles\ad_aarch64.cpp(24257): error C2653: 'CompiledStaticCall': is not a class or namespace name d:\a\jdk\jdk\build\windows-aarch64\hotspot\variant-server\gensrc\adfiles\ad_aarch64.cpp(24257): error C3861: 'emit_to_interp_stub': identifier not found ------------- PR Comment: https://git.openjdk.org/jdk/pull/16484#issuecomment-2015632011 From kvn at openjdk.org Fri Mar 22 18:19:27 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 22 Mar 2024 18:19:27 GMT Subject: RFR: JDK-8180450: secondary_super_cache does not scale well [v5] In-Reply-To: References: Message-ID: <15jj9hbfg10JDjVjqRdIEVUew7w-bJlUmFU514oS658=.7502b8b3-bc18-4eda-a4a0-7a328d8dc504@github.com> On Thu, 21 Mar 2024 14:42:55 GMT, Andrew Haley wrote: >> This PR is a redesign of subtype checking. >> >> The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. >> >> So what's changed, so that the old design should be replaced? >> >> Firstly, the computers of today aren't the computers of twenty years ago. 
It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider.
>>
>> The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces). Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown.
>>
>> Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch.
>>
>> However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions.
>>
>> The solution
>> ------------
>>
>> We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered.
>>
>> We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bitmap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present.
If the bit is set, we have to do a little arithmetic and then consult the hash table. >> >> It works like th... > > Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: > > JDK-8180450: secondary_super_cache does not scale well src/hotspot/share/opto/memnode.cpp line 2451: > 2449: } > 2450: > 2451: if (tkls != nullptr && !UseSecondarySuperCache May be add comment what are you doing here. Is it switching off **current** implementation of secondary super cache optimization or both: current and new? src/hotspot/share/runtime/globals.hpp line 1999: > 1997: \ > 1998: product(bool, UseSecondarySuperCache, true, DIAGNOSTIC, \ > 1999: "Use secondary super cache during subtype checks") \ Is it guard for current implementation or both? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18309#discussion_r1535998973 PR Review Comment: https://git.openjdk.org/jdk/pull/18309#discussion_r1535999811 From kvn at openjdk.org Fri Mar 22 18:19:28 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 22 Mar 2024 18:19:28 GMT Subject: RFR: JDK-8180450: secondary_super_cache does not scale well [v5] In-Reply-To: References: Message-ID: On Fri, 22 Mar 2024 17:55:57 GMT, Andrew Haley wrote: >> src/hotspot/share/runtime/globals.hpp line 2005: >> >>> 2003: \ >>> 2004: product(bool, VerifySecondarySupers, false, DIAGNOSTIC, \ >>> 2005: "Use secondary super cache during subtype checks") \ >> >> Did you mean "Verify secondary super cache..."? > > No, this option checks that the new (hashed) algorithm matches the old one. So what is difference between `UseSecondarySuperCache` and `VerifySecondarySupers`? The description is the same and only default values are different. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18309#discussion_r1535994212 From cslucas at openjdk.org Fri Mar 22 18:19:39 2024 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Fri, 22 Mar 2024 18:19:39 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission [v10] In-Reply-To: References: Message-ID: > # Description > > Please review this PR with a patch to re-use the same C2_MacroAssembler object to emit all instructions in the same compilation unit. > > Overall, the change is pretty simple. However, due to the renaming of the variable to access C2_MacroAssembler, from `_masm.` to `masm->`, and also some method prototype changes, the patch became quite large. > > # Help Needed for Testing > > I don't have access to all platforms necessary to test this. I hope some other folks can help with testing on `S390`, `RISC-V` and `PPC`. Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: Fix AArch64 build & improve comment about InstructionMark ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16484/files - new: https://git.openjdk.org/jdk/pull/16484/files/aa80c801..e763b089 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16484&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16484&range=08-09 Stats: 4 lines in 3 files changed: 1 ins; 1 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/16484.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16484/head:pull/16484 PR: https://git.openjdk.org/jdk/pull/16484 From cslucas at openjdk.org Fri Mar 22 18:19:42 2024 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Fri, 22 Mar 2024 18:19:42 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission [v9] In-Reply-To: References: Message-ID: On Fri, 22 Mar 2024 17:58:36 GMT, Vladimir Kozlov wrote: >> Cesar Soares Lucas has updated the pull request with a new target base due 
to a merge or a rebase. The pull request now contains 10 commits: >> >> - Catching up with changes in master >> - Catching up with origin/master >> - Catch up with origin/master >> - Merge with origin/master >> - Fix build, copyright dates, m4 files. >> - Fix merge >> - Catch up with master branch. >> >> Merge remote-tracking branch 'origin/master' into reuse-macroasm >> - Some inst_mark fixes; Catch up with master. >> - Catch up with changes on master >> - Reuse same C2_MacroAssembler object to emit instructions. > > GHA shows issue with Aarch64 builds on MacOS and Windows: > > d:\a\jdk\jdk\build\windows-aarch64\hotspot\variant-server\gensrc\adfiles\ad_aarch64.cpp(24257): error C2653: 'CompiledStaticCall': is not a class or namespace name > d:\a\jdk\jdk\build\windows-aarch64\hotspot\variant-server\gensrc\adfiles\ad_aarch64.cpp(24257): error C3861: 'emit_to_interp_stub': identifier not found @vnkozlov - I was working on that. It was because of a typo. Now it's fixed. Can I please get more reviews on this? Pinging port maintainer: /cc @theRealAph @nick-arm @bulasevich @snazarkin @offamitkumar @RealLucy ------------- PR Comment: https://git.openjdk.org/jdk/pull/16484#issuecomment-2015657375 From kvn at openjdk.org Fri Mar 22 18:23:23 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 22 Mar 2024 18:23:23 GMT Subject: RFR: JDK-8180450: secondary_super_cache does not scale well [v5] In-Reply-To: References: Message-ID: On Thu, 21 Mar 2024 14:42:55 GMT, Andrew Haley wrote: >> This PR is a redesign of subtype checking. >> >> The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. >> >> So what's changed, so that the old design should be replaced? >> >> Firstly, the computers of today aren't the computers of twenty years ago. 
It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider.
>>
>> The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces). Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown.
>>
>> Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch.
>>
>> However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions.
>>
>> The solution
>> ------------
>>
>> We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered.
>>
>> We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bitmap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present.
If the bit is set, we have to do a little arithmetic and then consult the hash table. >> >> It works like th... > > Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: > > JDK-8180450: secondary_super_cache does not scale well > I would have liked to save the hashed interfaces list in the CDS archive file, but there are test cases for the Serviceability Agent that assume the interfaces are in the order in which they are declared. I think the tests are probably just wrong about this. For the same reason, I have to make copies of the interfaces array and sort the copies, rather than just using the same array at runtime. May be file follow up RFE for saving hashed interfaces list in CDS which will also fix those tests. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18309#issuecomment-2015666795 From aph at openjdk.org Fri Mar 22 18:39:32 2024 From: aph at openjdk.org (Andrew Haley) Date: Fri, 22 Mar 2024 18:39:32 GMT Subject: RFR: JDK-8180450: secondary_super_cache does not scale well [v5] In-Reply-To: References: Message-ID: On Fri, 22 Mar 2024 18:20:22 GMT, Vladimir Kozlov wrote: > > I would have liked to save the hashed interfaces list in the CDS archive file, but there are test cases for the Serviceability Agent that assume the interfaces are in the order in which they are declared. I think the tests are probably just wrong about this. For the same reason, I have to make copies of the interfaces array and sort the copies, rather than just using the same array at runtime. > > May be file follow up RFE for saving hashed interfaces list in CDS which will also fix those tests. That sounds good. > src/hotspot/share/opto/memnode.cpp line 2451: > >> 2449: } >> 2450: >> 2451: if (tkls != nullptr && !UseSecondarySuperCache > > May be add comment what are you doing here. > Is it switching off **current** implementation of secondary super cache optimization or both: current and new? 
All it does is switch off the **current** implementation, and only in C2.

> src/hotspot/share/runtime/globals.hpp line 1999:
>
>> 1997:                                                                   \
>> 1998:   product(bool, UseSecondarySuperCache, true, DIAGNOSTIC,         \
>> 1999:           "Use secondary super cache during subtype checks")      \
>
> Is it guard for current implementation or both?

Just the current.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18309#issuecomment-2015693658
PR Review Comment: https://git.openjdk.org/jdk/pull/18309#discussion_r1536031094
PR Review Comment: https://git.openjdk.org/jdk/pull/18309#discussion_r1536031476

From aph at openjdk.org Fri Mar 22 18:39:32 2024
From: aph at openjdk.org (Andrew Haley)
Date: Fri, 22 Mar 2024 18:39:32 GMT
Subject: RFR: JDK-8180450: secondary_super_cache does not scale well [v5]
In-Reply-To:
References:
Message-ID:

On Fri, 22 Mar 2024 18:04:06 GMT, Vladimir Kozlov wrote:

>> No, this option checks that the new (hashed) algorithm matches the old one.
>
> So what is difference between `UseSecondarySuperCache` and `VerifySecondarySupers`?
> The description is the same and only default values are different.

Sorry, I'll fix it.

`UseSecondarySuperCache` refers to the "old" algorithm, and the only purpose of this flag is to allow the effect of disabling the secondary supers cache to be measured.

`VerifySecondarySupers` allows test suites to be run in a mode that uses both the new hashed lookup and an exhaustive search in order to test hashed lookups. It's very useful for testing ports.
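
For readers following the flag discussion, the kind of cross-check that `VerifySecondarySupers` performs can be pictured with a small C++ sketch of the hashed lookup from the PR description: a 64-slot table with linear probing, stored compressed, plus an occupancy bitmap. This is only an illustration under stated assumptions, not HotSpot's code: `TypeId`, `home_slot`, `SecondarySupers`, `build`, `hashed_lookup`, `linear_lookup` and `verified_lookup` are hypothetical names, the hash is a trivial `id & 63`, and at most 64 distinct supers are assumed.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a Klass*: just an integer id.
using TypeId = std::uint32_t;

// Hypothetical hash: the home slot is the low 6 bits of the id.
inline unsigned home_slot(TypeId id) { return id & 63u; }

// Portable population count (number of set bits).
inline unsigned popcount64(std::uint64_t x) {
  unsigned n = 0;
  for (; x != 0; x &= x - 1) ++n;  // each step clears the lowest set bit
  return n;
}

struct SecondarySupers {
  std::uint64_t bitmap = 0;    // bit i set => slot i of the 64-slot table is occupied
  std::vector<TypeId> packed;  // occupied slots only, in slot order
};

// Insert with linear probing into a 64-slot table, then drop the empty slots.
// Assumes at most 64 distinct ids.
SecondarySupers build(const std::vector<TypeId>& supers) {
  assert(supers.size() <= 64);
  std::array<TypeId, 64> table{};
  SecondarySupers s;
  for (TypeId id : supers) {
    unsigned slot = home_slot(id);
    while (s.bitmap & (1ULL << slot)) slot = (slot + 1) & 63u;  // probe next slot
    table[slot] = id;
    s.bitmap |= 1ULL << slot;
  }
  for (unsigned slot = 0; slot < 64; ++slot) {
    if (s.bitmap & (1ULL << slot)) s.packed.push_back(table[slot]);
  }
  return s;
}

// Hashed lookup: a clear bit answers "absent" without touching the array.
bool hashed_lookup(const SecondarySupers& s, TypeId id) {
  unsigned slot = home_slot(id);
  for (unsigned probes = 0; probes < 64; ++probes) {
    const std::uint64_t bit = 1ULL << slot;
    if ((s.bitmap & bit) == 0) return false;
    // Index in the packed array = number of occupied slots below this one.
    const unsigned index = popcount64(s.bitmap & (bit - 1));
    if (s.packed[index] == id) return true;
    slot = (slot + 1) & 63u;  // collision: keep probing
  }
  return false;  // table is full and the id is not in it
}

// Exhaustive search over the same array, as older code would do.
bool linear_lookup(const SecondarySupers& s, TypeId id) {
  return std::find(s.packed.begin(), s.packed.end(), id) != s.packed.end();
}

// Verification mode: run both and insist that they agree.
bool verified_lookup(const SecondarySupers& s, TypeId id) {
  const bool hashed = hashed_lookup(s, id);
  assert(hashed == linear_lookup(s, id));
  return hashed;
}
```

In this sketch the packed array is still a plain list of ids, which is why a linear scan over it keeps working unchanged; the real implementation may differ in its hash function and probing details.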
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18309#discussion_r1536029900 From aph at openjdk.org Fri Mar 22 18:45:29 2024 From: aph at openjdk.org (Andrew Haley) Date: Fri, 22 Mar 2024 18:45:29 GMT Subject: RFR: JDK-8180450: secondary_super_cache does not scale well [v5] In-Reply-To: References: Message-ID: On Fri, 22 Mar 2024 18:36:41 GMT, Andrew Haley wrote: >>> I would have liked to save the hashed interfaces list in the CDS archive file, but there are test cases for the Serviceability Agent that assume the interfaces are in the order in which they are declared. I think the tests are probably just wrong about this. For the same reason, I have to make copies of the interfaces array and sort the copies, rather than just using the same array at runtime. >> >> May be file follow up RFE for saving hashed interfaces list in CDS which will also fix those tests. > >> > I would have liked to save the hashed interfaces list in the CDS archive file, but there are test cases for the Serviceability Agent that assume the interfaces are in the order in which they are declared. I think the tests are probably just wrong about this. For the same reason, I have to make copies of the interfaces array and sort the copies, rather than just using the same array at runtime. >> >> May be file follow up RFE for saving hashed interfaces list in CDS which will also fix those tests. > > That sounds good. > @theRealAph I applied your patch and ran SA tests. I did not see the failures you mentioned to me. Is there anything else special I need to do? I need to prepare you a branch with the feature that does the hashing earlier and saves the hashed secondaries in the CDS archive. Thanks. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18309#issuecomment-2015701761 From matsaave at openjdk.org Fri Mar 22 19:14:21 2024 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Fri, 22 Mar 2024 19:14:21 GMT Subject: RFR: 8325536: JVM crash during CDS archive creation with -XX:+AllowArchivingWithJavaAgent In-Reply-To: <7mQTU1Oxceh3TP1RJIJCiAlA-Ba6_vsVzcnQJeyAcZU=.8f94f4d2-6bb9-498d-a806-ee4db370477f@github.com> References: <7mQTU1Oxceh3TP1RJIJCiAlA-Ba6_vsVzcnQJeyAcZU=.8f94f4d2-6bb9-498d-a806-ee4db370477f@github.com> Message-ID: <-3CJ8f4A-_QJ4ZUkRAeKWJVebGaHlSIyjQRwW-5rszE=.b8b13c47-8d60-4ad5-9967-84b8616506b8@github.com> On Tue, 12 Mar 2024 22:10:16 GMT, Calvin Cheung wrote: > The bug only reproduces with dynamic dump. > A fix is to call `assign_class_loader_type()` in `KlassFactory::check_shared_class_file_load_hook()` instead of in `ArchiveBuilder::make_klasses_shareable()`. > > Passed tiers 1 - 4 testing. LGTM ------------- Marked as reviewed by matsaave (Committer). PR Review: https://git.openjdk.org/jdk/pull/18254#pullrequestreview-1955592279 From ccheung at openjdk.org Fri Mar 22 20:18:30 2024 From: ccheung at openjdk.org (Calvin Cheung) Date: Fri, 22 Mar 2024 20:18:30 GMT Subject: Integrated: 8325536: JVM crash during CDS archive creation with -XX:+AllowArchivingWithJavaAgent In-Reply-To: <7mQTU1Oxceh3TP1RJIJCiAlA-Ba6_vsVzcnQJeyAcZU=.8f94f4d2-6bb9-498d-a806-ee4db370477f@github.com> References: <7mQTU1Oxceh3TP1RJIJCiAlA-Ba6_vsVzcnQJeyAcZU=.8f94f4d2-6bb9-498d-a806-ee4db370477f@github.com> Message-ID: On Tue, 12 Mar 2024 22:10:16 GMT, Calvin Cheung wrote: > The bug only reproduces with dynamic dump. > A fix is to call `assign_class_loader_type()` in `KlassFactory::check_shared_class_file_load_hook()` instead of in `ArchiveBuilder::make_klasses_shareable()`. > > Passed tiers 1 - 4 testing. This pull request has now been integrated. 
Changeset: f33a8445 Author: Calvin Cheung URL: https://git.openjdk.org/jdk/commit/f33a8445ebfc3089e91688383db26b91e582658c Stats: 68 lines in 5 files changed: 60 ins; 4 del; 4 mod 8325536: JVM crash during CDS archive creation with -XX:+AllowArchivingWithJavaAgent Reviewed-by: iklam, matsaave ------------- PR: https://git.openjdk.org/jdk/pull/18254 From ccheung at openjdk.org Fri Mar 22 20:18:29 2024 From: ccheung at openjdk.org (Calvin Cheung) Date: Fri, 22 Mar 2024 20:18:29 GMT Subject: RFR: 8325536: JVM crash during CDS archive creation with -XX:+AllowArchivingWithJavaAgent In-Reply-To: References: <7mQTU1Oxceh3TP1RJIJCiAlA-Ba6_vsVzcnQJeyAcZU=.8f94f4d2-6bb9-498d-a806-ee4db370477f@github.com> Message-ID: On Wed, 20 Mar 2024 18:10:06 GMT, Ioi Lam wrote: >> The bug only reproduces with dynamic dump. >> A fix is to call `assign_class_loader_type()` in `KlassFactory::check_shared_class_file_load_hook()` instead of in `ArchiveBuilder::make_klasses_shareable()`. >> >> Passed tiers 1 - 4 testing. > > LGTM. Thanks @iklam, @matias9927 for the review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18254#issuecomment-2015836271 From kvn at openjdk.org Fri Mar 22 21:34:35 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 22 Mar 2024 21:34:35 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission [v10] In-Reply-To: References: Message-ID: On Fri, 22 Mar 2024 18:19:39 GMT, Cesar Soares Lucas wrote: >> # Description >> >> Please review this PR with a patch to re-use the same C2_MacroAssembler object to emit all instructions in the same compilation unit. >> >> Overall, the change is pretty simple. However, due to the renaming of the variable to access C2_MacroAssembler, from `_masm.` to `masm->`, and also some method prototype changes, the patch became quite large. >> >> # Help Needed for Testing >> >> I don't have access to all platforms necessary to test this. 
I hope some other folks can help with testing on `S390`, `RISC-V` and `PPC`. > > Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: > > Fix AArch64 build & improve comment about InstructionMark Good. `StringRepeat.java` GHA failure on linux-x86 is due to `JDK-8328524`. I will submit our testing. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16484#issuecomment-2015958749 From lucy at openjdk.org Fri Mar 22 22:18:28 2024 From: lucy at openjdk.org (Lutz Schmidt) Date: Fri, 22 Mar 2024 22:18:28 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission In-Reply-To: References: Message-ID: <5xA2kMNs7192Stj4n4ZJh4WfnT2gAoVpHbNhw6pdszM=.54bd1982-f4e0-4935-b7b8-2b0dab2a4a42@github.com> On Sat, 16 Dec 2023 05:03:20 GMT, Amit Kumar wrote: >> `s390x` also run into assert failure: `assert(masm->inst_mark() == nullptr) failed: should be.` >> >> >> V [libjvm.so+0xfb0938] PhaseOutput::fill_buffer(C2_MacroAssembler*, unsigned int*)+0x2370 (output.cpp:1812) >> V [libjvm.so+0xfb21ce] PhaseOutput::Output()+0xcae (output.cpp:362) >> V [libjvm.so+0x6a90a8] Compile::Code_Gen()+0x460 (compile.cpp:2989) >> V [libjvm.so+0x6ad848] Compile::Compile(ciEnv*, ciMethod*, int, Options, DirectiveSet*)+0x1738 (compile.cpp:887) >> V [libjvm.so+0x4fb932] C2Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0x14a (c2compiler.cpp:119) >> V [libjvm.so+0x6b81a2] CompileBroker::invoke_compiler_on_method(CompileTask*)+0xd9a (compileBroker.cpp:2282) >> V [libjvm.so+0x6b8eaa] CompileBroker::compiler_thread_loop()+0x5a2 (compileBroker.cpp:1943) > >>@offamitkumar, @TheRealMDoerr - can you please re-run the tests on the platforms convenient for you? > > I run build for fastdebug & release VMs and tier1 test for fastdebug VM. Everything seems good. Sorry, became aware of this only now. I will try to set aside some spare time. @offamitkumar can you please run some tests on the PR? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16484#issuecomment-2016011326 From sgibbons at openjdk.org Sat Mar 23 02:16:57 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Sat, 23 Mar 2024 02:16:57 GMT Subject: RFR: JDK-8320448 Accelerate IndexOf using AVX2 [v14] In-Reply-To: References: Message-ID: > Re-write the IndexOf code without the use of the pcmpestri instruction, only using AVX2 instructions. This change accelerates String.IndexOf on average 1.3x for AVX2. The benchmark numbers: > > > Benchmark Score Latest > StringIndexOf.advancedWithMediumSub 343.573 317.934 0.925375393x > StringIndexOf.advancedWithShortSub1 1039.081 1053.96 1.014319384x > StringIndexOf.advancedWithShortSub2 55.828 110.541 1.980027943x > StringIndexOf.constantPattern 9.361 11.906 1.271872663x > StringIndexOf.searchCharLongSuccess 4.216 4.218 1.000474383x > StringIndexOf.searchCharMediumSuccess 3.133 3.216 1.02649218x > StringIndexOf.searchCharShortSuccess 3.76 3.761 1.000265957x > StringIndexOf.success 9.186 9.713 1.057369911x > StringIndexOf.successBig 14.341 46.343 3.231504079x > StringIndexOfChar.latin1_AVX2_String 6220.918 12154.52 1.953814533x > StringIndexOfChar.latin1_AVX2_char 5503.556 5540.044 1.006629895x > StringIndexOfChar.latin1_SSE4_String 6978.854 6818.689 0.977049957x > StringIndexOfChar.latin1_SSE4_char 5657.499 5474.624 0.967675646x > StringIndexOfChar.latin1_Short_String 7132.541 6863.359 0.962260014x > StringIndexOfChar.latin1_Short_char 16013.389 16162.437 1.009307711x > StringIndexOfChar.latin1_mixed_String 7386.123 14771.622 1.999915517x > StringIndexOfChar.latin1_mixed_char 9901.671 9782.245 0.987938803 Scott Gibbons has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 46 commits: - Merge branch 'openjdk:master' into indexof - Cleaned up, ready for review - Pre-cleanup code - Add JMH. 
Add 16-byte compares to arrays_equals - Better method for mask creation - Merge branch 'openjdk:master' into indexof - Most cleanup done. - Remove header dependency - Works - needs cleanup - Passes tests. - ... and 36 more: https://git.openjdk.org/jdk/compare/bc739639...e079fc12 ------------- Changes: https://git.openjdk.org/jdk/pull/16753/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16753&range=13 Stats: 4905 lines in 19 files changed: 4551 ins; 241 del; 113 mod Patch: https://git.openjdk.org/jdk/pull/16753.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16753/head:pull/16753 PR: https://git.openjdk.org/jdk/pull/16753 From sgibbons at openjdk.org Sat Mar 23 02:16:58 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Sat, 23 Mar 2024 02:16:58 GMT Subject: RFR: JDK-8320448 Accelerate IndexOf using AVX2 [v13] In-Reply-To: References: Message-ID: On Thu, 22 Feb 2024 03:15:10 GMT, Scott Gibbons wrote: >> Re-write the IndexOf code without the use of the pcmpestri instruction, only using AVX2 instructions. This change accelerates String.IndexOf on average 1.3x for AVX2. 
The benchmark numbers:
>>
>>
>> Benchmark                                Score       Latest
>> StringIndexOf.advancedWithMediumSub      343.573     317.934     0.925375393x
>> StringIndexOf.advancedWithShortSub1      1039.081    1053.96     1.014319384x
>> StringIndexOf.advancedWithShortSub2      55.828      110.541     1.980027943x
>> StringIndexOf.constantPattern            9.361       11.906      1.271872663x
>> StringIndexOf.searchCharLongSuccess      4.216       4.218       1.000474383x
>> StringIndexOf.searchCharMediumSuccess    3.133       3.216       1.02649218x
>> StringIndexOf.searchCharShortSuccess     3.76        3.761       1.000265957x
>> StringIndexOf.success                    9.186       9.713       1.057369911x
>> StringIndexOf.successBig                 14.341      46.343      3.231504079x
>> StringIndexOfChar.latin1_AVX2_String     6220.918    12154.52    1.953814533x
>> StringIndexOfChar.latin1_AVX2_char       5503.556    5540.044    1.006629895x
>> StringIndexOfChar.latin1_SSE4_String     6978.854    6818.689    0.977049957x
>> StringIndexOfChar.latin1_SSE4_char       5657.499    5474.624    0.967675646x
>> StringIndexOfChar.latin1_Short_String    7132.541    6863.359    0.962260014x
>> StringIndexOfChar.latin1_Short_char      16013.389   16162.437   1.009307711x
>> StringIndexOfChar.latin1_mixed_String    7386.123    14771.622   1.999915517x
>> StringIndexOfChar.latin1_mixed_char      9901.671    9782.245    0.987938803
>
> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:
>
> Addressed some review coments; replaced hard-coded registers with descriptive names.

Re-opening this PR after some insightful comments, which resulted in a re-design. Now ready for review. Now showing ~1.6x performance gain on original StringIndexOf benchmark. I added a benchmark (StringIndexOfHuge) that measures performance on large-ish strings (e.g., 34-byte substring within a 2052-byte string). This benchmark performed on average ~14x original. Sorry for the large change, but it couldn't be done piecemeal.
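
The redesign filters candidate positions by comparing the first and the last byte of the needle before doing a full comparison, as described later in this thread. A minimal scalar sketch of that filtering idea in C++ follows; it emulates the 32-lane compare masks with ordinary loops and makes no attempt to reproduce the PR's AVX2 stub or its special cases for short strings. `index_of` is a hypothetical name, and the chunk width of 32 is taken from the 32-byte chunks discussed in the thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Returns the index of the first occurrence of needle in haystack, or -1.
long index_of(const std::string& haystack, const std::string& needle) {
  const std::size_t h = haystack.size();
  const std::size_t n = needle.size();
  if (n == 0) return 0;
  if (n > h) return -1;

  const char first = needle[0];
  const char last = needle[n - 1];
  const std::size_t positions = h - n + 1;  // number of valid start offsets

  for (std::size_t base = 0; base < positions; base += 32) {
    const std::size_t lanes = std::min<std::size_t>(32, positions - base);

    // One bit per lane. A vector version would produce each mask with a
    // single byte-wise compare; here the lanes are filled one at a time.
    std::uint32_t first_mask = 0;
    std::uint32_t last_mask = 0;
    for (std::size_t j = 0; j < lanes; ++j) {
      if (haystack[base + j] == first) first_mask |= 1u << j;
      if (haystack[base + j + n - 1] == last) last_mask |= 1u << j;
    }

    // A position is a candidate only if both the first and last byte line up.
    std::uint32_t candidates = first_mask & last_mask;
    while (candidates != 0) {
      unsigned j = 0;
      while (((candidates >> j) & 1u) == 0) ++j;  // find the lowest set bit
      assert(j < lanes);
      // Full comparison only for positions that survived the filter.
      if (std::memcmp(haystack.data() + base + j, needle.data(), n) == 0) {
        return static_cast<long>(base + j);
      }
      candidates &= candidates - 1;  // clear the lowest set bit and continue
    }
  }
  return -1;
}
```

The point of the two-mask filter is that most positions are rejected without a full comparison, so the expensive step runs only where both end bytes already match.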
------------- PR Comment: https://git.openjdk.org/jdk/pull/16753#issuecomment-2016308399 From duke at openjdk.org Sat Mar 23 10:01:32 2024 From: duke at openjdk.org (Francesco Nigro) Date: Sat, 23 Mar 2024 10:01:32 GMT Subject: RFR: JDK-8320448 Accelerate IndexOf using AVX2 [v14] In-Reply-To: References: Message-ID: On Sat, 23 Mar 2024 02:16:57 GMT, Scott Gibbons wrote: >> Re-write the IndexOf code without the use of the pcmpestri instruction, only using AVX2 instructions. This change accelerates String.IndexOf on average 1.3x for AVX2. The benchmark numbers: >> >> >> Benchmark Score Latest >> StringIndexOf.advancedWithMediumSub 343.573 317.934 0.925375393x >> StringIndexOf.advancedWithShortSub1 1039.081 1053.96 1.014319384x >> StringIndexOf.advancedWithShortSub2 55.828 110.541 1.980027943x >> StringIndexOf.constantPattern 9.361 11.906 1.271872663x >> StringIndexOf.searchCharLongSuccess 4.216 4.218 1.000474383x >> StringIndexOf.searchCharMediumSuccess 3.133 3.216 1.02649218x >> StringIndexOf.searchCharShortSuccess 3.76 3.761 1.000265957x >> StringIndexOf.success 9.186 9.713 1.057369911x >> StringIndexOf.successBig 14.341 46.343 3.231504079x >> StringIndexOfChar.latin1_AVX2_String 6220.918 12154.52 1.953814533x >> StringIndexOfChar.latin1_AVX2_char 5503.556 5540.044 1.006629895x >> StringIndexOfChar.latin1_SSE4_String 6978.854 6818.689 0.977049957x >> StringIndexOfChar.latin1_SSE4_char 5657.499 5474.624 0.967675646x >> StringIndexOfChar.latin1_Short_String 7132.541 6863.359 0.962260014x >> StringIndexOfChar.latin1_Short_char 16013.389 16162.437 1.009307711x >> StringIndexOfChar.latin1_mixed_String 7386.123 14771.622 1.999915517x >> StringIndexOfChar.latin1_mixed_char 9901.671 9782.245 0.987938803 > > Scott Gibbons has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 46 commits: > > - Merge branch 'openjdk:master' into indexof > - Cleaned up, ready for review > - Pre-cleanup code > - Add JMH. 
Add 16-byte compares to arrays_equals > - Better method for mask creation > - Merge branch 'openjdk:master' into indexof > - Most cleanup done. > - Remove header dependency > - Works - needs cleanup > - Passes tests. > - ... and 36 more: https://git.openjdk.org/jdk/compare/bc739639...e079fc12 Hi, in Netty, we have our own AsciiString::indexOf based on SWAR techniques, which is manually loop unrolling the head processing (first < 8 bytes) to artificially make sure the branch predictor got different branches to care AND JIT won't make it wrong. We have measured (I can provide a link of the benchmark and results, If you are interested) that it delivers a much better performance on tiny strings and makes a smoother degradation vs perfectly aligned string length as well. Clearly this tends to be much visible if the input strings have shuffled delimiter positions, to make the branch prediction misses more relevant. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16753#issuecomment-2016432345 From lmesnik at openjdk.org Sat Mar 23 15:35:20 2024 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Sat, 23 Mar 2024 15:35:20 GMT Subject: RFR: 8328758: GetCurrentContendedMonitor function should use JvmtiHandshake In-Reply-To: References: Message-ID: On Fri, 22 Mar 2024 02:44:38 GMT, Serguei Spitsyn wrote: > The internal JVM TI `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. This enhancement is to refactor JVM TI `GetCurrentContendedMonitor` function on the base of `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes. > > Testing: > - Ran mach5 tiers 1-6 Marked as reviewed by lmesnik (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/18444#pullrequestreview-1956396534 From kvn at openjdk.org Sat Mar 23 16:15:32 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Sat, 23 Mar 2024 16:15:32 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission [v10] In-Reply-To: References: Message-ID: On Fri, 22 Mar 2024 18:19:39 GMT, Cesar Soares Lucas wrote: >> # Description >> >> Please review this PR with a patch to re-use the same C2_MacroAssembler object to emit all instructions in the same compilation unit. >> >> Overall, the change is pretty simple. However, due to the renaming of the variable to access C2_MacroAssembler, from `_masm.` to `masm->`, and also some method prototype changes, the patch became quite large. >> >> # Help Needed for Testing >> >> I don't have access to all platforms necessary to test this. I hope some other folks can help with testing on `S390`, `RISC-V` and `PPC`. > > Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: > > Fix AArch64 build & improve comment about InstructionMark My tier1-3,xcomp,stress testing passed. We test on x64 and aarch64 only. ------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16484#pullrequestreview-1956401778 From sgibbons at openjdk.org Sat Mar 23 16:50:28 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Sat, 23 Mar 2024 16:50:28 GMT Subject: RFR: JDK-8320448 Accelerate IndexOf using AVX2 [v14] In-Reply-To: References: Message-ID: On Sat, 23 Mar 2024 09:57:49 GMT, Francesco Nigro wrote: >> Scott Gibbons has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 46 commits: >> >> - Merge branch 'openjdk:master' into indexof >> - Cleaned up, ready for review >> - Pre-cleanup code >> - Add JMH. 
Add 16-byte compares to arrays_equals
>> - Better method for mask creation
>> - Merge branch 'openjdk:master' into indexof
>> - Most cleanup done.
>> - Remove header dependency
>> - Works - needs cleanup
>> - Passes tests.
>> - ... and 36 more: https://git.openjdk.org/jdk/compare/bc739639...e079fc12
>
> Hi, in Netty, we have our own AsciiString::indexOf based on SWAR techniques, which is manually loop unrolling the head processing (first < 8 bytes) to artificially make sure the branch predictor got different branches to care AND JIT won't make it wrong. We have measured (I can provide a link of the benchmark and results, If you are interested) that it delivers a much better performance on tiny strings and makes a smoother degradation vs perfectly aligned string length as well. Clearly this tends to be much visible if the input strings have shuffled delimiter positions, to make the branch prediction misses more relevant.

Hi @franz1981. I'd be interested in seeing the code. I looked [here](https://github.com/netty/netty/blob/3a3f9d13b129555802de5652667ca0af662f554e/common/src/main/java/io/netty/util/AsciiString.java#L696), but that just looks like a naïve implementation, so I must be missing something.

This code uses vector compares to search for the first byte of the substring (needle) in 32-byte chunks of the input string (haystack). However, once a character is found, it also checks for a match corresponding to the last byte of the needle within the haystack before doing the full needle comparison. This is also in 32-byte chunks. That is, we load a vector register with 32 (or 16 for wide chars) copies of the first byte of the needle, and another with copies of the last byte of the needle. The first comparison is done at the start of the haystack, giving us an indication of the presence and index of the first byte. We then compare the last byte of the needle at the haystack indexed at needle length - 1 (i.e., the last byte).
This tells us if the last byte of the needle appears in the correct relative position within the haystack. ANDing these results tells us whether or not we have a candidate needle within the haystack, as well as the position of the needle. Only then do we do a full char-by-char comparison of the needle to the haystack (this is also done with vector instructions when possible). A lot of this code is there to handle the cases of small-ish needles within small-ish haystacks, where 32-byte reads are not possible (due to reading past the end of the strings, possibly generating a page fault). I handle less than 32-byte haystacks by copying the haystack to the stack, where I can be assured that 32-byte reads will be possible. So there are special cases for haystack < 32 bytes with needle sizes <= 10 (arbitrary) and one for haystacks > 32 bytes and needle size <= 10. I also added a section for haystacks <= 32 bytes and needle sizes < 5, which seem to be the most common cases. This path copies the haystack to the stack (a single vector read & write) and up to 5 vector comparisons, one for each byte of the needle, with no branching or looping. I'd be very interested in seeing the Netty SWAR implementation. Thanks for the comment. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16753#issuecomment-2016544706 From duke at openjdk.org Sat Mar 23 19:02:28 2024 From: duke at openjdk.org (Francesco Nigro) Date: Sat, 23 Mar 2024 19:02:28 GMT Subject: RFR: JDK-8320448 Accelerate IndexOf using AVX2 [v14] In-Reply-To: References: Message-ID: On Sat, 23 Mar 2024 02:16:57 GMT, Scott Gibbons wrote: >> Re-write the IndexOf code without the use of the pcmpestri instruction, only using AVX2 instructions. This change accelerates String.IndexOf on average 1.3x for AVX2. 
The benchmark numbers:
>>
>> Benchmark                                  Score       Latest
>> StringIndexOf.advancedWithMediumSub        343.573     317.934     0.925375393x
>> StringIndexOf.advancedWithShortSub1        1039.081    1053.96     1.014319384x
>> StringIndexOf.advancedWithShortSub2        55.828      110.541     1.980027943x
>> StringIndexOf.constantPattern              9.361       11.906      1.271872663x
>> StringIndexOf.searchCharLongSuccess        4.216       4.218       1.000474383x
>> StringIndexOf.searchCharMediumSuccess      3.133       3.216       1.02649218x
>> StringIndexOf.searchCharShortSuccess       3.76        3.761       1.000265957x
>> StringIndexOf.success                      9.186       9.713       1.057369911x
>> StringIndexOf.successBig                   14.341      46.343      3.231504079x
>> StringIndexOfChar.latin1_AVX2_String       6220.918    12154.52    1.953814533x
>> StringIndexOfChar.latin1_AVX2_char         5503.556    5540.044    1.006629895x
>> StringIndexOfChar.latin1_SSE4_String       6978.854    6818.689    0.977049957x
>> StringIndexOfChar.latin1_SSE4_char         5657.499    5474.624    0.967675646x
>> StringIndexOfChar.latin1_Short_String      7132.541    6863.359    0.962260014x
>> StringIndexOfChar.latin1_Short_char        16013.389   16162.437   1.009307711x
>> StringIndexOfChar.latin1_mixed_String      7386.123    14771.622   1.999915517x
>> StringIndexOfChar.latin1_mixed_char        9901.671    9782.245    0.987938803
>
> Scott Gibbons has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 46 commits:
>
> - Merge branch 'openjdk:master' into indexof
> - Cleaned up, ready for review
> - Pre-cleanup code
> - Add JMH. Add 16-byte compares to arrays_equals
> - Better method for mask creation
> - Merge branch 'openjdk:master' into indexof
> - Most cleanup done.
> - Remove header dependency
> - Works - needs cleanup
> - Passes tests.
> - ...
and 36 more: https://git.openjdk.org/jdk/compare/bc739639...e079fc12 Sure thing: https://github.com/netty/netty/pull/13534#issuecomment-1685247165 It's the comparison with String::indexOf while the impl is: https://github.com/netty/netty/blob/3a3f9d13b129555802de5652667ca0af662f554e/buffer/src/main/java/io/netty/buffer/ByteBufUtil.java#L590 ------------- PR Comment: https://git.openjdk.org/jdk/pull/16753#issuecomment-2016576139 From jbhateja at openjdk.org Sun Mar 24 10:04:48 2024 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Sun, 24 Mar 2024 10:04:48 GMT Subject: RFR: 8328181: C2: assert(MaxVectorSize >= 32) failed: vector length should be >= 32 Message-ID: This bug fix patch tightens the predication check for small constant length clear array pattern and relaxes associated feature checks. Modified few comments for clarity. Kindly review and approve. Best Regards, Jatin ------------- Commit messages: - Adding Testpoint - Some comments modifications - 8328181: C2: assert(MaxVectorSize >= 32) failed: vector length should be >= 32 Changes: https://git.openjdk.org/jdk/pull/18464/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18464&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8328181 Stats: 21 lines in 5 files changed: 6 ins; 0 del; 15 mod Patch: https://git.openjdk.org/jdk/pull/18464.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18464/head:pull/18464 PR: https://git.openjdk.org/jdk/pull/18464 From kbarrett at openjdk.org Mon Mar 25 00:27:44 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 25 Mar 2024 00:27:44 GMT Subject: RFR: 8328862: Remove unused GrowableArrayFilterIterator Message-ID: Please review this trivial removal of the unused GrowableArrayFilterIterator class template. 
------------- Commit messages: - remove unused GrowableArrayFilterIterator Changes: https://git.openjdk.org/jdk/pull/18466/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18466&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8328862 Stats: 53 lines in 1 file changed: 0 ins; 52 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18466.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18466/head:pull/18466 PR: https://git.openjdk.org/jdk/pull/18466 From dholmes at openjdk.org Mon Mar 25 02:03:21 2024 From: dholmes at openjdk.org (David Holmes) Date: Mon, 25 Mar 2024 02:03:21 GMT Subject: RFR: 8328862: Remove unused GrowableArrayFilterIterator In-Reply-To: References: Message-ID: On Mon, 25 Mar 2024 00:22:45 GMT, Kim Barrett wrote: > Please review this trivial removal of the unused GrowableArrayFilterIterator > class template. Good and trivial. I added some historical context to the JBS issue. Thanks ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18466#pullrequestreview-1956760608 From fyang at openjdk.org Mon Mar 25 02:09:25 2024 From: fyang at openjdk.org (Fei Yang) Date: Mon, 25 Mar 2024 02:09:25 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission In-Reply-To: References: Message-ID: On Fri, 3 Nov 2023 04:44:40 GMT, Fei Yang wrote: >> # Description >> >> Please review this PR with a patch to re-use the same C2_MacroAssembler object to emit all instructions in the same compilation unit. >> >> Overall, the change is pretty simple. However, due to the renaming of the variable to access C2_MacroAssembler, from `_masm.` to `masm->`, and also some method prototype changes, the patch became quite large. >> >> # Help Needed for Testing >> >> I don't have access to all platforms necessary to test this. I hope some other folks can help with testing on `S390`, `RISC-V` and `PPC`. > > Hello, I guess you might want to merge latest jdk master and add more changes. 
> I witnessed some build errors when building the latest jdk master with this patch on linux-riscv64:
>
> /home/fyang/jdk/src/hotspot/cpu/riscv/riscv.ad: In member function 'virtual void UdivINode::emit(C2_MacroAssembler*, PhaseRegAlloc*) const':
> /home/fyang/jdk/src/hotspot/cpu/riscv/riscv.ad:2412:30: error: 'cbuf' was not declared in this scope
> 2412 |     C2_MacroAssembler _masm(&cbuf);
>      |                              ^~~~
> /home/fyang/jdk/src/hotspot/cpu/riscv/riscv.ad: In member function 'virtual void UdivLNode::emit(C2_MacroAssembler*, PhaseRegAlloc*) const':
> /home/fyang/jdk/src/hotspot/cpu/riscv/riscv.ad:2427:30: error: 'cbuf' was not declared in this scope
> 2427 |     C2_MacroAssembler _masm(&cbuf);
>      |                              ^~~~
> /home/fyang/jdk/src/hotspot/cpu/riscv/riscv.ad: In member function 'virtual void UmodINode::emit(C2_MacroAssembler*, PhaseRegAlloc*) const':
> /home/fyang/jdk/src/hotspot/cpu/riscv/riscv.ad:2442:30: error: 'cbuf' was not declared in this scope
> 2442 |     C2_MacroAssembler _masm(&cbuf);
>      |                              ^~~~
> /home/fyang/jdk/src/hotspot/cpu/riscv/riscv.ad: In member function 'virtual void UmodLNode::emit(C2_MacroAssembler*, PhaseRegAlloc*) const':
> /home/fyang/jdk/src/hotspot/cpu/riscv/riscv.ad:2457:30: error: 'cbuf' was not declared in this scope
> 2457 |     C2_MacroAssembler _masm(&cbuf);
>
> @RealFYang - Thanks for the note. I'll do that and update the PR.

Thanks for the update. The RISC-V part looks fine to me. I did a release build and ran tier1-3 tests on the linux-riscv64 platform.
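For readers following the `cbuf` errors quoted above, here is a minimal, self-contained sketch of the pattern the PR description refers to (one assembler shared by every node's `emit`, instead of each node constructing its own from a code buffer). It is not HotSpot code: `ToyCodeBuffer`, `ToyAssembler`, `ToyNode` and `emit_all` are hypothetical stand-ins, assumed only for illustration.

```c++
#include <string>
#include <vector>

// Hypothetical stand-ins; none of these types exist in HotSpot.
struct ToyCodeBuffer {
  std::vector<std::string> insts;  // emitted "instructions"
};

struct ToyAssembler {
  static int constructed;          // counts how many assemblers were created
  ToyCodeBuffer* buf;
  explicit ToyAssembler(ToyCodeBuffer* b) : buf(b) { ++constructed; }
  void emit(const std::string& inst) { buf->insts.push_back(inst); }
};
int ToyAssembler::constructed = 0;

struct ToyNode {
  std::string mnemonic;
  // The node receives the shared assembler. A body that still wrote
  // "ToyAssembler _masm(&cbuf);" would no longer compile, because no
  // cbuf parameter is in scope here.
  void emit(ToyAssembler* masm) const { masm->emit(mnemonic); }
};

// One assembler is created per compilation unit and passed to every node.
void emit_all(const std::vector<ToyNode>& nodes, ToyCodeBuffer* cbuf) {
  ToyAssembler masm(cbuf);
  for (const ToyNode& n : nodes) {
    n.emit(&masm);
  }
}
```

In this toy model, emitting any number of nodes constructs exactly one assembler, which is the property the renaming from `_masm.` to `masm->` is meant to enable.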
------------- PR Comment: https://git.openjdk.org/jdk/pull/16484#issuecomment-2017083424 From amitkumar at openjdk.org Mon Mar 25 05:32:27 2024 From: amitkumar at openjdk.org (Amit Kumar) Date: Mon, 25 Mar 2024 05:32:27 GMT Subject: RFR: 8327652: S390x: Implements SLP support [v5] In-Reply-To: <9csEuN4OqSaYAoZir8OCvudRKhirezZ9qxNlgXs7tyU=.83707471-1d5e-4a1a-aaf9-cb62169e9c2d@github.com> References: <9csEuN4OqSaYAoZir8OCvudRKhirezZ9qxNlgXs7tyU=.83707471-1d5e-4a1a-aaf9-cb62169e9c2d@github.com> Message-ID: On Wed, 20 Mar 2024 16:32:50 GMT, Sidraya Jayagond wrote: >> This PR Adds SIMD support on s390x. > > Sidraya Jayagond has updated the pull request incrementally with one additional commit since the last revision: > > Address code cleanup review comments I have checked the instructions against the PofZ book. Only a few cosmetic changes are required. I couldn't find anything else. It would be better if @RealLucy could take a look at it, at least the "s390.ad" file. @sid8606 please update the copyright header as well. In multiple files, the header is not updated. src/hotspot/cpu/s390/assembler_s390.hpp line 1300: > 1298: #define VFD_ZOPC (unsigned long)(0xe7L << 40 | 0xE5L << 0) // V1 := V2 / V3, element size = 2**m > 1299: #define VFSQ_ZOPC (unsigned long)(0xe7L << 40 | 0xCEL << 0) // V1 := sqrt of V2, element size = 2**m > 1300: #define VFLR_ZOPC (unsigned long)(0xe7L << 40 | 0xC5L << 0) // V1 := sqrt of V2, element size = 2**m This one is not for square root. Please fix the comment.
src/hotspot/cpu/s390/assembler_s390.hpp line 2498: > 2496: inline void z_vlef( VectorRegister v1, int64_t d2, Register x2, Register b2, int64_t m3); > 2497: inline void z_vleg( VectorRegister v1, int64_t d2, Register x2, Register b2, int64_t m3); > 2498: inline void z_vl(VectorRegister v1, const Address& a); Suggestion: inline void z_vl( VectorRegister v1, const Address& a); src/hotspot/cpu/s390/assembler_s390.hpp line 2628: > 2626: inline void z_vsteg( VectorRegister v1, int64_t d2, Register x2, Register b2, int64_t m3); > 2627: inline void z_vstl( VectorRegister v1, Register r3, int64_t d2, Register b2); > 2628: inline void z_vst(VectorRegister v1, const Address& a); Suggestion: inline void z_vst( VectorRegister v1, const Address& a); src/hotspot/cpu/s390/assembler_s390.hpp line 2687: > 2685: inline void z_vmlb( VectorRegister v1, VectorRegister v2, VectorRegister v3); > 2686: inline void z_vmlhw( VectorRegister v1, VectorRegister v2, VectorRegister v3); > 2687: inline void z_vmlf( VectorRegister v1, VectorRegister v2, VectorRegister v3); Suggestion: inline void z_vmlb( VectorRegister v1, VectorRegister v2, VectorRegister v3); inline void z_vmlhw( VectorRegister v1, VectorRegister v2, VectorRegister v3); inline void z_vmlf( VectorRegister v1, VectorRegister v2, VectorRegister v3); src/hotspot/cpu/s390/assembler_s390.hpp line 2824: > 2822: inline void z_vpopcth( VectorRegister v1, VectorRegister v2); > 2823: inline void z_vpopctf( VectorRegister v1, VectorRegister v2); > 2824: inline void z_vpopctg( VectorRegister v1, VectorRegister v2); Suggestion: inline void z_vpopctb(VectorRegister v1, VectorRegister v2); inline void z_vpopcth(VectorRegister v1, VectorRegister v2); inline void z_vpopctf(VectorRegister v1, VectorRegister v2); inline void z_vpopctg(VectorRegister v1, VectorRegister v2); src/hotspot/cpu/s390/assembler_s390.hpp line 2916: > 2914: // ========================== > 2915: // Add > 2916: inline void z_vfa(VectorRegister v1, VectorRegister v2, 
VectorRegister v3, int64_t m4); Suggestion: inline void z_vfa( VectorRegister v1, VectorRegister v2, VectorRegister v3, int64_t m4); src/hotspot/cpu/s390/assembler_s390.hpp line 2921: > 2919: > 2920: //SUB > 2921: inline void z_vfs(VectorRegister v1, VectorRegister v2, VectorRegister v3, int64_t m4); Suggestion: inline void z_vfs( VectorRegister v1, VectorRegister v2, VectorRegister v3, int64_t m4); src/hotspot/cpu/s390/assembler_s390.hpp line 2931: > 2929: > 2930: //DIV > 2931: inline void z_vfd(VectorRegister v1, VectorRegister v2, VectorRegister v3, int64_t m4); Suggestion: inline void z_vfd( VectorRegister v1, VectorRegister v2, VectorRegister v3, int64_t m4); src/hotspot/cpu/s390/assembler_s390.hpp line 2936: > 2934: > 2935: //square root > 2936: inline void z_vfsq(VectorRegister v1, VectorRegister v2, int64_t m3); Suggestion: inline void z_vfsq( VectorRegister v1, VectorRegister v2, int64_t m3); src/hotspot/cpu/s390/assembler_s390.hpp line 2941: > 2939: > 2940: //vector fp load rounded > 2941: inline void z_vflr( VectorRegister v1, VectorRegister v2, int64_t m3, int64_t m5); Suggestion: inline void z_vflr( VectorRegister v1, VectorRegister v2, int64_t m3, int64_t m5); src/hotspot/cpu/s390/assembler_s390.hpp line 2944: > 2942: inline void z_vflrd( VectorRegister v1, VectorRegister v2, int64_t m5); > 2943: > 2944: // Vector Floatingpoint instructions No, these instructions seems simple floating point instructions, not vector floating point. 
src/hotspot/cpu/s390/assembler_s390.inline.hpp line 1207: > 1205: inline void Assembler::z_vfssb( VectorRegister v1, VectorRegister v2, VectorRegister v3) {z_vfs(v1, v2, v3, VRET_FW); } // vector element type 'F' > 1206: inline void Assembler::z_vfsdb( VectorRegister v1, VectorRegister v2, VectorRegister v3) {z_vfs(v1, v2, v3, VRET_DW); } // vector element type 'G' > 1207: // Suggestion: src/hotspot/cpu/s390/assembler_s390.inline.hpp line 1223: > 1221: inline void Assembler::z_vfsqsb( VectorRegister v1, VectorRegister v2) {z_vfsq(v1, v2, VRET_FW); } > 1222: inline void Assembler::z_vfsqdb( VectorRegister v1, VectorRegister v2) {z_vfsq(v1, v2, VRET_DW); } > 1223: maybe add commend regarding Rounding Instructions ? src/hotspot/cpu/s390/sharedRuntime_s390.cpp line 264: > 262: > 263: static const RegisterSaver::LiveRegType RegisterSaver_LiveVRegs[] = { > 264: // live vector registers (optional, only these ones are used by C2): Suggestion: // live vector registers (optional, only these are used by C2): src/hotspot/cpu/s390/sharedRuntime_s390.cpp line 296: > 294: } > 295: > 296: Suggestion: src/hotspot/cpu/s390/sharedRuntime_s390.cpp line 422: > 420: map->set_callee_saved(VMRegImpl::stack2reg(offset>>2), RegisterSaver_LiveVRegs[i].vmreg); > 421: offset += v_reg_size; > 422: } just in case you like this one. Suggestion: for (int i = 0; i < vregstosave_num; i++, offset += v_reg_size) { int reg_num = RegisterSaver_LiveVRegs[i].reg_num; __ z_vst(as_VectorRegister(reg_num), Address(Z_SP, offset)); map->set_callee_saved(VMRegImpl::stack2reg(offset>>2), RegisterSaver_LiveVRegs[i].vmreg); } src/hotspot/cpu/s390/sharedRuntime_s390.cpp line 576: > 574: > 575: offset += v_reg_size; > 576: } push if you like ;-) Suggestion: for (int i = 0; i < vregstosave_num; i++, offset += v_reg_size) { int reg_num = RegisterSaver_LiveVRegs[i].reg_num; __ z_vl(as_VectorRegister(reg_num), Address(Z_SP, offset)); } ------------- Changes requested by amitkumar (Committer). 
PR Review: https://git.openjdk.org/jdk/pull/18162#pullrequestreview-1953948697 PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1537026170 PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1537020685 PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1537021563 PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1537022342 PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1537023129 PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1537025417 PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1537025529 PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1537025651 PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1537025795 PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1537025868 PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1537024327 PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1537034274 PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1537040800 PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1535054098 PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1535056022 PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1535068418 PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1535071035 From duke at openjdk.org Mon Mar 25 06:58:36 2024 From: duke at openjdk.org (kuaiwei) Date: Mon, 25 Mar 2024 06:58:36 GMT Subject: RFR: 8328876: Rework [AArch64] Use "dmb.ishst + dmb.ishld" for release barrier Message-ID: The origin patch for https://bugs.openjdk.org/browse/JDK-8324186 has 2 issues: 1 It show regression in some platform, like Apple silicon in mac os 2 Can not handle instruction sequence like "dmb.ishld; dmb.ishst; dmb.ishld; dmb.ishld" It can be fixed by: 1 Enable AlwaysMergeDMB by default, only disable it in 
architecture we can see performance improvement (N1 or N2) 2 Check the special pattern and merge the subsequent dmb. It also fix a bug when code buffer is expanding, st/ld/dmb can not be merged. I added unit tests for these. ------------- Commit messages: - 8328876: Rework [AArch64] Use "dmb.ishst + dmb.ishld" for release barrier Changes: https://git.openjdk.org/jdk/pull/18467/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18467&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8328876 Stats: 343 lines in 9 files changed: 330 ins; 0 del; 13 mod Patch: https://git.openjdk.org/jdk/pull/18467.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18467/head:pull/18467 PR: https://git.openjdk.org/jdk/pull/18467 From kbarrett at openjdk.org Mon Mar 25 07:10:24 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 25 Mar 2024 07:10:24 GMT Subject: Integrated: 8328862: Remove unused GrowableArrayFilterIterator In-Reply-To: References: Message-ID: On Mon, 25 Mar 2024 00:22:45 GMT, Kim Barrett wrote: > Please review this trivial removal of the unused GrowableArrayFilterIterator > class template. This pull request has now been integrated. Changeset: acc4a828 Author: Kim Barrett URL: https://git.openjdk.org/jdk/commit/acc4a828184bd2838a5c0b572f404aa1edaf59e2 Stats: 53 lines in 1 file changed: 0 ins; 52 del; 1 mod 8328862: Remove unused GrowableArrayFilterIterator Reviewed-by: dholmes ------------- PR: https://git.openjdk.org/jdk/pull/18466 From kbarrett at openjdk.org Mon Mar 25 07:10:24 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 25 Mar 2024 07:10:24 GMT Subject: RFR: 8328862: Remove unused GrowableArrayFilterIterator In-Reply-To: References: Message-ID: <9OWoQ-ip7dWpAgve6FwgeQf8fg6LGZPBJeq2U5Vb23w=.349f822e-ebbe-4b3a-996c-b2114f982ac2@github.com> On Mon, 25 Mar 2024 02:00:37 GMT, David Holmes wrote: >> Please review this trivial removal of the unused GrowableArrayFilterIterator >> class template. > > Good and trivial. 
> > I added some historical context to the JBS issue. > > Thanks Thanks for the review and for looking up the history @dholmes-ora . I'd assumed it was used and later refactoring moved away from it. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18466#issuecomment-2017355066 From aph at openjdk.org Mon Mar 25 09:00:22 2024 From: aph at openjdk.org (Andrew Haley) Date: Mon, 25 Mar 2024 09:00:22 GMT Subject: RFR: 8328878: Rework [AArch64] Use "dmb.ishst + dmb.ishld" for release barrier In-Reply-To: References: Message-ID: On Mon, 25 Mar 2024 06:54:01 GMT, kuaiwei wrote: > The origin patch for https://bugs.openjdk.org/browse/JDK-8324186 has 2 issues: > 1 It show regression in some platform, like Apple silicon in mac os > 2 Can not handle instruction sequence like "dmb.ishld; dmb.ishst; dmb.ishld; dmb.ishld" > > It can be fixed by: > 1 Enable AlwaysMergeDMB by default, only disable it in architecture we can see performance improvement (N1 or N2) > 2 Check the special pattern and merge the subsequent dmb. > > It also fix a bug when code buffer is expanding, st/ld/dmb can not be merged. I added unit tests for these. > > This patch still has a unhandled case. Insts like "dmb.ishld; dmb.ishst; dmb.ish", it will merge the last 2 instructions and can not merge all three. Because when emitting dmb.ish, if merge all previous dmbs, the code buffer will shrink the size. I think it may break some resumption and think it's not a common pattern. Are there any possible sequences of `dmb ld; dmb st` that will not be minimized? 
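To make the merging question concrete, the following is a toy model of peephole-merging a run of consecutive barriers, including the unhandled `dmb.ishld; dmb.ishst; dmb.ish` case from the quoted description. It is a sketch under stated assumptions, not the patch's implementation: the `Dmb` enum, `covers`, `emit_dmb`, `emit_sequence` and the `always_merge` parameter (loosely modelled on the `AlwaysMergeDMB` flag mentioned above) are all hypothetical. The model also assumes that replacing a weaker barrier with a full `dmb ish` is acceptable because it only strengthens ordering, and that already-emitted barriers before the last one are never removed, so the buffer never shrinks.

```c++
#include <cstddef>
#include <vector>

// Toy model only; the real logic lives in the AArch64 assembler and differs in detail.
enum class Dmb { LD, ST, ISH };

// A full barrier covers every kind; otherwise only an identical barrier does.
inline bool covers(Dmb have, Dmb want) {
  return have == Dmb::ISH || have == want;
}

// `run` models the run of consecutive barriers at the end of the code buffer.
void emit_dmb(std::vector<Dmb>& run, Dmb b, bool always_merge) {
  // Inspect at most the last two barriers; drop `b` if one already covers it.
  std::size_t first = run.size() >= 2 ? run.size() - 2 : 0;
  for (std::size_t i = first; i < run.size(); ++i) {
    if (covers(run[i], b)) return;
  }
  // Strengthen the last barrier in place instead of emitting a new one.
  // Earlier barriers are left alone, so the buffer never shrinks.
  if (!run.empty() && (always_merge || b == Dmb::ISH)) {
    run.back() = Dmb::ISH;
    return;
  }
  run.push_back(b);
}

// Convenience wrapper: feed a whole sequence through emit_dmb.
std::vector<Dmb> emit_sequence(const std::vector<Dmb>& seq, bool always_merge) {
  std::vector<Dmb> run;
  for (Dmb b : seq) emit_dmb(run, b, always_merge);
  return run;
}
```

With `always_merge` off, this model keeps `ld; st` as two instructions, collapses `ld; st; ld; ld` back to `ld; st`, and leaves one redundant barrier when a full `dmb ish` arrives after `ld; st`, because only the last instruction is rewritten.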
------------- PR Comment: https://git.openjdk.org/jdk/pull/18467#issuecomment-2017503261 From duke at openjdk.org Mon Mar 25 09:03:27 2024 From: duke at openjdk.org (kuaiwei) Date: Mon, 25 Mar 2024 09:03:27 GMT Subject: RFR: 8328878: Rework [AArch64] Use "dmb.ishst + dmb.ishld" for release barrier In-Reply-To: References: Message-ID: On Mon, 25 Mar 2024 08:57:46 GMT, Andrew Haley wrote: > Are there any possible sequences of `dmb ld; dmb st` that will not be minimized? The worst case is "dmb ld; dmb st; dmb ish", which will be merged into "dmb ld; dmb ish". ------------- PR Comment: https://git.openjdk.org/jdk/pull/18467#issuecomment-2017508670 From ihse at openjdk.org Mon Mar 25 09:20:23 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Mon, 25 Mar 2024 09:20:23 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4] In-Reply-To: References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> Message-ID: On Fri, 22 Mar 2024 15:22:57 GMT, Hamlin Li wrote: > For ACLE, it's supported in [10.1](https://gcc.gnu.org/gcc-10/changes.html) Does this PR require ACLE? ------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-2017530218 From ihse at openjdk.org Mon Mar 25 09:20:23 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Mon, 25 Mar 2024 09:20:23 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4] In-Reply-To: References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> Message-ID: On Fri, 22 Mar 2024 15:27:07 GMT, Hamlin Li wrote: > A question about the licensing: how long does it take to finish the legal process of the sleef licence?
I am not sure this is a relevant question anymore. As I said, the libsleef build system seems virtually impossible to integrate into the OpenJDK build, so I no longer believe this is a way forward. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-2017533403 From ihse at openjdk.org Mon Mar 25 09:20:25 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Mon, 25 Mar 2024 09:20:25 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4] In-Reply-To: <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> Message-ID: On Fri, 15 Mar 2024 13:58:05 GMT, Hamlin Li wrote: >> Hi, >> Can you help to review this patch? >> Thanks >> >> This is a continuation of work based on [1] by @XiaohongGong, most work was done in that pr. 
In this new pr, just rebased the code in [1], then added some minor changes (renaming of calling method, add libsleef as extra lib in CI cross-build on aarch64 in github workflow); I also tested the combination of following scenarios:
>> * at build time
>>   * with/without sleef
>>   * with/without sve support
>> * at runtime
>>   * with/without sleef
>>   * with/without sve support
>>
>> [1] https://github.com/openjdk/jdk/pull/16234
>>
>> ## Regression Test
>> * test/jdk/jdk/incubator/vector/
>> * test/hotspot/jtreg/compiler/vectorapi/
>>
>> ## Performance Test
>> Previously, @XiaohongGong has shared the data: https://github.com/openjdk/jdk/pull/16234#issuecomment-1767727028
>
> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision:
>
> fix jni includes

src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 8545:

> 8543: char ebuf[1024];
> 8544: char dll_name[JVM_MAXPATHLEN];
> 8545: if (os::dll_locate_lib(dll_name, sizeof(dll_name), Arguments::get_dll_dir(), "vectormath")) {

1. You load the library here.

src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 8569:

> 8567:
> 8568: // Math vector stubs implemented with SVE for scalable vector size.
> 8569: if (UseSVE > 0) {

2. You check UseSVE here. If the library could not be loaded, you would not even reach here.
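The ordering described in the two review notes above can be sketched as follows. This is a hypothetical model, not the stub generator's code: `Loader`, `select_stubs` and the stub names `"neon"` and `"sve"` are illustrative assumptions, and the loader is passed in as a callback so the control flow can be exercised without a real shared library.

```c++
#include <functional>
#include <string>
#include <vector>

// Hypothetical model; names are illustrative and not taken from HotSpot.
// The loader returns true if the named library could be loaded.
using Loader = std::function<bool(const std::string&)>;

std::vector<std::string> select_stubs(const Loader& load_library, bool use_sve) {
  std::vector<std::string> stubs;
  // Step 1: the library is loaded first; on failure nothing below runs.
  if (!load_library("vectormath")) {
    return stubs;
  }
  stubs.push_back("neon");  // assumed baseline stubs
  // Step 2: the SVE flag is consulted only after a successful load.
  if (use_sve) {
    stubs.push_back("sve");
  }
  return stubs;
}
```

The point of the sketch is only the order of the two checks: if loading fails for any reason, the `use_sve` branch is never evaluated.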
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18294#discussion_r1537259422 PR Review Comment: https://git.openjdk.org/jdk/pull/18294#discussion_r1537260140 From ihse at openjdk.org Mon Mar 25 09:27:26 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Mon, 25 Mar 2024 09:27:26 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4] In-Reply-To: <6hpEaT7qf_Mo8eyeDiu6_UvBy0J4TBkdB2aETVt0Axw=.4e2dbd36-d14a-4e53-b6ea-88e60594a139@github.com> References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> <1w4y8kdPpONt2iyrqFdjmpDcqwVBoQfiOgK6J5cmFV4=.9cecfe26-6586-4a53-926c-4203b52260d8@github.com> <9H3PBcqWAZevaB5KU1_Lh5ptj1f365-iLiIyaBMnNyE=.ce6f80fb-4fe8-4cf4-b58e-456238637b1f@github.com> <6hpEaT7qf_Mo8eyeDiu6_UvBy0J4TBkdB2aETVt0Axw=.4e2dbd36-d14a-4e53-b6ea-88e60594a139@github.com> Message-ID: On Fri, 22 Mar 2024 15:30:35 GMT, Hamlin Li wrote: > > But that raises an interesting question. What happens if you try to load a library compiled with `-march=armv8-a+sve` on a non-SVE system? Is the ELF flagged to require SVE so it will fail to load? I'm hoping this is the case -- if so, everything will work as it should in this PR, but I honestly don't know. (After spending like 10 years working with building, I still discover things I need to learn...). > > I think we can handle it, when a jdk built with sve support runs on a non-sve machine, the sve related code will not be executed with the protection of UseSVE vm flag which is detected at runtime startup. You misunderstand me; perhaps I'm not clear enough here. First of all, my question was of a more general nature. Is there such a mechanism in the dynamic linker that protects us from loading libraries that will fail if an ISA extension is used that is missing on the running system? 
Or does the linker just check that the ELF is for the right CPU? Secondly, I assume that libsleef.so proper needs to be compiled with SVE support as well. So if we were to skip the shim vectormath library and load libsleef directly from hotspot, what would happen then? Thirdly, the check with UseSVE happens *after* you load the library. If there is a DL verification, you will not even reach this check, but get a DLL loading failure before that. Sure, regardless of which happens, you will not execute bad code, but I'd like to know which is the case. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-2017553490 From jsjolen at openjdk.org Mon Mar 25 10:13:23 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Mon, 25 Mar 2024 10:13:23 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v5] In-Reply-To: <0G_oRg-MB6aRKXpHJ4ca8lIQ72ZhsA2WBujtJ8BQaD0=.bbbc53c2-cb49-4051-998e-e9e48e4ea516@github.com> References: <-XAziSwGMo20pUAnbdRW1JUk_0ZB-80RVfAHr0iuewE=.bff8f2f7-01e2-46eb-bd4b-1b16fccc6aa1@github.com> <3al4DjsRcIX_qJZNbTGqBDIAOj4bU5l8xpYPHQE8cNM=.7cc0bdfe-c9c8-46ce-ad42-397c61b5a603@github.com> <0G_oRg-MB6aRKXpHJ4ca8lIQ72ZhsA2WBujtJ8BQaD0=.bbbc53c2-cb49-4051-998e-e9e48e4ea516@github.com> Message-ID: On Fri, 22 Mar 2024 16:26:00 GMT, Johan Sjölen wrote: >> src/hotspot/share/nmt/nmtTreap.hpp line 45: >> >>> 43: // the tree which is O(log n) so we are safe from stack overflow. >>> 44: >>> 45: // TreapNode has LEQ nodes on the left, GT nodes on the right. >> >> Simplification: why do we need a separate template parameter for the comparison? Why not just use K's operators < > == ? > > I tend to not use operator-overloading so I didn't think of that as a possibility. One annoying part about such a design is that you can't use pointers to values as keys, they must now be wrapped within their own type (like `StackIndex` does).
Yes, there's a bit of a smell of YAGNI here, but at least `std::set` agrees with me on `Compare` being a template argument. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1537345774 From sjayagond at openjdk.org Mon Mar 25 11:08:25 2024 From: sjayagond at openjdk.org (Sidraya Jayagond) Date: Mon, 25 Mar 2024 11:08:25 GMT Subject: RFR: 8327652: S390x: Implements SLP support [v5] In-Reply-To: References: <9csEuN4OqSaYAoZir8OCvudRKhirezZ9qxNlgXs7tyU=.83707471-1d5e-4a1a-aaf9-cb62169e9c2d@github.com> Message-ID: On Fri, 22 Mar 2024 05:23:36 GMT, Amit Kumar wrote: >> Sidraya Jayagond has updated the pull request incrementally with one additional commit since the last revision: >> >> Address code cleanup review comments > > src/hotspot/cpu/s390/sharedRuntime_s390.cpp line 422: > >> 420: map->set_callee_saved(VMRegImpl::stack2reg(offset>>2), RegisterSaver_LiveVRegs[i].vmreg); >> 421: offset += v_reg_size; >> 422: } > > just in case you like this one. > Suggestion: > > for (int i = 0; i < vregstosave_num; i++, offset += v_reg_size) { > int reg_num = RegisterSaver_LiveVRegs[i].reg_num; > > __ z_vst(as_VectorRegister(reg_num), Address(Z_SP, offset)); > > map->set_callee_saved(VMRegImpl::stack2reg(offset>>2), RegisterSaver_LiveVRegs[i].vmreg); > } This seems more readable code, Fixing it. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1537411943 From jsjolen at openjdk.org Mon Mar 25 11:10:24 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Mon, 25 Mar 2024 11:10:24 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v5] In-Reply-To: <-XAziSwGMo20pUAnbdRW1JUk_0ZB-80RVfAHr0iuewE=.bff8f2f7-01e2-46eb-bd4b-1b16fccc6aa1@github.com> References: <-XAziSwGMo20pUAnbdRW1JUk_0ZB-80RVfAHr0iuewE=.bff8f2f7-01e2-46eb-bd4b-1b16fccc6aa1@github.com> Message-ID: On Wed, 20 Mar 2024 17:06:06 GMT, Johan Sjölen wrote: >> Hi, >> >> This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix.
>> >> ## `MemoryFileTracker` >> >> The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: >> >> ```c++ >> static MemoryFile* make_device(const char* descriptive_name); >> static void free_device(MemoryFile* device); >> >> static void allocate_memory(MemoryFile* device, size_t offset, size_t size, >> MEMFLAGS flag, const NativeCallStack& stack); >> static void free_memory(MemoryFile* device, size_t offset, size_t size); >> >> >> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: >> >> ```c++ >> void ZNMT::reserve(zaddress_unsafe start, size_t size) { >> MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); >> } >> void ZNMT::commit(zoffset offset, size_t size) { >> MemTracker::allocate_memory_in(ZNMT::_device, static_cast(offset), size, mtJavaHeap, CALLER_PC); >> } >> void ZNMT::uncommit(zoffset offset, size_t size) { >> MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); >> } >> >> void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { >> // NMT doesn't track mappings at the moment. >> } >> void ZNMT::unmap(zaddress_unsafe addr, size_t size) { >> // NMT doesn't track mappings at the moment. >> } >> >> >> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. >> >> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: >> >> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. 
Our Treap-based approach in this patch gives a performance bo... > > Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: > > Include os.inline.hpp Or maybe ------------- PR Comment: https://git.openjdk.org/jdk/pull/18289#issuecomment-2017747938 From ksakata at openjdk.org Mon Mar 25 12:30:48 2024 From: ksakata at openjdk.org (Koichi Sakata) Date: Mon, 25 Mar 2024 12:30:48 GMT Subject: RFR: 8325186: JVMTI VirtualThreadGetThreadStateClosure class is no longer used and should be removed Message-ID: This pull request removes an unused class. I have investigated when this class was no longer in use. In the initial push of the virtual thread code (21972f45a64741ec3002499ef532e363cc3e9f93), there was only the definition of this class. I also checked subsequent commits, and it has never been used since the push. ------------- Commit messages: - Remove JVMTI VirtualThreadGetThreadStateClosure class Changes: https://git.openjdk.org/jdk/pull/18341/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18341&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8325186 Stats: 42 lines in 2 files changed: 0 ins; 42 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18341.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18341/head:pull/18341 PR: https://git.openjdk.org/jdk/pull/18341 From sjayagond at openjdk.org Mon Mar 25 12:39:35 2024 From: sjayagond at openjdk.org (Sidraya Jayagond) Date: Mon, 25 Mar 2024 12:39:35 GMT Subject: RFR: 8327652: S390x: Implements SLP support [v6] In-Reply-To: References: Message-ID: > This PR adds SIMD support on s390x.
Sidraya Jayagond has updated the pull request incrementally with one additional commit since the last revision: Fix cosmetic review comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18162/files - new: https://git.openjdk.org/jdk/pull/18162/files/25c06768..ad91ab30 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18162&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18162&range=04-05 Stats: 41 lines in 3 files changed: 3 ins; 8 del; 30 mod Patch: https://git.openjdk.org/jdk/pull/18162.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18162/head:pull/18162 PR: https://git.openjdk.org/jdk/pull/18162 From stuefe at openjdk.org Mon Mar 25 13:22:23 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Mon, 25 Mar 2024 13:22:23 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v5] In-Reply-To: References: <-XAziSwGMo20pUAnbdRW1JUk_0ZB-80RVfAHr0iuewE=.bff8f2f7-01e2-46eb-bd4b-1b16fccc6aa1@github.com> <3al4DjsRcIX_qJZNbTGqBDIAOj4bU5l8xpYPHQE8cNM=.7cc0bdfe-c9c8-46ce-ad42-397c61b5a603@github.com> Message-ID: On Fri, 22 Mar 2024 16:29:30 GMT, Johan Sj?len wrote: >> src/hotspot/share/nmt/vmatree.hpp line 63: >> >>> 61: } >>> 62: }; >>> 63: >> >> In my original proposal, a `MappingState` (https://github.com/tstuefe/jdk/blob/6be830cd2e90a009effb016fbda2e92e1fca8247/src/hotspot/share/nmt/vmaTree.cpp#L106C7-L114) is the combination of ( MEMFLAGS + VMAstate ). >> >> A `MappingStateChange` https://github.com/tstuefe/jdk/blob/6be830cd2e90a009effb016fbda2e92e1fca8247/src/hotspot/share/nmt/vmaTree.cpp#L38-L65 is the combination of two `MappingState`. Its "is_noop()" method compares the combination of both ( MEMFLAGS + VMAstate ) tuples. >> >> This `State` here is neither. It contains only the incoming VMAState, and carries the full information only for the outgoing state. Its ` is_noop` only compares VMAState. 
This renders `is_noop()` pretty useless since it alone is not sufficient when checking if areas for mergeability. Consequently, you then tree-search the adjacent node whenever is_noop is used ( e.g. line 161, `if (is_noop(stA) && Metadata::equals(stA.metadata, leqA_n->val().metadata)) { ` . >> >> I would suggest following my proposal here and carrying the full information for both incoming and outgoing states. Advantages would be: >> - it removes the need for tree searches whenever we check whether neighboring areas can be folded. >> - it would also make the code easier to read and arguably more robust >> - it is arguably more logical - semantically, there is no difference between VMAState and the rest of the state information, all of them are part of the total region state that determines if regions are mergeable. >> >> If you are worried about data size: all the information we need can still be encoded in a 64-bit value, as suggested above. And compared with a single cmpq. 16 bits for MEMFLAGS, 2 bits for VMAState, and 32 bits are more than enough as a callstack id (in fact, 16 bits should be enough). > > Hi, just to be clear: We don't perform a tree search whenever `is_noop` is used, the `leqA_n` is also present in your draft. Still, I think you might be right that having the full state for both in and out is the best way to go, I'll try it out as a simplification. Ah, you are of course right. I remember trying your approach first too: only keeping one state per node (also the outgoing state). Because, like you, I thought that the redundant data are unnecessary. But I ran into problem with rare corner cases, so I switched to the in-out-model. Unfortunately I blank out when trying to remember what the issue actually was. 
Figures :) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1537588056 From sgibbons at openjdk.org Mon Mar 25 14:39:59 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Mon, 25 Mar 2024 14:39:59 GMT Subject: RFR: JDK-8320448 Accelerate IndexOf using AVX2 [v15] In-Reply-To: References: Message-ID: > Re-write the IndexOf code without the use of the pcmpestri instruction, only using AVX2 instructions. This change accelerates String.IndexOf on average 1.3x for AVX2. The benchmark numbers: > > > Benchmark Score Latest > StringIndexOf.advancedWithMediumSub 343.573 317.934 0.925375393x > StringIndexOf.advancedWithShortSub1 1039.081 1053.96 1.014319384x > StringIndexOf.advancedWithShortSub2 55.828 110.541 1.980027943x > StringIndexOf.constantPattern 9.361 11.906 1.271872663x > StringIndexOf.searchCharLongSuccess 4.216 4.218 1.000474383x > StringIndexOf.searchCharMediumSuccess 3.133 3.216 1.02649218x > StringIndexOf.searchCharShortSuccess 3.76 3.761 1.000265957x > StringIndexOf.success 9.186 9.713 1.057369911x > StringIndexOf.successBig 14.341 46.343 3.231504079x > StringIndexOfChar.latin1_AVX2_String 6220.918 12154.52 1.953814533x > StringIndexOfChar.latin1_AVX2_char 5503.556 5540.044 1.006629895x > StringIndexOfChar.latin1_SSE4_String 6978.854 6818.689 0.977049957x > StringIndexOfChar.latin1_SSE4_char 5657.499 5474.624 0.967675646x > StringIndexOfChar.latin1_Short_String 7132.541 6863.359 0.962260014x > StringIndexOfChar.latin1_Short_char 16013.389 16162.437 1.009307711x > StringIndexOfChar.latin1_mixed_String 7386.123 14771.622 1.999915517x > StringIndexOfChar.latin1_mixed_char 9901.671 9782.245 0.987938803 Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Remove infinite loop (used for debugging) ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16753/files - new: https://git.openjdk.org/jdk/pull/16753/files/e079fc12..1cd1b501 Webrevs: - full: 
https://webrevs.openjdk.org/?repo=jdk&pr=16753&range=14 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16753&range=13-14 Stats: 12 lines in 1 file changed: 0 ins; 2 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/16753.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16753/head:pull/16753 PR: https://git.openjdk.org/jdk/pull/16753 From aph at openjdk.org Mon Mar 25 15:10:25 2024 From: aph at openjdk.org (Andrew Haley) Date: Mon, 25 Mar 2024 15:10:25 GMT Subject: RFR: 8328878: Rework [AArch64] Use "dmb.ishst + dmb.ishld" for release barrier In-Reply-To: References: Message-ID: On Mon, 25 Mar 2024 09:01:03 GMT, kuaiwei wrote: > > Are there any possible sequences of `dmb ld; dmb st` that will not be minimized? > > The worst case is "dmb ld; dmb st; dmb ish", that will be merged as "dmb ld; dmb ish". That's not too bad, I suppose, but to me it feels a little like unfinished business. If I were you I'd accumulate `DMB`s as 3 bits (ld, st, and sy) and emit the Right Thing on the first non-DMB instruction. There's the interesting possibility that a DMB might be emitted as the last instruction before the end of the lifetime of an assembler. In that case I guess I'd do some experiments to find out what worked. Also consider a queue? We already accumulate bits of an instruction in `Instruction_aarch64::bits`. Maybe that could be elaborated to defer `DMB` instructions? 
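A minimal sketch of the "accumulate `DMB`s as 3 bits" idea, for illustration only. Everything here is hypothetical: `DeferredDmbEmitter`, `DmbKind`, and the `merge_ld_st` switch are made-up names rather than HotSpot code, and instructions are modeled as plain strings instead of encoded words. Pending barriers are OR-ed into a small bit set and are only materialized when the first non-DMB instruction (or a label bind, or the end of the assembler's lifetime) arrives.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical barrier kinds, one bit each (ld, st, sy).
enum DmbKind : uint8_t { DMB_LD = 1, DMB_ST = 2, DMB_SY = 4 };

// Hypothetical stand-in for an assembler; instructions are just strings.
class DeferredDmbEmitter {
  uint8_t _pending = 0;            // OR of all DmbKind bits not yet emitted
  bool _merge_ld_st;               // assumed policy: turn ld+st into one full barrier
  std::vector<std::string> _out;   // emitted "instructions"

  // Materialize whatever barriers are pending, then clear the set.
  void flush() {
    if (_pending == 0) return;
    bool ld = (_pending & DMB_LD) != 0;
    bool st = (_pending & DMB_ST) != 0;
    bool sy = (_pending & DMB_SY) != 0;
    if (sy || (ld && st && _merge_ld_st)) {
      _out.push_back("dmb ish");   // a full barrier covers ld and st as well
    } else {
      if (ld) _out.push_back("dmb ishld");
      if (st) _out.push_back("dmb ishst");
    }
    _pending = 0;
  }

public:
  explicit DeferredDmbEmitter(bool merge_ld_st) : _merge_ld_st(merge_ld_st) {}

  void dmb(DmbKind kind) { _pending |= kind; }          // defer, do not emit yet
  void emit(const std::string& insn) { flush(); _out.push_back(insn); }
  void bind() { flush(); }                              // a label is a merge boundary
  void finish() { flush(); }                            // end of assembler lifetime
  const std::vector<std::string>& code() const { return _out; }
};

// Usage: "dmb ld; dmb st; dmb ish" followed by a load becomes one barrier.
inline void demo() {
  DeferredDmbEmitter masm(/*merge_ld_st=*/false);
  masm.dmb(DMB_LD);
  masm.dmb(DMB_ST);
  masm.dmb(DMB_SY);
  masm.emit("ldr x0, [x1]");
  assert(masm.code().size() == 2);
  assert(masm.code()[0] == "dmb ish");
}
```

Because nothing is written until the flush, no already-emitted instruction has to be patched or removed, which sidesteps the code-buffer shrinking concern. The open question raised above remains: a real implementation would need a reliable hook for the `finish()` case.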
------------- PR Comment: https://git.openjdk.org/jdk/pull/18467#issuecomment-2018225955 From aph at openjdk.org Mon Mar 25 15:16:25 2024 From: aph at openjdk.org (Andrew Haley) Date: Mon, 25 Mar 2024 15:16:25 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4] In-Reply-To: <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> Message-ID: On Fri, 15 Mar 2024 13:58:05 GMT, Hamlin Li wrote: >> Hi, >> Can you help to review this patch? >> Thanks >> >> This is a continuation of work based on [1] by @XiaohongGong, most work was done in that pr. In this new pr, just rebased the code in [1], then added some minor changes (renaming of calling method, add libsleef as extra lib in CI cross-build on aarch64 in github workflow); I aslo tested the combination of following scenarios: >> * at build time >> * with/without sleef >> * with/without sve support >> * at runtime >> * with/without sleef >> * with/without sve support >> >> [1] https://github.com/openjdk/jdk/pull/16234 >> >> ## Regression Test >> * test/jdk/jdk/incubator/vector/ >> * test/hotspot/jtreg/compiler/vectorapi/ >> >> ## Performance Test >> Previously, @XiaohongGong has shared the data: https://github.com/openjdk/jdk/pull/16234#issuecomment-1767727028 > > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > fix jni includes > > > But that raises an interesting question. What happens if you try to load a library compiled with `-march=armv8-a+sve` on a non-SVE system? Is the ELF flagged to require SVE so it will fail to load? I'm hoping this is the case -- if so, everything will work as it should in this PR, but I honestly don't know. 
(After spending like 10 years working with building, I still discover things I need to learn...). > > > > > > I think we can handle it, when a jdk built with sve support runs on a non-sve machine, the sve related code will not be executed with the protection of UseSVE vm flag which is detected at runtime startup. > > First of all, my question was of a more general nature. Is there such a mechanism in the dynamic linker that protects us from loading libraries that will fail if an ISA extension is used that is missing on the running system? Or do the linker just check that the ELF is for the right CPU? That shouldn't matter to us. If the machine we're running on doesn't support what we need, why should we even try to open it? > Secondly, I assume that libsleef.so proper needs to be compiled with SVE support as well. So if we were to skip the shim vectormath library and load libsleef directly from hotspot, what would happen then? > > Thirdly, the check with UseSVE happens _after_ you load the library. At best that's a waste of time during startup. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-2018241520 From aph at openjdk.org Mon Mar 25 15:28:21 2024 From: aph at openjdk.org (Andrew Haley) Date: Mon, 25 Mar 2024 15:28:21 GMT Subject: RFR: 8328878: Rework [AArch64] Use "dmb.ishst + dmb.ishld" for release barrier In-Reply-To: References: Message-ID: On Mon, 25 Mar 2024 15:08:15 GMT, Andrew Haley wrote: > Also consider a queue? We already accumulate bits of an instruction in `Instruction_aarch64::bits`. Maybe that could be elaborated to defer `DMB` instructions? Or maybe not if that's too complicated. If there's no way to minimize the sequence without a lot of work, I can live with it. Thanks! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18467#issuecomment-2018272546 From aph at openjdk.org Mon Mar 25 16:41:21 2024 From: aph at openjdk.org (Andrew Haley) Date: Mon, 25 Mar 2024 16:41:21 GMT Subject: RFR: 8328878: Rework [AArch64] Use "dmb.ishst + dmb.ishld" for release barrier In-Reply-To: References: Message-ID: On Mon, 25 Mar 2024 06:54:01 GMT, kuaiwei wrote: > The origin patch for https://bugs.openjdk.org/browse/JDK-8324186 has 2 issues: > 1 It show regression in some platform, like Apple silicon in mac os > 2 Can not handle instruction sequence like "dmb.ishld; dmb.ishst; dmb.ishld; dmb.ishld" > > It can be fixed by: > 1 Enable AlwaysMergeDMB by default, only disable it in architecture we can see performance improvement (N1 or N2) > 2 Check the special pattern and merge the subsequent dmb. > > It also fix a bug when code buffer is expanding, st/ld/dmb can not be merged. I added unit tests for these. > > This patch still has a unhandled case. Insts like "dmb.ishld; dmb.ishst; dmb.ish", it will merge the last 2 instructions and can not merge all three. Because when emitting dmb.ish, if merge all previous dmbs, the code buffer will shrink the size. I think it may break some resumption and think it's not a common pattern. Thinking some more, I've had a better idea. Consider a finite state machine. The states would perhaps be {pending_none, pending_ld, pending_st, pending_ldst}. Every `MacroAssembler::emit()`, along with `bind()`, advances the machine, and emits instructions on some state changes. 
This state machine would be used instead of `code()->last_insn()` ------------- PR Comment: https://git.openjdk.org/jdk/pull/18467#issuecomment-2018433225 From cjplummer at openjdk.org Mon Mar 25 19:12:22 2024 From: cjplummer at openjdk.org (Chris Plummer) Date: Mon, 25 Mar 2024 19:12:22 GMT Subject: RFR: 8325186: JVMTI VirtualThreadGetThreadStateClosure class is no longer used and should be removed In-Reply-To: References: Message-ID: On Mon, 18 Mar 2024 08:19:47 GMT, Koichi Sakata wrote: > This pull request removes an unused class. > > I have investigated when this class was no longer in use. In the initial push of the virtual thread code (21972f45a64741ec3002499ef532e363cc3e9f93), there was only the definition of this class. I also checked subsequent commits, and it has never been used since the push. Please update copyright dates. ------------- PR Review: https://git.openjdk.org/jdk/pull/18341#pullrequestreview-1958574710 From pchilanomate at openjdk.org Mon Mar 25 19:57:21 2024 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Mon, 25 Mar 2024 19:57:21 GMT Subject: RFR: 8328758: GetCurrentContendedMonitor function should use JvmtiHandshake In-Reply-To: References: Message-ID: On Fri, 22 Mar 2024 02:44:38 GMT, Serguei Spitsyn wrote: > The internal JVM TI `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. This enhancement is to refactor JVM TI `GetCurrentContendedMonitor` function on the base of `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes. > > Testing: > - Ran mach5 tiers 1-6 Looks good to me. ------------- Marked as reviewed by pchilanomate (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18444#pullrequestreview-1958659004 From mli at openjdk.org Mon Mar 25 20:15:25 2024 From: mli at openjdk.org (Hamlin Li) Date: Mon, 25 Mar 2024 20:15:25 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4] In-Reply-To: References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> <1w4y8kdPpONt2iyrqFdjmpDcqwVBoQfiOgK6J5cmFV4=.9cecfe26-6586-4a53-926c-4203b52260d8@github.com> <9H3PBcqWAZevaB5KU1_Lh5ptj1f365-iLiIyaBMnNyE=.ce6f80fb-4fe8-4cf4-b58e-456238637b1f@github.com> <6hpEaT7qf_Mo8eyeDiu6_UvBy0J4TBkdB2aETVt0Axw=.4e2dbd36-d14a-4e53-b6ea-88e60594a139@github.com> Message-ID: On Mon, 25 Mar 2024 09:24:16 GMT, Magnus Ihse Bursie wrote: > > > But that raises an interesting question. What happens if you try to load a library compiled with `-march=armv8-a+sve` on a non-SVE system? Is the ELF flagged to require SVE so it will fail to load? I'm hoping this is the case -- if so, everything will work as it should in this PR, but I honestly don't know. (After spending like 10 years working with building, I still discover things I need to learn...). > > > > > > I think we can handle it, when a jdk built with sve support runs on a non-sve machine, the sve related code will not be executed with the protection of UseSVE vm flag which is detected at runtime startup. > > You misunderstand me; perhaps I'm not clear enough here. > > First of all, my question was of a more general nature. Is there such a mechanism in the dynamic linker that protects us from loading libraries that will fail if an ISA extension is used that is missing on the running system? Or do the linker just check that the ELF is for the right CPU? IMHO, Ithe dymic linker will not check ISA extension. > > Secondly, I assume that libsleef.so proper needs to be compiled with SVE support as well. 
So if we were to skip the shim vectormath library and load libsleef directly from hotspot, what would happen then? > > Thirdly, the check with UseSVE happens _after_ you load the library. If there is a DL verification, you will not even reach this check, but get a DLL loading failure before that. Sure, regardless of which happens, you will not execute bad code, but I'd like to know which is the case. In that situation, I think the UseSVE check will return false, so the function pointers (e.g. StubRoutines::_vector_f_math[VectorSupport::VEC_SIZE_SCALABLE][op]) are not set when running on a non-SVE machine. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-2018827762 From mli at openjdk.org Mon Mar 25 20:22:28 2024 From: mli at openjdk.org (Hamlin Li) Date: Mon, 25 Mar 2024 20:22:28 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4] In-Reply-To: References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> Message-ID: <8x0E0Qfg4PHbsbcqG1_7kBYsTXCSMwg1XpzsVG9uxFw=.ab29508d-add6-4f4b-969e-b873d2748a49@github.com> On Mon, 25 Mar 2024 09:15:21 GMT, Magnus Ihse Bursie wrote: > > A question about the licensing: how long does it take to finish the legal process of the sleef licence? > > I am not sure this is a relevant question anymore. As I said, the libsleef build system seems virtually impossible to integrate into the OpenJDK build, so I no longer believe this is a way forward. It's not necessary to integrate the libsleef build system into the jdk; we just need to integrate the final sources (generated by sleef cmake) into the jdk. I have tested it, and it works. We still need legal process support from Oracle. But anyway, it's a future incremental solution after this pr, am I right? Or are we going to change direction?
------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-2018839778 From lmesnik at openjdk.org Mon Mar 25 21:00:22 2024 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Mon, 25 Mar 2024 21:00:22 GMT Subject: RFR: 8325186: JVMTI VirtualThreadGetThreadStateClosure class is no longer used and should be removed In-Reply-To: References: Message-ID: On Mon, 18 Mar 2024 08:19:47 GMT, Koichi Sakata wrote: > This pull request removes an unused class. > > I have investigated when this class was no longer in use. In the initial push of the virtual thread code (21972f45a64741ec3002499ef532e363cc3e9f93), there was only the definition of this class. I also checked subsequent commits, and it has never been used since the push. Marked as reviewed by lmesnik (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18341#pullrequestreview-1958772118 From sspitsyn at openjdk.org Mon Mar 25 23:02:24 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Mon, 25 Mar 2024 23:02:24 GMT Subject: RFR: 8325186: JVMTI VirtualThreadGetThreadStateClosure class is no longer used and should be removed In-Reply-To: References: Message-ID: On Mon, 25 Mar 2024 19:09:36 GMT, Chris Plummer wrote: > Please update copyright dates. Need to merge with the main line which has the updated copyright headers. :) ------------- PR Comment: https://git.openjdk.org/jdk/pull/18341#issuecomment-2019064922 From sspitsyn at openjdk.org Mon Mar 25 23:07:23 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Mon, 25 Mar 2024 23:07:23 GMT Subject: RFR: 8325186: JVMTI VirtualThreadGetThreadStateClosure class is no longer used and should be removed In-Reply-To: References: Message-ID: <01c5ldfjKFITgpc_ZDDSe1CftyR7znwVlteYgZK9knQ=.0f251395-d2f9-4bf2-a182-dffa5056be64@github.com> On Mon, 18 Mar 2024 08:19:47 GMT, Koichi Sakata wrote: > This pull request removes an unused class. > > I have investigated when this class was no longer in use. 
In the initial push of the virtual thread code (21972f45a64741ec3002499ef532e363cc3e9f93), there was only the definition of this class. I also checked subsequent commits, and it has never been used since the push. Looks good. Thank you for taking care about it! ------------- Marked as reviewed by sspitsyn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18341#pullrequestreview-1958946924 From sspitsyn at openjdk.org Mon Mar 25 23:35:23 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Mon, 25 Mar 2024 23:35:23 GMT Subject: RFR: 8328758: GetCurrentContendedMonitor function should use JvmtiHandshake In-Reply-To: References: Message-ID: On Fri, 22 Mar 2024 02:44:38 GMT, Serguei Spitsyn wrote: > The internal JVM TI `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. This enhancement is to refactor JVM TI `GetCurrentContendedMonitor` function on the base of `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes. > > Testing: > - Ran mach5 tiers 1-6 Leonid and Patricio, thank you for review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18444#issuecomment-2019103923 From kbarrett at openjdk.org Tue Mar 26 00:00:31 2024 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 26 Mar 2024 00:00:31 GMT Subject: RFR: 8328997: Remove unnecessary template parameter lists in GrowableArray Message-ID: Please review this change to the GrowableArray code to remove unnecessary template parameter lists. They aren't needed, and some may become syntactically invalid in the future. Testing: mach5 tier1. 
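As a small illustration of what "unnecessary template parameter lists" can look like, here is a sketch using a hypothetical `TinyArray` class template. It is not taken from the GrowableArray patch and makes no claim about which declarations that patch touches. Inside a class template the bare class name already refers to the current specialization, and C++20 no longer accepts a name with an explicit parameter list as the declarator of a constructor or destructor.

```cpp
#include <cassert>

// Hypothetical container, standing in for a GrowableArray-like class template.
template <typename E>
class TinyArray {
  E _data[4];
  int _len;

public:
  // With a redundant parameter list this would read
  //   TinyArray<E>() : _data(), _len(0) {}
  // which C++17 accepts but C++20 rejects for constructors and destructors.
  TinyArray() : _data(), _len(0) {}

  // Inside the class, the bare name TinyArray already means TinyArray<E>,
  // so the return type needs no parameter list either.
  TinyArray& append(const E& e) {
    assert(_len < 4);
    _data[_len++] = e;
    return *this;
  }

  int length() const { return _len; }
  const E& at(int i) const { return _data[i]; }
};
```

Dropping the list changes nothing about the generated code; it only removes spelling that newer language modes may warn about or reject.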
------------- Commit messages: - remove unneeded specializations Changes: https://git.openjdk.org/jdk/pull/18480/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18480&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8328997 Stats: 16 lines in 1 file changed: 0 ins; 0 del; 16 mod Patch: https://git.openjdk.org/jdk/pull/18480.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18480/head:pull/18480 PR: https://git.openjdk.org/jdk/pull/18480 From duke at openjdk.org Tue Mar 26 02:15:21 2024 From: duke at openjdk.org (kuaiwei) Date: Tue, 26 Mar 2024 02:15:21 GMT Subject: RFR: 8328878: Rework [AArch64] Use "dmb.ishst + dmb.ishld" for release barrier In-Reply-To: References: Message-ID: <1CtRC3gqDiwRIuv5zMZNoB9Z1KfRNIKitfHV2lzNzKQ=.0ded6bcb-a9cc-455c-a1c6-e29c022bcda0@github.com> On Mon, 25 Mar 2024 06:54:01 GMT, kuaiwei wrote: > The origin patch for https://bugs.openjdk.org/browse/JDK-8324186 has 2 issues: > 1 It show regression in some platform, like Apple silicon in mac os > 2 Can not handle instruction sequence like "dmb.ishld; dmb.ishst; dmb.ishld; dmb.ishld" > > It can be fixed by: > 1 Enable AlwaysMergeDMB by default, only disable it in architecture we can see performance improvement (N1 or N2) > 2 Check the special pattern and merge the subsequent dmb. > > It also fix a bug when code buffer is expanding, st/ld/dmb can not be merged. I added unit tests for these. > > This patch still has a unhandled case. Insts like "dmb.ishld; dmb.ishst; dmb.ish", it will merge the last 2 instructions and can not merge all three. Because when emitting dmb.ish, if merge all previous dmbs, the code buffer will shrink the size. I think it may break some resumption and think it's not a common pattern. Hi Andrew, thanks for your suggestions. I will check them and refine the implementation. The difficult part is we need finalize the "dmb" at a non-dmb instruction. It looks Instruction_aarch64 is a good choice because it's embeded in all instructions. 
Thanks ------------- PR Comment: https://git.openjdk.org/jdk/pull/18467#issuecomment-2019259373 From sspitsyn at openjdk.org Tue Mar 26 02:24:25 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 26 Mar 2024 02:24:25 GMT Subject: Integrated: 8328758: GetCurrentContendedMonitor function should use JvmtiHandshake In-Reply-To: References: Message-ID: On Fri, 22 Mar 2024 02:44:38 GMT, Serguei Spitsyn wrote: > The internal JVM TI `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes were introduced in the JDK 22 to unify/simplify the JVM TI functions supporting implementation of the virtual threads. This enhancement is to refactor JVM TI `GetCurrentContendedMonitor` function on the base of `JvmtiHandshake` and `JvmtiUnitedHandshakeClosure` classes. > > Testing: > - Ran mach5 tiers 1-6 This pull request has now been integrated. Changeset: 5f7432f7 Author: Serguei Spitsyn URL: https://git.openjdk.org/jdk/commit/5f7432f7b1ca52941b717fd35a929c91fbdd276d Stats: 57 lines in 3 files changed: 13 ins; 32 del; 12 mod 8328758: GetCurrentContendedMonitor function should use JvmtiHandshake Reviewed-by: lmesnik, pchilanomate ------------- PR: https://git.openjdk.org/jdk/pull/18444 From amitkumar at openjdk.org Tue Mar 26 02:45:28 2024 From: amitkumar at openjdk.org (Amit Kumar) Date: Tue, 26 Mar 2024 02:45:28 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission [v10] In-Reply-To: References: Message-ID: On Fri, 22 Mar 2024 18:19:39 GMT, Cesar Soares Lucas wrote: >> # Description >> >> Please review this PR with a patch to re-use the same C2_MacroAssembler object to emit all instructions in the same compilation unit. >> >> Overall, the change is pretty simple. However, due to the renaming of the variable to access C2_MacroAssembler, from `_masm.` to `masm->`, and also some method prototype changes, the patch became quite large. >> >> # Help Needed for Testing >> >> I don't have access to all platforms necessary to test this. 
I hope some other folks can help with testing on `S390`, `RISC-V` and `PPC`. > > Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: > > Fix AArch64 build & improve comment about InstructionMark I have done testing on s390 (z15) with: `{fastdebug, slowdebug, release} X {tier1}` and haven't seen new failure appearing due to this change. Rest of review leave to @RealLucy :-) ------------- Marked as reviewed by amitkumar (Committer). PR Review: https://git.openjdk.org/jdk/pull/16484#pullrequestreview-1959188734 From ksakata at openjdk.org Tue Mar 26 04:15:40 2024 From: ksakata at openjdk.org (Koichi Sakata) Date: Tue, 26 Mar 2024 04:15:40 GMT Subject: RFR: 8325186: JVMTI VirtualThreadGetThreadStateClosure class is no longer used and should be removed [v2] In-Reply-To: References: Message-ID: > This pull request removes an unused class. > > I have investigated when this class was no longer in use. In the initial push of the virtual thread code (21972f45a64741ec3002499ef532e363cc3e9f93), there was only the definition of this class. I also checked subsequent commits, and it has never been used since the push. 
Koichi Sakata has updated the pull request incrementally with one additional commit since the last revision: Update copyright year ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18341/files - new: https://git.openjdk.org/jdk/pull/18341/files/09af50a0..b37187ee Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18341&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18341&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18341.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18341/head:pull/18341 PR: https://git.openjdk.org/jdk/pull/18341 From ksakata at openjdk.org Tue Mar 26 04:20:33 2024 From: ksakata at openjdk.org (Koichi Sakata) Date: Tue, 26 Mar 2024 04:20:33 GMT Subject: RFR: 8325186: JVMTI VirtualThreadGetThreadStateClosure class is no longer used and should be removed [v2] In-Reply-To: References: Message-ID: On Tue, 26 Mar 2024 04:15:40 GMT, Koichi Sakata wrote: >> This pull request removes an unused class. >> >> I have investigated when this class was no longer in use. In the initial push of the virtual thread code (21972f45a64741ec3002499ef532e363cc3e9f93), there was only the definition of this class. I also checked subsequent commits, and it has never been used since the push. > > Koichi Sakata has updated the pull request incrementally with one additional commit since the last revision: > > Update copyright year Thank you for reviewing this pull request. I've updated the copyright year. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18341#issuecomment-2019353342 From ksakata at openjdk.org Tue Mar 26 04:20:34 2024 From: ksakata at openjdk.org (Koichi Sakata) Date: Tue, 26 Mar 2024 04:20:34 GMT Subject: Integrated: 8325186: JVMTI VirtualThreadGetThreadStateClosure class is no longer used and should be removed In-Reply-To: References: Message-ID: <1vYDqDFfUrDSZEoyaAlFO4WFJlJ7Beb5fGZRIgiaLdg=.239c14b3-9b55-494b-8f08-c7fddab0c475@github.com> On Mon, 18 Mar 2024 08:19:47 GMT, Koichi Sakata wrote: > This pull request removes an unused class. > > I have investigated when this class was no longer in use. In the initial push of the virtual thread code (21972f45a64741ec3002499ef532e363cc3e9f93), there was only the definition of this class. I also checked subsequent commits, and it has never been used since the push. This pull request has now been integrated. Changeset: 5d19d155 Author: Koichi Sakata URL: https://git.openjdk.org/jdk/commit/5d19d15517f08e13dc4b2025bf289091e47fa55b Stats: 42 lines in 2 files changed: 0 ins; 42 del; 0 mod 8325186: JVMTI VirtualThreadGetThreadStateClosure class is no longer used and should be removed Reviewed-by: lmesnik, sspitsyn ------------- PR: https://git.openjdk.org/jdk/pull/18341 From ihse at openjdk.org Tue Mar 26 08:13:25 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Tue, 26 Mar 2024 08:13:25 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4] In-Reply-To: <8x0E0Qfg4PHbsbcqG1_7kBYsTXCSMwg1XpzsVG9uxFw=.ab29508d-add6-4f4b-969e-b873d2748a49@github.com> References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> <8x0E0Qfg4PHbsbcqG1_7kBYsTXCSMwg1XpzsVG9uxFw=.ab29508d-add6-4f4b-969e-b873d2748a49@github.com> Message-ID: On Mon, 25 Mar 2024 20:19:04 GMT, Hamlin Li wrote: > It's not necessary to integrate libsleef build 
system into jdk, we just need to integrate the final sources (generated by sleef cmake) into jdk. I have tested it, it works. That is an interesting approach. It will raise the bar for updating the sleef sources, but if we document the process carefully I guess that will not be an issue -- all imported third party sources need some kind of curation when re-importing. I just had a quick look at the sleef sources, and it seems a bit more complicated. Apart from the generated source code, they also compile the same source code file multiple times with different defines on the command line. This is not something we have support for in the build system, but then again it is not nearly as difficult to solve as the code generation part, so it can certainly be handled in some way or another. So yes, I agree that using this approach it might actually be feasible to include the libsleef sources in the JDK repo. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-2019779075 From ihse at openjdk.org Tue Mar 26 08:21:26 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Tue, 26 Mar 2024 08:21:26 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4] In-Reply-To: <8x0E0Qfg4PHbsbcqG1_7kBYsTXCSMwg1XpzsVG9uxFw=.ab29508d-add6-4f4b-969e-b873d2748a49@github.com> References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> <8x0E0Qfg4PHbsbcqG1_7kBYsTXCSMwg1XpzsVG9uxFw=.ab29508d-add6-4f4b-969e-b873d2748a49@github.com> Message-ID: On Mon, 25 Mar 2024 20:19:04 GMT, Hamlin Li wrote: > But anyway, it's a future incremental solution after this pr, am I right? Or are we going to change direction? I'm honestly not sure what is the right way forward. It seems we all agree that this PR is not the end solution we want.
So the question is, is it worth integrating it while waiting for the correct final solution?

Arguments for integrating it:
* The functionality can be tested
* The source changes get upstreamed, minimizing the risk of future merge conflicts

Arguments against integrating it:
* In practice, this functionality will be missing from the build and no-one is going to be testing it
* It is hard to tell if the functionality is enabled or not
* The "glue" code for enabling/disabling this functionality is still not 100%
* It might not be worth putting effort into the "glue" code, if this is going to be radically changed later on anyway

I hope that was a decent summary of what I and @theRealAph have been saying.

If you compare these two lists, it seems like the case for integrating this PR now is rather weak. So maybe it is better to sit on this for a while and await the proper solution?

If you disagree, I think you need to explain once more, and more clearly, what you think the gains would be of having this integrated as-is.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-2019791355

From jsjolen at openjdk.org  Tue Mar 26 11:32:16 2024
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Tue, 26 Mar 2024 11:32:16 GMT
Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v6]
In-Reply-To:
References:
Message-ID:

> Hi,
>
> This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix.
>
> ## `MemoryFileTracker`
>
> The `MemoryFileTracker` adds the ability to add new virtual memory address spaces to NMT and to commit memory to these. The basic API is:
>
> ```c++
> static MemoryFile* make_device(const char* descriptive_name);
> static void free_device(MemoryFile* device);
>
> static void allocate_memory(MemoryFile* device, size_t offset, size_t size,
>                             MEMFLAGS flag, const NativeCallStack& stack);
> static void free_memory(MemoryFile* device, size_t offset, size_t size);
> ```
>
> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does:
>
> ```c++
> void ZNMT::reserve(zaddress_unsafe start, size_t size) {
>   MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap);
> }
> void ZNMT::commit(zoffset offset, size_t size) {
>   MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC);
> }
> void ZNMT::uncommit(zoffset offset, size_t size) {
>   MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size);
> }
>
> void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) {
>   // NMT doesn't track mappings at the moment.
> }
> void ZNMT::unmap(zaddress_unsafe addr, size_t size) {
>   // NMT doesn't track mappings at the moment.
> }
> ```
>
> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory.
>
> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are:
>
> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it.
Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this...

Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision:

 - StefanK's suggested style changes
 - Add Red Hat as contributor to VMATree

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/18289/files
  - new: https://git.openjdk.org/jdk/pull/18289/files/1928b9da..466ddc6e

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=05
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=04-05

  Stats: 21 lines in 4 files changed: 12 ins; 7 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/18289.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289

PR: https://git.openjdk.org/jdk/pull/18289

From jsjolen at openjdk.org  Tue Mar 26 11:32:17 2024
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Tue, 26 Mar 2024 11:32:17 GMT
Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v2]
In-Reply-To:
References:
Message-ID:

On Wed, 20 Mar 2024 15:46:32 GMT, Stefan Karlsson wrote:

>> Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Include the mutex header
>
> Nice! I went over the ZGC files and requested a few style updates.

@stefank, thank you for the style change suggestions. They've now been applied.
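
For readers who have not met the data structure mentioned in the PR description: a treap is a binary search tree on keys that is also a heap on randomly chosen priorities, which keeps it balanced in expectation. The following is only a minimal standalone sketch of that idea, not the VMATree from this PR. Every name in it (`TreapNode`, `Treap`, `find_leq`) is hypothetical, it uses the C++ standard library, which HotSpot code generally avoids, and the "greatest key not above an address" lookup is merely an assumption about the kind of query an address-range tracker might need.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>

// Hypothetical node type: 'key' could be the address where a region's state changes.
struct TreapNode {
  size_t     key;
  uint64_t   priority;          // random; the tree is a max-heap on this field
  TreapNode* left  = nullptr;
  TreapNode* right = nullptr;
  TreapNode(size_t k, uint64_t p) : key(k), priority(p) {}
};

class Treap {
  TreapNode*      _root = nullptr;
  std::mt19937_64 _rng{42};     // fixed seed so the sketch behaves deterministically

  // Rotations preserve the search-tree order while moving a child above its parent.
  static TreapNode* rotate_right(TreapNode* n) {
    TreapNode* l = n->left;
    assert(l != nullptr);
    n->left = l->right;
    l->right = n;
    return l;
  }
  static TreapNode* rotate_left(TreapNode* n) {
    TreapNode* r = n->right;
    assert(r != nullptr);
    n->right = r->left;
    r->left = n;
    return r;
  }

  // Ordinary BST insert, then rotate upwards while the heap order is violated.
  TreapNode* insert(TreapNode* n, size_t key) {
    if (n == nullptr) return new TreapNode(key, _rng());
    if (key < n->key) {
      n->left = insert(n->left, key);
      if (n->left->priority > n->priority) n = rotate_right(n);
    } else if (key > n->key) {
      n->right = insert(n->right, key);
      if (n->right->priority > n->priority) n = rotate_left(n);
    }
    return n;                   // an already present key is left untouched
  }

  static void destroy(TreapNode* n) {
    if (n == nullptr) return;
    destroy(n->left);
    destroy(n->right);
    delete n;
  }

public:
  Treap() = default;
  Treap(const Treap&) = delete;
  Treap& operator=(const Treap&) = delete;
  ~Treap() { destroy(_root); }

  void insert(size_t key) { _root = insert(_root, key); }

  // Returns the node with the greatest key <= addr, or nullptr if there is none.
  const TreapNode* find_leq(size_t addr) const {
    const TreapNode* best = nullptr;
    for (const TreapNode* n = _root; n != nullptr; ) {
      if (n->key <= addr) { best = n; n = n->right; }
      else                { n = n->left; }
    }
    return best;
  }
};
```

Compared with walking a sorted linked list, both `insert` and `find_leq` here touch only one root-to-leaf path, which is where the expected logarithmic cost comes from.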
-------------

PR Comment: https://git.openjdk.org/jdk/pull/18289#issuecomment-2020175830

From jsjolen at openjdk.org  Tue Mar 26 11:35:22 2024
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Tue, 26 Mar 2024 11:35:22 GMT
Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v5]
In-Reply-To: <3al4DjsRcIX_qJZNbTGqBDIAOj4bU5l8xpYPHQE8cNM=.7cc0bdfe-c9c8-46ce-ad42-397c61b5a603@github.com>
References: <-XAziSwGMo20pUAnbdRW1JUk_0ZB-80RVfAHr0iuewE=.bff8f2f7-01e2-46eb-bd4b-1b16fccc6aa1@github.com>
 <3al4DjsRcIX_qJZNbTGqBDIAOj4bU5l8xpYPHQE8cNM=.7cc0bdfe-c9c8-46ce-ad42-397c61b5a603@github.com>
Message-ID:

On Fri, 22 Mar 2024 13:18:38 GMT, Thomas Stuefe wrote:

>Also, does the TreapCHeap friend not need to be declared as template?

`clangd` complains about the missing template params if I remove them, as it claims that there are two conflicting definitions of `TreapCHeap` otherwise.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1539050768

From iwalulya at openjdk.org  Tue Mar 26 11:41:21 2024
From: iwalulya at openjdk.org (Ivan Walulya)
Date: Tue, 26 Mar 2024 11:41:21 GMT
Subject: RFR: 8328997: Remove unnecessary template parameter lists in GrowableArray
In-Reply-To:
References:
Message-ID:

On Mon, 25 Mar 2024 23:55:43 GMT, Kim Barrett wrote:

> Please review this change to the GrowableArray code to remove unnecessary
> template parameter lists. They aren't needed, and some may become
> syntactically invalid in the future.
>
> Testing: mach5 tier1.

Marked as reviewed by iwalulya (Reviewer).
-------------

PR Review: https://git.openjdk.org/jdk/pull/18480#pullrequestreview-1960101674

From aph at openjdk.org  Tue Mar 26 12:05:25 2024
From: aph at openjdk.org (Andrew Haley)
Date: Tue, 26 Mar 2024 12:05:25 GMT
Subject: RFR: JDK-8180450: secondary_super_cache does not scale well [v5]
In-Reply-To:
References:
Message-ID:

On Fri, 22 Mar 2024 18:42:23 GMT, Andrew Haley wrote:

> @theRealAph I applied your patch and ran SA tests. I did not see the failures you mentioned to me. Is there anything else special I need to do?

Yes. If you uncomment at one place, that will cause some SA failures:

index 673be99f3f2..a9be5eaaec3 100644
--- a/src/hotspot/share/classfile/classFileParser.cpp
+++ b/src/hotspot/share/classfile/classFileParser.cpp
@@ -5997,10 +5997,10 @@ void ClassFileParser::post_process_parsed_stream(const ClassFileStream* const st
   // This would be very convenient, and it would allow us to save the
   // secondary supers in hashed order, but some SA tests know the
   // order of secondary supers.
-  // if (HashSecondarySupers) {
-  //   // Put the transitive interfaces into hash order
-  //   Klass::hash_secondary_supers(_transitive_interfaces, /*rewrite*/true);
-  // }
+  if (HashSecondarySupers) {
+    // Put the transitive interfaces into hash order
+    Klass::hash_secondary_supers(_transitive_interfaces, /*rewrite*/true);
+  }
 
   // sort methods
   _method_ordering = sort_methods(_methods);

[runtime/LoaderConstraints/itableICCE/Test.java]
[serviceability/dcmd/vm/ClassHierarchyTest.java]: Test of diagnostic command VM.class_hierarchy

The only difference is the order of the interfaces. I think the tests may be invalid, but I'm not sure.
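
To make the "only the order differs" point concrete, here is a small standalone sketch of what reordering a list by hash slot does: the set of elements is unchanged, but a test that compares the printed sequence against the declaration order can still fail. This is an illustration under assumptions only. `hash_slot` is a hypothetical FNV-1a based stand-in folded into 64 slots, and `in_hash_order` is a made-up helper; neither reflects how `Klass::hash_secondary_supers` actually computes or applies its hashes.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical hash slot for a class name (FNV-1a folded into 64 slots).
// HotSpot's real hashing of secondary supers is not modelled here.
inline uint32_t hash_slot(const std::string& name) {
  uint32_t h = 2166136261u;            // FNV-1a offset basis
  for (unsigned char c : name) {
    h ^= c;
    h *= 16777619u;                    // FNV-1a prime
  }
  return h % 64;
}

// Returns the same elements, stably reordered by ascending hash slot.
inline std::vector<std::string> in_hash_order(std::vector<std::string> supers) {
  std::stable_sort(supers.begin(), supers.end(),
                   [](const std::string& a, const std::string& b) {
                     return hash_slot(a) < hash_slot(b);
                   });
  return supers;
}
```

A check that treats the interfaces as a set (for example with `std::is_permutation`) would accept the reordered list, whereas an element-by-element comparison against the declared order would not.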
------------- PR Comment: https://git.openjdk.org/jdk/pull/18309#issuecomment-2020242215 From stefank at openjdk.org Tue Mar 26 13:04:23 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 26 Mar 2024 13:04:23 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v6] In-Reply-To: References: Message-ID: On Tue, 26 Mar 2024 11:32:16 GMT, Johan Sj?len wrote: >> Hi, >> >> This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix. >> >> ## `MemoryFileTracker` >> >> The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: >> >> ```c++ >> static MemoryFile* make_device(const char* descriptive_name); >> static void free_device(MemoryFile* device); >> >> static void allocate_memory(MemoryFile* device, size_t offset, size_t size, >> MEMFLAGS flag, const NativeCallStack& stack); >> static void free_memory(MemoryFile* device, size_t offset, size_t size); >> >> >> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: >> >> ```c++ >> void ZNMT::reserve(zaddress_unsafe start, size_t size) { >> MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); >> } >> void ZNMT::commit(zoffset offset, size_t size) { >> MemTracker::allocate_memory_in(ZNMT::_device, static_cast(offset), size, mtJavaHeap, CALLER_PC); >> } >> void ZNMT::uncommit(zoffset offset, size_t size) { >> MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); >> } >> >> void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { >> // NMT doesn't track mappings at the moment. 
>> } >> void ZNMT::unmap(zaddress_unsafe addr, size_t size) { >> // NMT doesn't track mappings at the moment. >> } >> >> >> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. >> >> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: >> >> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. Our Treap-based approach in this patch gives a performance bo... > > Johan Sj?len has updated the pull request incrementally with two additional commits since the last revision: > > - StefanK's suggested style changes > - Add Red Hat as contributor to VMATree Changes requested by stefank (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18289#pullrequestreview-1960295305 From stefank at openjdk.org Tue Mar 26 13:04:25 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 26 Mar 2024 13:04:25 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v2] In-Reply-To: References: Message-ID: On Wed, 20 Mar 2024 15:43:25 GMT, Stefan Karlsson wrote: >> Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: >> >> Include the mutex header > > src/hotspot/share/gc/z/zNMT.cpp line 43: > >> 41: } >> 42: void ZNMT::uncommit(zoffset offset, size_t size) { >> 43: MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); > > Suggestion: > > MemTracker::free_memory_in(ZNMT::_device, untype(offset), size); Didn't this suggestion work? 
If not, maybe you could at least change it to `(size_t)untype(offset)`. We like to try to avoid direct c-style casts of ZGC's typed values, and instead use the functions that perform extra checks. > src/hotspot/share/gc/z/zNMT.hpp line 34: > >> 32: #include "utilities/globalDefinitions.hpp" >> 33: #include "utilities/nativeCallStack.hpp" >> 34: #include "nmt/memTracker.hpp" > > Sort order You didn't fix the sort order here. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1539179386 PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1539179855 From jsjolen at openjdk.org Tue Mar 26 13:54:38 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Tue, 26 Mar 2024 13:54:38 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v7] In-Reply-To: References: Message-ID: > Hi, > > This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix. 
> > ## `MemoryFileTracker` > > The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: > > ```c++ > static MemoryFile* make_device(const char* descriptive_name); > static void free_device(MemoryFile* device); > > static void allocate_memory(MemoryFile* device, size_t offset, size_t size, > MEMFLAGS flag, const NativeCallStack& stack); > static void free_memory(MemoryFile* device, size_t offset, size_t size); > > > It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: > > ```c++ > void ZNMT::reserve(zaddress_unsafe start, size_t size) { > MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); > } > void ZNMT::commit(zoffset offset, size_t size) { > MemTracker::allocate_memory_in(ZNMT::_device, static_cast(offset), size, mtJavaHeap, CALLER_PC); > } > void ZNMT::uncommit(zoffset offset, size_t size) { > MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); > } > > void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { > // NMT doesn't track mappings at the moment. > } > void ZNMT::unmap(zaddress_unsafe addr, size_t size) { > // NMT doesn't track mappings at the moment. > } > > > As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. > > This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: > > 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. 
Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this... Johan Sj?len has updated the pull request incrementally with three additional commits since the last revision: - Merge remote-tracking branch 'origin/nmt-physical-device' into nmt-physical-device - Update src/hotspot/share/gc/z/zNMT.cpp Co-authored-by: Stefan Karlsson - Fix the sorting order for real this time. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18289/files - new: https://git.openjdk.org/jdk/pull/18289/files/466ddc6e..8e1437ca Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=05-06 Stats: 5 lines in 2 files changed: 2 ins; 2 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18289.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289 PR: https://git.openjdk.org/jdk/pull/18289 From jsjolen at openjdk.org Tue Mar 26 13:54:38 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Tue, 26 Mar 2024 13:54:38 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v6] In-Reply-To: References: Message-ID: On Tue, 26 Mar 2024 11:32:16 GMT, Johan Sj?len wrote: >> Hi, >> >> This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix. 
>> >> ## `MemoryFileTracker` >> >> The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: >> >> ```c++ >> static MemoryFile* make_device(const char* descriptive_name); >> static void free_device(MemoryFile* device); >> >> static void allocate_memory(MemoryFile* device, size_t offset, size_t size, >> MEMFLAGS flag, const NativeCallStack& stack); >> static void free_memory(MemoryFile* device, size_t offset, size_t size); >> >> >> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: >> >> ```c++ >> void ZNMT::reserve(zaddress_unsafe start, size_t size) { >> MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); >> } >> void ZNMT::commit(zoffset offset, size_t size) { >> MemTracker::allocate_memory_in(ZNMT::_device, static_cast(offset), size, mtJavaHeap, CALLER_PC); >> } >> void ZNMT::uncommit(zoffset offset, size_t size) { >> MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); >> } >> >> void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { >> // NMT doesn't track mappings at the moment. >> } >> void ZNMT::unmap(zaddress_unsafe addr, size_t size) { >> // NMT doesn't track mappings at the moment. >> } >> >> >> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. >> >> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: >> >> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. 
Our Treap-based approach in this patch gives a performance bo... > > Johan Sj?len has updated the pull request incrementally with two additional commits since the last revision: > > - StefanK's suggested style changes > - Add Red Hat as contributor to VMATree OK, *now* it should be fixed. Sorry about that. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18289#issuecomment-2020487298 From jsjolen at openjdk.org Tue Mar 26 14:33:40 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Tue, 26 Mar 2024 14:33:40 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v8] In-Reply-To: References: Message-ID: > Hi, > > This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix. 
> > ## `MemoryFileTracker` > > The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: > > ```c++ > static MemoryFile* make_device(const char* descriptive_name); > static void free_device(MemoryFile* device); > > static void allocate_memory(MemoryFile* device, size_t offset, size_t size, > MEMFLAGS flag, const NativeCallStack& stack); > static void free_memory(MemoryFile* device, size_t offset, size_t size); > > > It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: > > ```c++ > void ZNMT::reserve(zaddress_unsafe start, size_t size) { > MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); > } > void ZNMT::commit(zoffset offset, size_t size) { > MemTracker::allocate_memory_in(ZNMT::_device, static_cast(offset), size, mtJavaHeap, CALLER_PC); > } > void ZNMT::uncommit(zoffset offset, size_t size) { > MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); > } > > void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { > // NMT doesn't track mappings at the moment. > } > void ZNMT::unmap(zaddress_unsafe addr, size_t size) { > // NMT doesn't track mappings at the moment. > } > > > As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. > > This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: > > 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. 
Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this... Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: Keep data in both arrows ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18289/files - new: https://git.openjdk.org/jdk/pull/18289/files/8e1437ca..63e2c1a1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=06-07 Stats: 60 lines in 2 files changed: 17 ins; 5 del; 38 mod Patch: https://git.openjdk.org/jdk/pull/18289.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289 PR: https://git.openjdk.org/jdk/pull/18289 From jsjolen at openjdk.org Tue Mar 26 14:45:42 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Tue, 26 Mar 2024 14:45:42 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v9] In-Reply-To: References: Message-ID: > Hi, > > This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix. 
> > ## `MemoryFileTracker` > > The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: > > ```c++ > static MemoryFile* make_device(const char* descriptive_name); > static void free_device(MemoryFile* device); > > static void allocate_memory(MemoryFile* device, size_t offset, size_t size, > MEMFLAGS flag, const NativeCallStack& stack); > static void free_memory(MemoryFile* device, size_t offset, size_t size); > > > It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: > > ```c++ > void ZNMT::reserve(zaddress_unsafe start, size_t size) { > MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); > } > void ZNMT::commit(zoffset offset, size_t size) { > MemTracker::allocate_memory_in(ZNMT::_device, static_cast(offset), size, mtJavaHeap, CALLER_PC); > } > void ZNMT::uncommit(zoffset offset, size_t size) { > MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); > } > > void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { > // NMT doesn't track mappings at the moment. > } > void ZNMT::unmap(zaddress_unsafe addr, size_t size) { > // NMT doesn't track mappings at the moment. > } > > > As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. > > This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: > > 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. 
Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this... Johan Sj?len has updated the pull request incrementally with two additional commits since the last revision: - Complex is_noop() but no merge - Rename InOut to StateType ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18289/files - new: https://git.openjdk.org/jdk/pull/18289/files/63e2c1a1..d332066f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=07-08 Stats: 37 lines in 2 files changed: 10 ins; 0 del; 27 mod Patch: https://git.openjdk.org/jdk/pull/18289.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289 PR: https://git.openjdk.org/jdk/pull/18289 From jsjolen at openjdk.org Tue Mar 26 14:49:40 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Tue, 26 Mar 2024 14:49:40 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v10] In-Reply-To: References: Message-ID: > Hi, > > This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix. 
> > ## `MemoryFileTracker` > > The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is: > > ```c++ > static MemoryFile* make_device(const char* descriptive_name); > static void free_device(MemoryFile* device); > > static void allocate_memory(MemoryFile* device, size_t offset, size_t size, > MEMFLAGS flag, const NativeCallStack& stack); > static void free_memory(MemoryFile* device, size_t offset, size_t size); > > > It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does: > > ```c++ > void ZNMT::reserve(zaddress_unsafe start, size_t size) { > MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap); > } > void ZNMT::commit(zoffset offset, size_t size) { > MemTracker::allocate_memory_in(ZNMT::_device, static_cast(offset), size, mtJavaHeap, CALLER_PC); > } > void ZNMT::uncommit(zoffset offset, size_t size) { > MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size); > } > > void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) { > // NMT doesn't track mappings at the moment. > } > void ZNMT::unmap(zaddress_unsafe addr, size_t size) { > // NMT doesn't track mappings at the moment. > } > > > As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory. > > This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are: > > 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. 
Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this... Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: Fix file ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18289/files - new: https://git.openjdk.org/jdk/pull/18289/files/d332066f..63e894bd Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=08-09 Stats: 4 lines in 1 file changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/18289.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289 PR: https://git.openjdk.org/jdk/pull/18289 From bulasevich at openjdk.org Tue Mar 26 15:00:27 2024 From: bulasevich at openjdk.org (Boris Ulasevich) Date: Tue, 26 Mar 2024 15:00:27 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission [v10] In-Reply-To: References: Message-ID: On Fri, 22 Mar 2024 18:19:39 GMT, Cesar Soares Lucas wrote: >> # Description >> >> Please review this PR with a patch to re-use the same C2_MacroAssembler object to emit all instructions in the same compilation unit. >> >> Overall, the change is pretty simple. However, due to the renaming of the variable to access C2_MacroAssembler, from `_masm.` to `masm->`, and also some method prototype changes, the patch became quite large. >> >> # Help Needed for Testing >> >> I don't have access to all platforms necessary to test this. I hope some other folks can help with testing on `S390`, `RISC-V` and `PPC`. > > Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: > > Fix AArch64 build & improve comment about InstructionMark Hi! Your change has conflict with current jdk master. Please rebase. 
-------------

PR Comment: https://git.openjdk.org/jdk/pull/16484#issuecomment-2020665067

From jsjolen at openjdk.org Tue Mar 26 15:07:53 2024
From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=)
Date: Tue, 26 Mar 2024 15:07:53 GMT
Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v11]
In-Reply-To:
References:
Message-ID:

> Hi,
>
> This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix.
>
> ## `MemoryFileTracker`
>
> The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is:
>
> ```c++
> static MemoryFile* make_device(const char* descriptive_name);
> static void free_device(MemoryFile* device);
>
> static void allocate_memory(MemoryFile* device, size_t offset, size_t size,
>                             MEMFLAGS flag, const NativeCallStack& stack);
> static void free_memory(MemoryFile* device, size_t offset, size_t size);
> ```
>
> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does:
>
> ```c++
> void ZNMT::reserve(zaddress_unsafe start, size_t size) {
>   MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap);
> }
> void ZNMT::commit(zoffset offset, size_t size) {
>   MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC);
> }
> void ZNMT::uncommit(zoffset offset, size_t size) {
>   MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size);
> }
>
> void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) {
>   // NMT doesn't track mappings at the moment.
> }
> void ZNMT::unmap(zaddress_unsafe addr, size_t size) {
>   // NMT doesn't track mappings at the moment.
> }
> ```
>
> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory.
>
> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are:
>
> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it. Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this...

Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision:

  Take over leqA's out state correctly

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/18289/files
  - new: https://git.openjdk.org/jdk/pull/18289/files/63e894bd..d0983708

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=10
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=09-10

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/18289.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289

PR: https://git.openjdk.org/jdk/pull/18289

From sjayagond at openjdk.org Tue Mar 26 15:10:37 2024
From: sjayagond at openjdk.org (Sidraya Jayagond)
Date: Tue, 26 Mar 2024 15:10:37 GMT
Subject: RFR: 8327652: S390x: Implements SLP support [v7]
In-Reply-To:
References:
Message-ID:

> This PR Adds SIMD support on s390x.
Sidraya Jayagond has updated the pull request incrementally with one additional commit since the last revision:

  PopCountVI supported by z14 onwards.

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/18162/files
  - new: https://git.openjdk.org/jdk/pull/18162/files/ad91ab30..042c3966

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=18162&range=06
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18162&range=05-06

  Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/18162.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18162/head:pull/18162

PR: https://git.openjdk.org/jdk/pull/18162

From jsjolen at openjdk.org Tue Mar 26 15:19:39 2024
From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=)
Date: Tue, 26 Mar 2024 15:19:39 GMT
Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v12]
In-Reply-To:
References:
Message-ID:

> Hi,
>
> This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix.
>
> ## `MemoryFileTracker`
>
> The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is:
>
> ```c++
> static MemoryFile* make_device(const char* descriptive_name);
> static void free_device(MemoryFile* device);
>
> static void allocate_memory(MemoryFile* device, size_t offset, size_t size,
>                             MEMFLAGS flag, const NativeCallStack& stack);
> static void free_memory(MemoryFile* device, size_t offset, size_t size);
> ```
>
> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does:
>
> ```c++
> void ZNMT::reserve(zaddress_unsafe start, size_t size) {
>   MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap);
> }
> void ZNMT::commit(zoffset offset, size_t size) {
>   MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC);
> }
> void ZNMT::uncommit(zoffset offset, size_t size) {
>   MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size);
> }
>
> void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) {
>   // NMT doesn't track mappings at the moment.
> }
> void ZNMT::unmap(zaddress_unsafe addr, size_t size) {
>   // NMT doesn't track mappings at the moment.
> }
> ```
>
> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory.
>
> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are:
>
> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it.
Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this...

Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision:

 - Add test
 - Fix test

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/18289/files
  - new: https://git.openjdk.org/jdk/pull/18289/files/d0983708..13dee67a

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=11
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=10-11

  Stats: 19 lines in 1 file changed: 18 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/18289.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289

PR: https://git.openjdk.org/jdk/pull/18289

From jsjolen at openjdk.org Tue Mar 26 15:27:37 2024
From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=)
Date: Tue, 26 Mar 2024 15:27:37 GMT
Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v13]
In-Reply-To:
References:
Message-ID:

> Hi,
>
> This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix.
>
> ## `MemoryFileTracker`
>
> The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is:
>
> ```c++
> static MemoryFile* make_device(const char* descriptive_name);
> static void free_device(MemoryFile* device);
>
> static void allocate_memory(MemoryFile* device, size_t offset, size_t size,
>                             MEMFLAGS flag, const NativeCallStack& stack);
> static void free_memory(MemoryFile* device, size_t offset, size_t size);
> ```
>
> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does:
>
> ```c++
> void ZNMT::reserve(zaddress_unsafe start, size_t size) {
>   MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap);
> }
> void ZNMT::commit(zoffset offset, size_t size) {
>   MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC);
> }
> void ZNMT::uncommit(zoffset offset, size_t size) {
>   MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size);
> }
>
> void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) {
>   // NMT doesn't track mappings at the moment.
> }
> void ZNMT::unmap(zaddress_unsafe addr, size_t size) {
>   // NMT doesn't track mappings at the moment.
> }
> ```
>
> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory.
>
> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are:
>
> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it.
Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this...

Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision:

 - Merge again
 - Another one

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/18289/files
  - new: https://git.openjdk.org/jdk/pull/18289/files/13dee67a..10c862ac

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=12
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=11-12

  Stats: 41 lines in 2 files changed: 26 ins; 10 del; 5 mod
  Patch: https://git.openjdk.org/jdk/pull/18289.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289

PR: https://git.openjdk.org/jdk/pull/18289

From aph at openjdk.org Tue Mar 26 15:38:50 2024
From: aph at openjdk.org (Andrew Haley)
Date: Tue, 26 Mar 2024 15:38:50 GMT
Subject: RFR: JDK-8180450: secondary_super_cache does not scale well [v6]
In-Reply-To:
References:
Message-ID: <-z-8kVUbXif-YGurNJvBrHcDWBoxvqW69ylc3srnmuY=.7ddab4bd-371b-4aca-9c75-c8f5f280bc30@github.com>

> This PR is a redesign of subtype checking.
>
> The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works.
>
> So what's changed, so that the old design should be replaced?
>
> Firstly, the computers of today aren't the computers of twenty years ago. It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider.
>
> The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces).
Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown.
>
> Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch.
>
> However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions.
>
> The solution
> ------------
>
> We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered.
>
> We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bitmap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. If the bit is set, we have to do a little arithmetic and then consult the hash table.
>
> It works like this:
>
>
> mov sub_klass, [& sub_klass-...
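
The quoted description above stops before its own code listing, so the following is only a rough sketch of the idea it describes: a 64-slot table with linear probing, stored with the empty slots removed, plus an occupancy bitmap that acts as a quick negative filter. It is not the HotSpot implementation. `Key`, `slot_of`, `PackedSet`, `build` and `contains` are hypothetical names, the hash is an arbitrary multiplicative hash chosen for illustration, and the sketch assumes fewer than 64 entries so that probing always reaches an empty slot.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a class pointer; not a HotSpot type.
using Key = std::uint32_t;

constexpr unsigned kSlots = 64;

// Illustrative hash only: top 6 bits of a 64-bit multiplicative hash.
inline unsigned slot_of(Key k) {
  return static_cast<unsigned>((k * 0x9E3779B97F4A7C15ull) >> 58);
}

// Portable population count (number of set bits).
inline unsigned popcount64(std::uint64_t x) {
  unsigned n = 0;
  while (x != 0) {
    x &= x - 1;  // clears the lowest set bit
    ++n;
  }
  return n;
}

struct PackedSet {
  std::uint64_t bitmap = 0;  // bit i set <=> slot i of the 64-slot table is occupied
  std::vector<Key> packed;   // occupied slots only, in slot order
};

// Builds the 64-slot table with linear probing, then drops the empty slots.
// Assumes keys.size() < kSlots so that at least one slot stays empty.
PackedSet build(const std::vector<Key>& keys) {
  assert(keys.size() < kSlots);
  std::array<Key, kSlots> table{};
  PackedSet s;
  for (Key k : keys) {
    unsigned i = slot_of(k);
    while (s.bitmap & (1ull << i)) {
      i = (i + 1) % kSlots;  // linear probing on collision
    }
    table[i] = k;
    s.bitmap |= 1ull << i;
  }
  for (unsigned i = 0; i < kSlots; ++i) {
    if (s.bitmap & (1ull << i)) {
      s.packed.push_back(table[i]);
    }
  }
  return s;
}

// A clear home-slot bit proves absence without touching the array.
// Otherwise the number of set bits below slot i gives the packed index.
bool contains(const PackedSet& s, Key k) {
  unsigned i = slot_of(k);
  while (s.bitmap & (1ull << i)) {
    unsigned idx = popcount64(s.bitmap & ((1ull << i) - 1));
    if (s.packed[idx] == k) {
      return true;
    }
    i = (i + 1) % kSlots;
  }
  return false;
}
```

Because `packed` holds exactly the same elements an unsorted array would, code that only scans it linearly keeps working, which matches the compatibility point made in the description.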
Andrew Haley has updated the pull request incrementally with two additional commits since the last revision:

 - JDK-8180450: secondary_super_cache does not scale well
 - JDK-8180450: secondary_super_cache does not scale well

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/18309/files
  - new: https://git.openjdk.org/jdk/pull/18309/files/6c596d70..366eb207

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=18309&range=05
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18309&range=04-05

  Stats: 6 lines in 3 files changed: 0 ins; 0 del; 6 mod
  Patch: https://git.openjdk.org/jdk/pull/18309.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18309/head:pull/18309

PR: https://git.openjdk.org/jdk/pull/18309

From lucy at openjdk.org Tue Mar 26 15:42:31 2024
From: lucy at openjdk.org (Lutz Schmidt)
Date: Tue, 26 Mar 2024 15:42:31 GMT
Subject: RFR: 8327652: S390x: Implements SLP support [v6]
In-Reply-To:
References:
Message-ID:

On Mon, 25 Mar 2024 12:39:35 GMT, Sidraya Jayagond wrote:

>> This PR Adds SIMD support on s390x.
>
> Sidraya Jayagond has updated the pull request incrementally with one additional commit since the last revision:
>
>   Fix cosmetic review comments

Pretty well done! I have added a few comments/suggestions.

Important: This implementation breaks the CompactStrings intrinsics:
    MacroAssembler::string_compress(...)
    MacroAssembler::string_expand(...)
If available, these intrinsics use vector registers and vector instructions. The registers are just used, without requesting them from the register allocator.
A simple remedy would be:
- resize (expand) the current stack frame
- save the used vector registers in the additional stack space
- run the vector code
- restore the used vector registers
- restore the original stack frame size

src/hotspot/cpu/s390/s390.ad line 1194:

> 1192:   }
> 1193:   return size;
> 1194: }

How about doing the size calculation like

    if (cbuf) {
      C2_MacroAssembler _masm(cbuf);
      int start_off = __ offset();
      __ z_vlr(Rdst, Rsrc);
      size += __ offset() - start_off;
    }

That way, you are sure to have the correct size. And you only increase the size if you actually emit code.

src/hotspot/cpu/s390/s390.ad line 11038:

> 11036:   %}
> 11037:   ins_pipe(pipe_class_dummy);
> 11038: %}

Move this instruct up to after vadd4I_reg, please.

src/hotspot/cpu/s390/s390.ad line 11040:

> 11038: %}
> 11039:
> 11040: instruct vsub416B_reg(vecX dst, vecX src1, vecX src2) %{

vsub416B_reg? Typo?

src/hotspot/cpu/s390/sharedRuntime_s390.cpp line 315:

> 313:   const int vregstosave_num = save_vectors ? (sizeof(RegisterSaver_LiveVRegs) /
> 314:                                                sizeof(RegisterSaver::LiveRegType))
> 315:                                             : 0;

I'd like to have this calculation in only one place. Don't duplicate code, in particular when it's a calculation that must yield the same result.

src/hotspot/cpu/s390/sharedRuntime_s390.cpp line 488:

> 486:   const int vregstosave_num = save_vectors ? (sizeof(RegisterSaver_LiveVRegs) /
> 487:                                                sizeof(RegisterSaver::LiveRegType))
> 488:                                             : 0;

see above (save_live_registers())

-------------

Changes requested by lucy (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/18162#pullrequestreview-1958239716
PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1537933537
PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1539441724
PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1539446392
PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1537893144
PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1537902448

From lucy at openjdk.org Tue Mar 26 15:42:32 2024
From: lucy at openjdk.org (Lutz Schmidt)
Date: Tue, 26 Mar 2024 15:42:32 GMT
Subject: RFR: 8327652: S390x: Implements SLP support [v7]
In-Reply-To: <89gCu4oPhDKz2-GfNWeWtJ0vXyTkmHgXhv6mVX6C72Q=.f706431a-83c4-4cce-8e1a-214779d49985@github.com>
References: <89gCu4oPhDKz2-GfNWeWtJ0vXyTkmHgXhv6mVX6C72Q=.f706431a-83c4-4cce-8e1a-214779d49985@github.com>
Message-ID:

On Fri, 8 Mar 2024 05:35:06 GMT, Amit Kumar wrote:

>> Sidraya Jayagond has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   PopCountVI supported by z14 onwards.
>
> src/hotspot/cpu/s390/sharedRuntime_s390.cpp line 485:
>
>> 483:     offset += reg_size;
>> 484:   }
>> 485:   assert(offset == frame_size_in_bytes, "consistency check");
>
> Suggestion:
>
> #ifdef ASSERT
>   assert(offset == frame_size_in_bytes, "consistency check");
> #endif // ASSERT

Why this change? You use #ifdef ASSERT blocks only if it is impractical to pack all the code into an assert() macro call.
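
The rule of thumb in the reply above can be illustrated with a small analogy. The sketch below uses the standard `<cassert>` macro and `NDEBUG` rather than HotSpot's own `assert(cond, msg)` macro and `ASSERT` define, so it only mirrors the pattern. `total_size` and `checked_total` are hypothetical helpers and are not taken from the patch under review.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not HotSpot code: adds up per-register slot sizes.
std::size_t total_size(const std::vector<std::size_t>& slot_sizes) {
  std::size_t offset = 0;
  for (std::size_t s : slot_sizes) {
    offset += s;
  }
  return offset;
}

std::size_t checked_total(const std::vector<std::size_t>& slot_sizes,
                          std::size_t frame_size_in_bytes) {
  std::size_t offset = total_size(slot_sizes);

  // A single expression: the macro already compiles away in release
  // builds, so wrapping it in a preprocessor guard adds nothing.
  assert(offset == frame_size_in_bytes && "consistency check");

#ifndef NDEBUG
  // This check needs a loop and a local counter, which do not fit into
  // one assert() call, so a guarded block is the practical choice.
  std::size_t zero_sized = 0;
  for (std::size_t s : slot_sizes) {
    if (s == 0) {
      ++zero_sized;
    }
  }
  assert(zero_sized == 0 && "every slot must have a size");
#endif

  (void)frame_size_in_bytes;  // avoids an unused-parameter warning when asserts are off
  return offset;
}
```

In both cases the debug-only work disappears from release builds; the guard is only worth its extra lines when the check cannot be written as one expression.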
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1537899408

From lucy at openjdk.org Tue Mar 26 15:42:33 2024
From: lucy at openjdk.org (Lutz Schmidt)
Date: Tue, 26 Mar 2024 15:42:33 GMT
Subject: RFR: 8327652: S390x: Implements SLP support [v7]
In-Reply-To: <7YtPVteupUbFSQf2WKqTcn7_jrhyi1EqU3pjT7WXYiw=.ad1cc8df-14e0-41d0-b5a3-1f5f120c39ca@github.com>
References: <89gCu4oPhDKz2-GfNWeWtJ0vXyTkmHgXhv6mVX6C72Q=.f706431a-83c4-4cce-8e1a-214779d49985@github.com> <7YtPVteupUbFSQf2WKqTcn7_jrhyi1EqU3pjT7WXYiw=.ad1cc8df-14e0-41d0-b5a3-1f5f120c39ca@github.com>
Message-ID:

On Mon, 18 Mar 2024 02:12:29 GMT, Amit Kumar wrote:

>> why? no other platform does that.
>
> will there be any issue if we do:
>
> #elif defined(AARCH64)
>
> instead of
>
> #endif
> #if defined(AARCH64)
>
>
> The above one seems a bit better.

This would just be another way to tell the preprocessor to do the same thing. As it is, it's common practice in many places. Don't do "I like this better" changes in common code. That will involve you in lengthy discussions at least.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1537917531

From jsjolen at openjdk.org Tue Mar 26 15:46:51 2024
From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=)
Date: Tue, 26 Mar 2024 15:46:51 GMT
Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v14]
In-Reply-To:
References:
Message-ID:

> Hi,
>
> This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix.
>
> ## `MemoryFileTracker`
>
> The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is:
>
> ```c++
> static MemoryFile* make_device(const char* descriptive_name);
> static void free_device(MemoryFile* device);
>
> static void allocate_memory(MemoryFile* device, size_t offset, size_t size,
>                             MEMFLAGS flag, const NativeCallStack& stack);
> static void free_memory(MemoryFile* device, size_t offset, size_t size);
> ```
>
> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does:
>
> ```c++
> void ZNMT::reserve(zaddress_unsafe start, size_t size) {
>   MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap);
> }
> void ZNMT::commit(zoffset offset, size_t size) {
>   MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC);
> }
> void ZNMT::uncommit(zoffset offset, size_t size) {
>   MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size);
> }
>
> void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) {
>   // NMT doesn't track mappings at the moment.
> }
> void ZNMT::unmap(zaddress_unsafe addr, size_t size) {
>   // NMT doesn't track mappings at the moment.
> }
> ```
>
> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory.
>
> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are:
>
> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it.
Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this...

Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision:

  Update

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/18289/files
  - new: https://git.openjdk.org/jdk/pull/18289/files/10c862ac..3cbe181b

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=13
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=12-13

  Stats: 7 lines in 1 file changed: 0 ins; 4 del; 3 mod
  Patch: https://git.openjdk.org/jdk/pull/18289.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289

PR: https://git.openjdk.org/jdk/pull/18289

From jsjolen at openjdk.org Tue Mar 26 15:46:52 2024
From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=)
Date: Tue, 26 Mar 2024 15:46:52 GMT
Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v13]
In-Reply-To:
References:
Message-ID:

On Tue, 26 Mar 2024 15:27:37 GMT, Johan Sjölen wrote:

>> Hi,
>>
>> This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix.
>>
>> ## `MemoryFileTracker`
>>
>> The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is:
>>
>> ```c++
>> static MemoryFile* make_device(const char* descriptive_name);
>> static void free_device(MemoryFile* device);
>>
>> static void allocate_memory(MemoryFile* device, size_t offset, size_t size,
>>                             MEMFLAGS flag, const NativeCallStack& stack);
>> static void free_memory(MemoryFile* device, size_t offset, size_t size);
>> ```
>>
>> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does:
>>
>> ```c++
>> void ZNMT::reserve(zaddress_unsafe start, size_t size) {
>>   MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap);
>> }
>> void ZNMT::commit(zoffset offset, size_t size) {
>>   MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC);
>> }
>> void ZNMT::uncommit(zoffset offset, size_t size) {
>>   MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size);
>> }
>>
>> void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) {
>>   // NMT doesn't track mappings at the moment.
>> }
>> void ZNMT::unmap(zaddress_unsafe addr, size_t size) {
>>   // NMT doesn't track mappings at the moment.
>> }
>> ```
>>
>> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory.
>>
>> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are:
>>
>> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it.
Our Treap-based approach in this patch gives a performance bo...
>
> Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision:
>
>  - Merge again
>  - Another one

I've updated the code to have data on both sides, but now a bunch of my tests are failing. So, I'll have to do a bunch of work to understand why those are now failing. I'm still performing merging, however.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18289#issuecomment-2020786898

From jsjolen at openjdk.org Tue Mar 26 15:46:52 2024
From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=)
Date: Tue, 26 Mar 2024 15:46:52 GMT
Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v5]
In-Reply-To: <3al4DjsRcIX_qJZNbTGqBDIAOj4bU5l8xpYPHQE8cNM=.7cc0bdfe-c9c8-46ce-ad42-397c61b5a603@github.com>
References: <-XAziSwGMo20pUAnbdRW1JUk_0ZB-80RVfAHr0iuewE=.bff8f2f7-01e2-46eb-bd4b-1b16fccc6aa1@github.com> <3al4DjsRcIX_qJZNbTGqBDIAOj4bU5l8xpYPHQE8cNM=.7cc0bdfe-c9c8-46ce-ad42-397c61b5a603@github.com>
Message-ID:

On Fri, 22 Mar 2024 14:06:59 GMT, Thomas Stuefe wrote:

>> Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Include os.inline.hpp
>
> src/hotspot/share/nmt/vmatree.hpp line 312:
>
>> 310:   }
>> 311:
>> 312:   SummaryDiff commit_mapping(size_t from, size_t sz, Metadata& metadata) {
>
> So, the `Merge` template parameter only tries to solve the problem that callers don't specify MEMFLAGS on commit or uncommit, right?
>
> We can
> - rethink that; require all callers of commit/uncommit to provide MEMFLAGS too. I know I have been arguing against it, but I like the simpler implementation of NMT. And arguably, it closer reflects `mmap` realities, where reserving and committing don't have the same roles as in hotspot.
> - don't make merging a topic for the tree. If we want the MEMFLAGS from prior reservations to carry over, just search the tree prior to registering the mapping.
Since we need a "tell me the region this pointer points to" functionality anyway, just let's reuse that one.
> - or, at least, revert to a simple bool or enum that describes merging behavior. Keep it simple and stupid.

Right, merging isn't even a complete solution. Consider this situation:

```c++
Tree tree;
tree.reserve_mapping(0, 50, mtNMT);
tree.reserve_mapping(50, 50, mtTest);
tree.commit_mapping(0, 100);
```

The current merge-strategy results in:

    0   mtNMT   100
    |           |

While we *actually* wanted:

    0   mtNMT   50   mtTest   100
    |           |             |

Now, we happen to be safe from this behavior as our current `VirtualMemoryTracker` doesn't allow this kind of call anyway.

I really want solution 1, it just makes sense. I don't want to make that change though :-). Not right now, at least.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18289#discussion_r1539516900

From aph at openjdk.org Tue Mar 26 15:50:38 2024
From: aph at openjdk.org (Andrew Haley)
Date: Tue, 26 Mar 2024 15:50:38 GMT
Subject: RFR: JDK-8180450: secondary_super_cache does not scale well [v7]
In-Reply-To:
References:
Message-ID:

> This PR is a redesign of subtype checking.
>
> The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works.
>
> So what's changed, so that the old design should be replaced?
>
> Firstly, the computers of today aren't the computers of twenty years ago. It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider.
>
> The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces).
Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown.
>
> Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch.
>
> However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions.
>
> The solution
> ------------
>
> We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered.
>
> We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bitmap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. If the bit is set, we have to do a little arithmetic and then consult the hash table.
>
> It works like this:
>
>
> mov sub_klass, [& sub_klass-...
Andrew Haley has updated the pull request incrementally with one additional commit since the last revision:

  JDK-8180450: secondary_super_cache does not scale well

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/18309/files
  - new: https://git.openjdk.org/jdk/pull/18309/files/366eb207..b465434b

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=18309&range=06
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18309&range=05-06

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/18309.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18309/head:pull/18309

PR: https://git.openjdk.org/jdk/pull/18309

From aph at openjdk.org Tue Mar 26 15:50:38 2024
From: aph at openjdk.org (Andrew Haley)
Date: Tue, 26 Mar 2024 15:50:38 GMT
Subject: RFR: JDK-8180450: secondary_super_cache does not scale well [v5]
In-Reply-To: <3l4gFuzQsROoNsiekZMTZOuyOGn-BS-F0B5URTrAJEs=.d132a4f7-f7f5-4c41-a3de-9b0d97151304@github.com>
References: <3l4gFuzQsROoNsiekZMTZOuyOGn-BS-F0B5URTrAJEs=.d132a4f7-f7f5-4c41-a3de-9b0d97151304@github.com>
Message-ID:

On Thu, 21 Mar 2024 16:32:34 GMT, Chris Plummer wrote:

> Windows builds are failing:
>
> d:\a\jdk\jdk\src\hotspot\share\oops/klass.hpp(251): error C2220: the following warning is treated as an error
> d:\a\jdk\jdk\src\hotspot\share\oops/klass.hpp(251): warning C4267: 'return': conversion from 'size_t' to 'int', possible loss of data

Fercrineoutloud, it's `constexpr`! How can there be any loss of data? ;-)

Fixed, thanks.
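
The thread does not show the code at klass.hpp line 251 or how it was changed, so the following is only a hypothetical reconstruction of the kind of construct that produces the warning quoted above, together with one common way to make the narrowing explicit. `bitmap_t` and `table_size` are made-up names for illustration.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>

// Hypothetical field type standing in for a per-class bitmap.
using bitmap_t = std::uint64_t;

// The shape that draws the warning: sizeof yields size_t, and returning it
// from a function declared to return int is an implicit narrowing, even
// though the value is a compile-time constant.
//
//   constexpr int table_size() { return sizeof(bitmap_t) * CHAR_BIT; }

// One common remedy: spell out the conversion so the narrowing is explicit.
constexpr int table_size() {
  return static_cast<int>(sizeof(bitmap_t) * CHAR_BIT);
}

static_assert(table_size() == 64, "a 64-bit bitmap covers a 64-slot table");
```

The warning is issued on the types involved rather than on the value, which is why a constant expression does not avoid it; with warnings treated as errors, as in the build log quoted above, the build then fails.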
-------------

PR Comment: https://git.openjdk.org/jdk/pull/18309#issuecomment-2020801722

From mli at openjdk.org Tue Mar 26 16:15:25 2024
From: mli at openjdk.org (Hamlin Li)
Date: Tue, 26 Mar 2024 16:15:25 GMT
Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4]
In-Reply-To: <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com>
References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com>
Message-ID:

On Fri, 15 Mar 2024 13:58:05 GMT, Hamlin Li wrote:

>> Hi,
>> Can you help to review this patch?
>> Thanks
>>
>> This is a continuation of work based on [1] by @XiaohongGong, most work was done in that pr. In this new pr, just rebased the code in [1], then added some minor changes (renaming of calling method, add libsleef as extra lib in CI cross-build on aarch64 in github workflow); I also tested the combination of following scenarios:
>> * at build time
>>   * with/without sleef
>>   * with/without sve support
>> * at runtime
>>   * with/without sleef
>>   * with/without sve support
>>
>> [1] https://github.com/openjdk/jdk/pull/16234
>>
>> ## Regression Test
>> * test/jdk/jdk/incubator/vector/
>> * test/hotspot/jtreg/compiler/vectorapi/
>>
>> ## Performance Test
>> Previously, @XiaohongGong has shared the data: https://github.com/openjdk/jdk/pull/16234#issuecomment-1767727028
>
> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision:
>
>   fix jni includes

OK, I'm fine with it.
I just hope we could have made this decision earlier, as I have asked this question before: https://github.com/openjdk/jdk/pull/16234#issuecomment-1973345855, and at that time there seemed to be no "No" answer.
------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-2020859623 From ihse at openjdk.org Tue Mar 26 16:36:48 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Tue, 26 Mar 2024 16:36:48 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4] In-Reply-To: <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> Message-ID: <2k3quQbq8nyIJJDLLezM1tUicEBbaMDQPb-7X9jvoU8=.d0e3d9c7-a4e3-466a-ab21-7f90d8ac22c7@github.com> On Fri, 15 Mar 2024 13:58:05 GMT, Hamlin Li wrote: >> Hi, >> Can you help to review this patch? >> Thanks >> >> This is a continuation of work based on [1] by @XiaohongGong, most work was done in that pr. In this new pr, just rebased the code in [1], then added some minor changes (renaming of calling method, add libsleef as extra lib in CI cross-build on aarch64 in github workflow); I aslo tested the combination of following scenarios: >> * at build time >> * with/without sleef >> * with/without sve support >> * at runtime >> * with/without sleef >> * with/without sve support >> >> [1] https://github.com/openjdk/jdk/pull/16234 >> >> ## Regression Test >> * test/jdk/jdk/incubator/vector/ >> * test/hotspot/jtreg/compiler/vectorapi/ >> >> ## Performance Test >> Previously, @XiaohongGong has shared the data: https://github.com/openjdk/jdk/pull/16234#issuecomment-1767727028 > > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > fix jni includes I'm not making a decision; I'm making a suggestion. Also, I believe a lot more information have come to light in the discussion we have been having. For me at least, the full scope, impact and intention of this PR was not at all clear initially. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-2020932361 From kvn at openjdk.org Tue Mar 26 16:43:22 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 26 Mar 2024 16:43:22 GMT Subject: RFR: 8328181: C2: assert(MaxVectorSize >= 32) failed: vector length should be >= 32 In-Reply-To: References: Message-ID: On Sun, 24 Mar 2024 09:58:59 GMT, Jatin Bhateja wrote: > This bug fix patch tightens the predication check for small constant length clear array pattern and relaxes associated feature checks. Modified few comments for clarity. > > Kindly review and approve. > > Best Regards, > Jatin src/hotspot/cpu/x86/x86.ad line 1755: > 1753: case Op_ClearArray: > 1754: if ((size_in_bits != 512) && !VM_Version::supports_avx512vl()) { > 1755: return false; Please add comment to clarify condition. I am reading it as ClearArray will not be supported for NOT avx512 because we can have vector length 512 bits for not avx512. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18464#discussion_r1539707965 From mli at openjdk.org Tue Mar 26 17:02:27 2024 From: mli at openjdk.org (Hamlin Li) Date: Tue, 26 Mar 2024 17:02:27 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4] In-Reply-To: <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> Message-ID: On Fri, 15 Mar 2024 13:58:05 GMT, Hamlin Li wrote: >> Hi, >> Can you help to review this patch? >> Thanks >> >> This is a continuation of work based on [1] by @XiaohongGong, most work was done in that pr. 
In this new pr, just rebased the code in [1], then added some minor changes (renaming of calling method, add libsleef as extra lib in CI cross-build on aarch64 in github workflow); I also tested the combination of following scenarios:
>> * at build time
>>   * with/without sleef
>>   * with/without sve support
>> * at runtime
>>   * with/without sleef
>>   * with/without sve support
>>
>> [1] https://github.com/openjdk/jdk/pull/16234
>>
>> ## Regression Test
>> * test/jdk/jdk/incubator/vector/
>> * test/hotspot/jtreg/compiler/vectorapi/
>>
>> ## Performance Test
>> Previously, @XiaohongGong has shared the data: https://github.com/openjdk/jdk/pull/16234#issuecomment-1767727028
>
> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision:
>
>   fix jni includes

OK, no worry. Let's discuss this more thoroughly before we start implementing it.

Let me ask the first question: do we still want sleef in the jdk?
* If the answer is `no` or similar, what's the better alternative solution?
* If the answer is `yes`, let's continue the discussion about how to integrate sleef into jdk via this pr (maybe not implement it in this pr).

I try to categorize the possible solutions below.
1. depend on external sleef at build time & run time, like this pr. Cons: requires sleef installed in system at build & run time.
2. depend on external sleef at run time only, like @theRealAph suggested. Cons: requires sleef installed in system at runtime.
3. depend on external sleef at build time only, like @magicus suggested. Cons: requires sleef installed in system at build time.
4. not depend on external sleef at all.
   4.1. integrate source into jdk, use the generated source from sleef cmake, check https://github.com/openjdk/jdk/pull/18294#issuecomment-2018839778. Cons: license legal process.
   4.2. are there other ways?
5. any other possible solutions not covered above?

Seems to me, the best way is to integrate sleef source into jdk. Does it need license work?
If positive, how and who will work on it? Would you mind to share your comments, and make necessary additions? @theRealAph @magicus Thanks! ------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-2020986186 From lucy at openjdk.org Tue Mar 26 17:09:26 2024 From: lucy at openjdk.org (Lutz Schmidt) Date: Tue, 26 Mar 2024 17:09:26 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission [v10] In-Reply-To: References: Message-ID: On Fri, 22 Mar 2024 18:19:39 GMT, Cesar Soares Lucas wrote: >> # Description >> >> Please review this PR with a patch to re-use the same C2_MacroAssembler object to emit all instructions in the same compilation unit. >> >> Overall, the change is pretty simple. However, due to the renaming of the variable to access C2_MacroAssembler, from `_masm.` to `masm->`, and also some method prototype changes, the patch became quite large. >> >> # Help Needed for Testing >> >> I don't have access to all platforms necessary to test this. I hope some other folks can help with testing on `S390`, `RISC-V` and `PPC`. > > Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: > > Fix AArch64 build & improve comment about InstructionMark s390 changes look good to me. ------------- Marked as reviewed by lucy (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16484#pullrequestreview-1961167500 From cslucas at openjdk.org Tue Mar 26 19:01:55 2024 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Tue, 26 Mar 2024 19:01:55 GMT Subject: RFR: JDK-8316991: Reduce nullable allocation merges [v9] In-Reply-To: References: Message-ID: > ### Description > > Many, if not most, allocation merges (Phis) are nullable because they join object allocations with "NULL", or objects returned from method calls, etc. 
Please review this Pull Request that improves the Reduce Allocation Merge implementation so that it can reduce at least some of these allocation merges.
>
> Overall, the improvements are related to 1) making rematerialization of merges able to represent "NULL" objects, and 2) being able to reduce merges used by CmpP/N and CastPP.
>
> The approach to reducing CmpP/N and CastPP is pretty similar to that used in the `MemNode::split_through_phi` method: a clone of the node being split is added on each input of the Phi. I make use of `optimize_ptr_compare` and some type information to remove redundant CmpP and CastPP nodes. I added a bunch of ASCII diagrams illustrating what some of the more important methods are doing.
>
> ### Benchmarking
>
> **Note:** In some of these tests no reduction happens. I left them in to validate that no perf. regression happens in that case.
> **Note 2:** Margin of error was negligible.
>
> | Benchmark                      | No RAM (ms/op) | Yes RAM (ms/op) |
> |--------------------------------|----------------|-----------------|
> | TestTrapAfterMerge             | 19.515         | 13.386          |
> | TestArgEscape                  | 33.165         | 33.254          |
> | TestCallTwoSide                | 70.547         | 69.427          |
> | TestCmpAfterMerge              | 16.400         | 2.984           |
> | TestCmpMergeWithNull_Second    | 27.204         | 27.293          |
> | TestCmpMergeWithNull           | 8.248          | 4.920           |
> | TestCondAfterMergeWithAllocate | 12.890         | 5.252           |
> | TestCondAfterMergeWithNull     | 6.265          | 5.078           |
> | TestCondLoadAfterMerge         | 12.713         | 5.163           |
> | TestConsecutiveSimpleMerge     | 30.863         | 4.068           |
> | TestDoubleIfElseMerge          | 16.069         | 2.444           |
> | TestEscapeInCallAfterMerge     | 23.111         | 22.924          |
> | TestGlobalEscape               | 14.459         | 14.425          |
> | TestIfElseInLoop               | 246.061        | 42.786          |
> | TestLoadAfterLoopAlias         | 45.808         | 45.812          |
> | TestLoadAfterTrap              | 28.370         | ...

Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains 13 commits: - Merge remote-tracking branch 'origin/master' into ram-nullables - Catching up with master - Fix broken build. - Merge with origin/master - Update test/micro/org/openjdk/bench/vm/compiler/AllocationMerges.java Co-authored-by: Andrey Turbanov - Ammend previous fix & add repro tests. - Fix to prevent reducing already reduced Phi - Fix to prevent creating NULL ConNKlass constants. - Refrain from RAM of arrays and Phis controlled by Loop nodes. - Fix typo in test. - ... and 3 more: https://git.openjdk.org/jdk/compare/89e0889a...3129378f ------------- Changes: https://git.openjdk.org/jdk/pull/15825/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=15825&range=08 Stats: 2407 lines in 13 files changed: 2152 ins; 92 del; 163 mod Patch: https://git.openjdk.org/jdk/pull/15825.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15825/head:pull/15825 PR: https://git.openjdk.org/jdk/pull/15825 From cslucas at openjdk.org Tue Mar 26 19:02:42 2024 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Tue, 26 Mar 2024 19:02:42 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission [v11] In-Reply-To: References: Message-ID: > # Description > > Please review this PR with a patch to re-use the same C2_MacroAssembler object to emit all instructions in the same compilation unit. > > Overall, the change is pretty simple. However, due to the renaming of the variable to access C2_MacroAssembler, from `_masm.` to `masm->`, and also some method prototype changes, the patch became quite large. > > # Help Needed for Testing > > I don't have access to all platforms necessary to test this. I hope some other folks can help with testing on `S390`, `RISC-V` and `PPC`. Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 12 commits: - Merge remote-tracking branch 'origin/master' into reuse-macroasm - Fix AArch64 build & improve comment about InstructionMark - Catching up with changes in master - Catching up with origin/master - Catch up with origin/master - Merge with origin/master - Fix build, copyright dates, m4 files. - Fix merge - Catch up with master branch. Merge remote-tracking branch 'origin/master' into reuse-macroasm - Some inst_mark fixes; Catch up with master. - ... and 2 more: https://git.openjdk.org/jdk/compare/89e0889a...b4d73c98 ------------- Changes: https://git.openjdk.org/jdk/pull/16484/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16484&range=10 Stats: 2153 lines in 60 files changed: 136 ins; 431 del; 1586 mod Patch: https://git.openjdk.org/jdk/pull/16484.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16484/head:pull/16484 PR: https://git.openjdk.org/jdk/pull/16484 From jsjolen at openjdk.org Tue Mar 26 19:18:37 2024 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Tue, 26 Mar 2024 19:18:37 GMT Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v15] In-Reply-To: References: Message-ID: <_joZaDi2zxlhKXx80r6DVWFZ2S5_955eKKzJXK0KEbw=.3205caad-7b6d-4e45-b06b-ef642487201e@github.com> > Hi, > > This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix. 
>
> ## `MemoryFileTracker`
>
> The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is:
>
> ```c++
> static MemoryFile* make_device(const char* descriptive_name);
> static void free_device(MemoryFile* device);
>
> static void allocate_memory(MemoryFile* device, size_t offset, size_t size,
>                             MEMFLAGS flag, const NativeCallStack& stack);
> static void free_memory(MemoryFile* device, size_t offset, size_t size);
> ```
>
> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does:
>
> ```c++
> void ZNMT::reserve(zaddress_unsafe start, size_t size) {
>   MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap);
> }
> void ZNMT::commit(zoffset offset, size_t size) {
>   MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC);
> }
> void ZNMT::uncommit(zoffset offset, size_t size) {
>   MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size);
> }
>
> void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) {
>   // NMT doesn't track mappings at the moment.
> }
> void ZNMT::unmap(zaddress_unsafe addr, size_t size) {
>   // NMT doesn't track mappings at the moment.
> }
> ```
>
> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory.
>
> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are:
>
> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it.
Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this...

Johan Sjölen has updated the pull request incrementally with four additional commits since the last revision:

 - OK, do this also
 - Slightly more advanced is_noop()
 - Do a direct assign and revert to old is_noop AGAIN
 - More complex is_noop()

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/18289/files
  - new: https://git.openjdk.org/jdk/pull/18289/files/3cbe181b..4478a783

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=14
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=13-14

  Stats: 5 lines in 1 file changed: 3 ins; 0 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/18289.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289

PR: https://git.openjdk.org/jdk/pull/18289

From jsjolen at openjdk.org Tue Mar 26 19:34:38 2024
From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=)
Date: Tue, 26 Mar 2024 19:34:38 GMT
Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v16]
In-Reply-To:
References:
Message-ID:

> Hi,
>
> This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix.
>
> ## `MemoryFileTracker`
>
> The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is:
>
> ```c++
> static MemoryFile* make_device(const char* descriptive_name);
> static void free_device(MemoryFile* device);
>
> static void allocate_memory(MemoryFile* device, size_t offset, size_t size,
>                             MEMFLAGS flag, const NativeCallStack& stack);
> static void free_memory(MemoryFile* device, size_t offset, size_t size);
> ```
>
> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does:
>
> ```c++
> void ZNMT::reserve(zaddress_unsafe start, size_t size) {
>   MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap);
> }
> void ZNMT::commit(zoffset offset, size_t size) {
>   MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC);
> }
> void ZNMT::uncommit(zoffset offset, size_t size) {
>   MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size);
> }
>
> void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) {
>   // NMT doesn't track mappings at the moment.
> }
> void ZNMT::unmap(zaddress_unsafe addr, size_t size) {
>   // NMT doesn't track mappings at the moment.
> }
> ```
>
> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory.
>
> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are:
>
> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it.
Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this...

Johan Sjölen has updated the pull request incrementally with three additional commits since the last revision:

 - Don't use auto
 - Remove VMATree as friend class but open up the API
 - Simplify

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/18289/files
  - new: https://git.openjdk.org/jdk/pull/18289/files/4478a783..a7096782

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=15
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=14-15

  Stats: 503 lines in 3 files changed: 214 ins; 226 del; 63 mod
  Patch: https://git.openjdk.org/jdk/pull/18289.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289

PR: https://git.openjdk.org/jdk/pull/18289

From jsjolen at openjdk.org Tue Mar 26 19:40:37 2024
From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=)
Date: Tue, 26 Mar 2024 19:40:37 GMT
Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v17]
In-Reply-To:
References:
Message-ID:

> Hi,
>
> This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix.
>
> ## `MemoryFileTracker`
>
> The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is:
>
> ```c++
> static MemoryFile* make_device(const char* descriptive_name);
> static void free_device(MemoryFile* device);
>
> static void allocate_memory(MemoryFile* device, size_t offset, size_t size,
>                             MEMFLAGS flag, const NativeCallStack& stack);
> static void free_memory(MemoryFile* device, size_t offset, size_t size);
> ```
>
> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does:
>
> ```c++
> void ZNMT::reserve(zaddress_unsafe start, size_t size) {
>   MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap);
> }
> void ZNMT::commit(zoffset offset, size_t size) {
>   MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC);
> }
> void ZNMT::uncommit(zoffset offset, size_t size) {
>   MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size);
> }
>
> void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) {
>   // NMT doesn't track mappings at the moment.
> }
> void ZNMT::unmap(zaddress_unsafe addr, size_t size) {
>   // NMT doesn't track mappings at the moment.
> }
> ```
>
> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory.
>
> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are:
>
> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it.
Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this...

Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision:

 - Fixes
 - Experiment

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/18289/files
  - new: https://git.openjdk.org/jdk/pull/18289/files/a7096782..56de7bb9

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=16
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18289&range=15-16

  Stats: 13 lines in 3 files changed: 0 ins; 3 del; 10 mod
  Patch: https://git.openjdk.org/jdk/pull/18289.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289

PR: https://git.openjdk.org/jdk/pull/18289

From jsjolen at openjdk.org Tue Mar 26 19:50:25 2024
From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=)
Date: Tue, 26 Mar 2024 19:50:25 GMT
Subject: RFR: 8312132: Add tracking of multiple address spaces in NMT [v17]
In-Reply-To:
References:
Message-ID:

On Tue, 26 Mar 2024 19:40:37 GMT, Johan Sjölen wrote:

>> Hi,
>>
>> This PR introduces a new abstraction to NMT, named `MemoryFileTracker`. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use `mmap` to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix.
>>
>> ## `MemoryFileTracker`
>>
>> The `MemoryFileTracker` adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is:
>>
>> ```c++
>> static MemoryFile* make_device(const char* descriptive_name);
>> static void free_device(MemoryFile* device);
>>
>> static void allocate_memory(MemoryFile* device, size_t offset, size_t size,
>>                             MEMFLAGS flag, const NativeCallStack& stack);
>> static void free_memory(MemoryFile* device, size_t offset, size_t size);
>> ```
>>
>> It is easiest to see how this is used by looking at what ZGC's `ZNMT` class does:
>>
>> ```c++
>> void ZNMT::reserve(zaddress_unsafe start, size_t size) {
>>   MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap);
>> }
>> void ZNMT::commit(zoffset offset, size_t size) {
>>   MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC);
>> }
>> void ZNMT::uncommit(zoffset offset, size_t size) {
>>   MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size);
>> }
>>
>> void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) {
>>   // NMT doesn't track mappings at the moment.
>> }
>> void ZNMT::unmap(zaddress_unsafe addr, size_t size) {
>>   // NMT doesn't track mappings at the moment.
>> }
>> ```
>>
>> As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding `MEMFLAGS` as `committed` memory.
>>
>> This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are:
>>
>> 1. Implement VMA tracking using a balanced binary tree approach. Today's `VirtualMemoryTracker`'s usage of linked lists is slow and brittle, we'd like to move away from it.
Our Treap-based approach in this patch gives a performance bo...
>
> Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision:
>
>  - Fixes
>  - Experiment

Right, so the tests pass again but it's not clear to me that having the metadata state in both directions really can be considered necessary/more elegant. The merging is cleaned up a bit. I had a go at removing VMATree as a friend class of the Treap, which unfortunately made the build break. Still working on that.

I need to regain some clarity on the necessity of merging. I think that we can change `os::commit_memory()` such that it requires a `MEMFLAGS`, in which case we can get rid of this merging. That will be some work. I could change the PR such that the API requires a `MEMFLAGS` for all operations. Thoughts?

I'm off on Easter vacation, will be back on Tuesday.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18289#issuecomment-2021334021

From mdoerr at openjdk.org Tue Mar 26 20:04:28 2024
From: mdoerr at openjdk.org (Martin Doerr)
Date: Tue, 26 Mar 2024 20:04:28 GMT
Subject: RFR: 8327652: S390x: Implements SLP support [v7]
In-Reply-To:
References:
Message-ID:

On Tue, 26 Mar 2024 15:10:37 GMT, Sidraya Jayagond wrote:

>> This PR Adds SIMD support on s390x.
>
> Sidraya Jayagond has updated the pull request incrementally with one additional commit since the last revision:
>
>   PopCountVI supported by z14 onwards.

I think we shouldn't allow `MacroAssembler::string_compress(...)` and `MacroAssembler::string_expand(...)` to use vector registers without specifying this effect. That can be solved by adding a KILL effect for all vector registers which are killed. Alternatively, we could revert to the old implementation before https://github.com/openjdk/jdk/commit/d5adf1df921e5ecb8ff4c7e4349a12660069ed28 which doesn't use vector registers. The benefit was not huge if I remember correctly.
------------- PR Comment: https://git.openjdk.org/jdk/pull/18162#issuecomment-2021357655 From cjplummer at openjdk.org Wed Mar 27 04:50:23 2024 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 27 Mar 2024 04:50:23 GMT Subject: RFR: JDK-8180450: secondary_super_cache does not scale well [v5] In-Reply-To: References: Message-ID: <2rZgu9i_hBWggC_eEEt5x1ZwffJM1habEFhiv6k0iyY=.26595a94-2b3c-4f9d-88a4-91a8450dfee5@github.com> On Tue, 26 Mar 2024 12:02:37 GMT, Andrew Haley wrote: > [runtime/LoaderConstraints/itableICCE/Test.java] [serviceability/dcmd/vm/ClassHierarchyTest.java]: Test of diagnostic command VM.class_hierarchy > > The only difference is the order of the interfaces. I think the tests may be invalid, but I'm not sure. These are not SA tests. If these failures are in addition to SA failures, please tell me how you are running the tests, because I'm not seeing them fail, even with the edits you mention above and running with "-XX:+HashSecondarySupers -XX:+UseSecondarySuperCache". This includes the two tests you mention above. They both pass for me. `serviceability/dcmd/vm/ClassHierarchyTest.java` does appear to rely on the interface order. It doesn't look like it will be hard to fix. I don't see any obvious indication that `runtime/LoaderConstraints/itableICCE/Test.java` depends on interface order, but it is possible that it does in some subtle way. You'll need to get help from someone on the runtime team. 
-------------

PR Comment: https://git.openjdk.org/jdk/pull/18309#issuecomment-2021925546

From sjayagond at openjdk.org Wed Mar 27 08:31:25 2024
From: sjayagond at openjdk.org (Sidraya Jayagond)
Date: Wed, 27 Mar 2024 08:31:25 GMT
Subject: RFR: 8327652: S390x: Implements SLP support [v6]
In-Reply-To:
References:
Message-ID:

On Mon, 25 Mar 2024 17:07:00 GMT, Lutz Schmidt wrote:

>> Sidraya Jayagond has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Fix cosmetic review comments
>
> src/hotspot/cpu/s390/s390.ad line 1194:
>
>> 1192:   }
>> 1193:   return size;
>> 1194: }
>
> How about doing the size calculation like
>
>     if (cbuf) {
>       C2_MacroAssembler _masm(cbuf);
>       int start_off = __ offset();
>       __ z_vlr(Rdst, Rsrc);
>       size += __ offset() - start_off;
>     }
>
> That way, you are sure to have the correct size. And you only increase the size if you actually emit code.

Thank you @RealLucy for taking a look.
I have tried your suggestion; however, some of the jtreg tests crash at the line below:

    # Internal Error (output.cpp:1799), pid=846930, tid=846947
    # guarantee((int)(blk_starts[i+1] - blk_starts[i]) >= (current_offset - blk_offset)) failed: shouldn't increase block size
    #
    # JRE version: OpenJDK Runtime Environment (23.0) (build 23-internal-adhoc.root.jdk)
    # Java VM: OpenJDK 64-Bit Server VM (23-internal-adhoc.root.jdk, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-s390x)
    # Problematic frame:
    # V  [libjvm.so+0x9f8ce6]  PhaseOutput::fill_buffer(CodeBuffer*, unsigned int*)+0x153e

> src/hotspot/cpu/s390/sharedRuntime_s390.cpp line 315:
>
>> 313:   const int vregstosave_num = save_vectors ? (sizeof(RegisterSaver_LiveVRegs) /
>> 314:                                               sizeof(RegisterSaver::LiveRegType))
>> 315:                                            : 0;
>
> I'd like to have this calculation in only one place. Don't duplicate code, in particular when it's a calculation that must yield the same result.

Yes, I will fix this.
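The size-measuring idea in the quoted suggestion can be illustrated outside HotSpot. This is a toy sketch only: `ToyAssembler`, `emit_vlr`, `emit_vector_move` and the 6-byte instruction length are stand-ins assumed for illustration, not the real `C2_MacroAssembler` API. It derives the emitted size from the buffer offset before and after emission, and adds nothing to the size when no buffer is supplied.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy stand-in for an assembler that appends bytes to a code buffer.
class ToyAssembler {
 public:
  // Current write position, in bytes from the start of the buffer.
  int offset() const { return static_cast<int>(_code.size()); }

  // Pretend to emit a vector register move; the 6-byte length is assumed.
  void emit_vlr(int dst, int src) {
    _code.push_back(static_cast<std::uint8_t>(dst));
    _code.push_back(static_cast<std::uint8_t>(src));
    _code.insert(_code.end(), 4, std::uint8_t{0});  // pad up to 6 bytes
  }

 private:
  std::vector<std::uint8_t> _code;
};

// Emits the move if a buffer is available and returns the number of bytes
// actually written, measured as an offset delta rather than hard-coded.
int emit_vector_move(ToyAssembler* masm, int dst, int src) {
  int size = 0;
  if (masm != nullptr) {
    const int start_off = masm->offset();
    masm->emit_vlr(dst, src);
    size += masm->offset() - start_off;
  }
  return size;
}
```

In this sketch a call without a buffer reports 0 bytes while an emitting call reports 6, so the two kinds of call do not agree on the size. The reply above reports that trying the suggestion in the real code tripped the `shouldn't increase block size` guarantee.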
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1540657605 PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1540659426 From aph at openjdk.org Wed Mar 27 09:34:26 2024 From: aph at openjdk.org (Andrew Haley) Date: Wed, 27 Mar 2024 09:34:26 GMT Subject: RFR: JDK-8180450: secondary_super_cache does not scale well [v5] In-Reply-To: <2rZgu9i_hBWggC_eEEt5x1ZwffJM1habEFhiv6k0iyY=.26595a94-2b3c-4f9d-88a4-91a8450dfee5@github.com> References: <2rZgu9i_hBWggC_eEEt5x1ZwffJM1habEFhiv6k0iyY=.26595a94-2b3c-4f9d-88a4-91a8450dfee5@github.com> Message-ID: On Wed, 27 Mar 2024 04:47:40 GMT, Chris Plummer wrote: > > [runtime/LoaderConstraints/itableICCE/Test.java] > > [serviceability/dcmd/vm/ClassHierarchyTest.java]: Test of diagnostic command VM.class_hierarchy > > The only difference is the order of the interfaces. I think the tests may be invalid, but I'm not sure. > > These are not SA tests. If these failures are in addition to SA failures, please tell me how you are running the tests, because I'm not seeing them fail, even with the edits you mention above and running with "-XX:+HashSecondarySupers -XX:+UseSecondarySuperCache". This includes the two tests you mention above. They both pass for me. These are the only failures I'm seeing now. > `serviceability/dcmd/vm/ClassHierarchyTest.java` does appear to rely on the interface order. It doesn't look like it will be hard to fix. OK. > I don't see any obvious indication that `runtime/LoaderConstraints/itableICCE/Test.java` depends on interface order, but it is possible that it does in some subtle way. You'll need to get help from someone on the runtime team. Mmm, interesting. OK. Thank you very much for the help. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18309#issuecomment-2022307839 From lucy at openjdk.org Wed Mar 27 11:34:25 2024 From: lucy at openjdk.org (Lutz Schmidt) Date: Wed, 27 Mar 2024 11:34:25 GMT Subject: RFR: 8327652: S390x: Implements SLP support [v7] In-Reply-To: References: Message-ID: <1SGOMkL6TvnkQDt1WkH3FbPVrbCUOD_cA3e23QK5-jg=.b9b066f9-d50a-4710-a8a5-76c2d9b83236@github.com> On Tue, 26 Mar 2024 15:10:37 GMT, Sidraya Jayagond wrote: >> This PR Adds SIMD support on s390x. > > Sidraya Jayagond has updated the pull request incrementally with one additional commit since the last revision: > > PopCountVI supported by z14 onwards. > I think we shouldn't allow `MacroAssembler::string_compress(...)` and `MacroAssembler::string_expand(...)` to use vector registers without specifying this effect. That can be solved by adding a KILL effect for all vector registers which are killed. Alternatively, we could revert to the old implementation before [d5adf1d](https://github.com/openjdk/jdk/commit/d5adf1df921e5ecb8ff4c7e4349a12660069ed28) which doesn't use vector registers. The benefit was not huge if I remember correctly. Agreed. My proposed circumvention is a too dirty hack. I would prefer to add KILL effects to the match rules. I believe the vector implementation had a substantial performance effect. Unfortunately, I can't find any records of performance results from back then. Reverting the commit @TheRealMDoerr mentioned is not possible. It contains many additions that may have been used by unrelated code. The vector code is well encapsulated and could be removed by deleting the if (VM_Version::has_VectorFacility()) { } block. I would not like that, though. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18162#issuecomment-2022548927 From lucy at openjdk.org Wed Mar 27 11:41:24 2024 From: lucy at openjdk.org (Lutz Schmidt) Date: Wed, 27 Mar 2024 11:41:24 GMT Subject: RFR: 8327652: S390x: Implements SLP support [v7] In-Reply-To: References: Message-ID: On Tue, 26 Mar 2024 15:10:37 GMT, Sidraya Jayagond wrote: >> This PR Adds SIMD support on s390x. > > Sidraya Jayagond has updated the pull request incrementally with one additional commit since the last revision: > > PopCountVI supported by z14 onwards. Decision pending how to handle string_compress and string_expand implementations. **I suggest using KILL** effects for the used vector registers. Default register allocation does not work because the code depends on registers with subsequent numbers. ------------- Changes requested by lucy (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18162#pullrequestreview-1962975031 From shade at openjdk.org Wed Mar 27 11:41:28 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 27 Mar 2024 11:41:28 GMT Subject: RFR: 8329134: Reconsider TLAB zapping Message-ID: We zap the entire TLAB on initial allocation (`MemAllocator::mem_allocate_inside_tlab_slow`), and then also rezap the object contents when object is allocated from the TLAB (`ThreadLocalAllocBuffer::allocate`). The second part seems excessive, given the TLAB is already fully zapped. There is also no way to disable this zapping, like you would in other places with the relevant Zap* flags. Fixing both these issues allows to improve fastdebug tests performance, e.g. in jcstress. It also allows to remove the related Zero kludge. 
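
To make the idea above concrete (zap once when the TLAB is handed out, then bump-allocate from it without re-zapping each object), here is a minimal C++ sketch. It is not HotSpot code: `ToyTlab`, `kZapWord`, and the `refill`/`allocate` signatures are hypothetical names invented for illustration, and the two boolean parameters only mimic the roles that flags such as `ZeroTLAB` and `ZapTLAB` would play.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Arbitrary illustrative zap pattern (an assumption, not taken from the patch).
constexpr std::uint64_t kZapWord = 0xBAADBABEBAADBABEULL;

// Hypothetical toy model of a thread-local allocation buffer.
struct ToyTlab {
  std::vector<std::uint64_t> words;  // backing storage, one entry per heap word
  std::size_t top = 0;               // index of the next free word

  // Slow path: hand out a fresh buffer and initialize it exactly once.
  // Zeroing wins over zapping; with both flags off, a real VM would leave the
  // memory untouched (std::vector always initializes, so zero stands in here).
  void refill(std::size_t size_in_words, bool zero_tlab, bool zap_tlab) {
    const std::uint64_t fill = (!zero_tlab && zap_tlab) ? kZapWord : 0;
    words.assign(size_in_words, fill);
    top = 0;
  }

  // Fast path: plain bump-pointer allocation. The memory is deliberately NOT
  // zapped again here, because refill() already initialized the whole buffer.
  std::uint64_t* allocate(std::size_t size_in_words) {
    if (words.size() - top < size_in_words) {
      return nullptr;  // caller would have to take the slow path
    }
    std::uint64_t* obj = words.data() + top;
    top += size_in_words;
    return obj;
  }
};
```

In this toy model, an object handed out by `allocate` still carries the zap pattern written by `refill`, which is the property the second, per-object zap was duplicating.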
Additional testing: - [ ] Linux AArch64 server fastdebug, `all` tests - [x] MacOS AArch64 Zero fastdebug, `bootcycle-images` pass ------------- Commit messages: - Touchups - Also remove Zero kludge - Fix Changes: https://git.openjdk.org/jdk/pull/18500/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18500&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8329134 Stats: 25 lines in 5 files changed: 3 ins; 13 del; 9 mod Patch: https://git.openjdk.org/jdk/pull/18500.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18500/head:pull/18500 PR: https://git.openjdk.org/jdk/pull/18500 From lucy at openjdk.org Wed Mar 27 11:41:25 2024 From: lucy at openjdk.org (Lutz Schmidt) Date: Wed, 27 Mar 2024 11:41:25 GMT Subject: RFR: 8327652: S390x: Implements SLP support [v6] In-Reply-To: References: Message-ID: <8VbqDq6J8Qj5ZUnbBvUjmDTF7o87KsbJkh59FJ-y07o=.aa205920-dcb0-4a7b-a5e3-28edd289fa86@github.com> On Wed, 27 Mar 2024 08:27:47 GMT, Sidraya Jayagond wrote: >> src/hotspot/cpu/s390/s390.ad line 1194: >> >>> 1192: } >>> 1193: return size; >>> 1194: } >> >> How about doing the size calculation like >> >> if (cbuf) { >> C2_MacroAssembler _masm(cbuf); >> int start_off = __ offset(); >> __ z_vlr(Rdst, Rsrc); >> size += __offset() - start_off; >> } >> >> That way, you are sure to have the correct size. And you only increase the size if you actually emit code. > > Thank you @RealLucy for taking a look. 
> I have tried your suggestion however some of the jtreg tests are crashing at below line > > # Internal Error (output.cpp:1799), pid=846930, tid=846947 > # guarantee((int)(blk_starts[i+1] - blk_starts[i]) >= (current_offset - blk_offset)) failed: shouldn't increase block size > # > # JRE version: OpenJDK Runtime Environment (23.0) (build 23-internal-adhoc.root.jdk) > # Java VM: OpenJDK 64-Bit Server VM (23-internal-adhoc.root.jdk, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-s390x) > # Problematic frame: > # V [libjvm.so+0x9f8ce6] PhaseOutput::fill_buffer(CodeBuffer*, unsigned int*)+0x153e Ahh, sorry. Forget this suggestion. Your original code is correct. There are two modes in which this code is called. One is to purely find out the size of the code to be emitted. In this case (cbuf == nullptr) you have to return the correct size anyway. The other mode is the actual code emit call (cbuf != nullptr). The correct size has to be returned here as well. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18162#discussion_r1540885603 From mdoerr at openjdk.org Wed Mar 27 13:27:31 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 27 Mar 2024 13:27:31 GMT Subject: RFR: 8327652: S390x: Implements SLP support [v7] In-Reply-To: <1SGOMkL6TvnkQDt1WkH3FbPVrbCUOD_cA3e23QK5-jg=.b9b066f9-d50a-4710-a8a5-76c2d9b83236@github.com> References: <1SGOMkL6TvnkQDt1WkH3FbPVrbCUOD_cA3e23QK5-jg=.b9b066f9-d50a-4710-a8a5-76c2d9b83236@github.com> Message-ID: On Wed, 27 Mar 2024 11:31:55 GMT, Lutz Schmidt wrote: > > I think we shouldn't allow `MacroAssembler::string_compress(...)` and `MacroAssembler::string_expand(...)` to use vector registers without specifying this effect. That can be solved by adding a KILL effect for all vector registers which are killed.
Alternatively, we could revert to the old implementation before [d5adf1d](https://github.com/openjdk/jdk/commit/d5adf1df921e5ecb8ff4c7e4349a12660069ed28) which doesn't use vector registers. The benefit was not huge if I remember correctly. > > Agreed. My proposed circumvention is a too dirty hack. > > I would prefer to add KILL effects to the match rules. I believe the vector implementation had a substantial performance effect. Unfortunately, I can't find any records of performance results from back then. > > Reverting the commit @TheRealMDoerr mentioned is not possible. It contains many additions that may have been used by unrelated code. The vector code is well encapsulated and could be removed by deleting the > > ``` > if (VM_Version::has_VectorFacility()) { > } > ``` > > block. I would not like that, though. I didn't mean to back out the whole commit. Only the implementation of string_compress and string_expand. The benefit of the vector version certainly depends on what kind of strings are used. (Effect may also be negative in some cases.) I think that classical benchmarks didn't show a significant performance impact, but I don't remember exactly, either. I'll leave the s390 maintainers free to decide if they want to adapt the vector version or go for the short and simple implementation. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18162#issuecomment-2022758192 From stefank at openjdk.org Wed Mar 27 14:04:29 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 27 Mar 2024 14:04:29 GMT Subject: RFR: 8329134: Reconsider TLAB zapping In-Reply-To: References: Message-ID: On Tue, 26 Mar 2024 21:08:16 GMT, Aleksey Shipilev wrote: > We zap the entire TLAB on initial allocation (`MemAllocator::mem_allocate_inside_tlab_slow`), and then also rezap the object contents when object is allocated from the TLAB (`ThreadLocalAllocBuffer::allocate`). The second part seems excessive, given the TLAB is already fully zapped. 
> > There is also no way to disable this zapping, like you would in other places with the relevant Zap* flags. > > Fixing both these issues allows to improve fastdebug tests performance, e.g. in jcstress. > > It also allows to remove the related Zero kludge. > > Additional testing: > - [ ] Linux AArch64 server fastdebug, `all` tests > - [x] MacOS AArch64 Zero fastdebug, `bootcycle-images` pass This looks like a worth-while change. I listed a couple of nits that I think would be nice to clean up. src/hotspot/share/gc/shared/memAllocator.cpp line 323: > 321: Copy::zero_to_words(mem, allocation._allocated_tlab_size); > 322: } else if (ZapTLAB) { > 323: // ...and zap just allocated TLAB. Could you add a blank line after this, so that the two separate comments don't get bunched together? src/hotspot/share/gc/shared/threadLocalAllocBuffer.inline.hpp line 43: > 41: if (pointer_delta(end(), obj) >= size) { > 42: // successful thread-local allocation > 43: #ifdef ASSERT Could you add a blank line after this, so that the two separate comments don't get bunched together? src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 884: > 882: Copy::zero_to_words(gclab_buf, actual_size); > 883: } else if (ZapTLAB) { > 884: // ...and zap just allocated TLAB. Could you add a blank line after this, so that the two separate comments don't get bunched together? src/hotspot/share/interpreter/zero/bytecodeInterpreter.cpp line 1996: > 1994: // Initialize object field block. > 1995: // If TLAB was pre-zeroed, we can skip this path. > 1996: if (!ZeroTLAB) { "skip this path" refers to the code inside the if block. I think it would be nicer to just place a comment in that block, so that we don't have to figure out what "this path" means. Maybe something like: if (!ZeroTLAB) { // The TLAB was not pre-zeroed, so clear the memory here. ------------- Changes requested by stefank (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/18500#pullrequestreview-1963374309 PR Review Comment: https://git.openjdk.org/jdk/pull/18500#discussion_r1541141628 PR Review Comment: https://git.openjdk.org/jdk/pull/18500#discussion_r1541142044 PR Review Comment: https://git.openjdk.org/jdk/pull/18500#discussion_r1541145082 PR Review Comment: https://git.openjdk.org/jdk/pull/18500#discussion_r1541159636 From stefank at openjdk.org Wed Mar 27 14:04:30 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 27 Mar 2024 14:04:30 GMT Subject: RFR: 8329134: Reconsider TLAB zapping In-Reply-To: References: Message-ID: On Wed, 27 Mar 2024 13:42:08 GMT, Stefan Karlsson wrote: >> We zap the entire TLAB on initial allocation (`MemAllocator::mem_allocate_inside_tlab_slow`), and then also rezap the object contents when object is allocated from the TLAB (`ThreadLocalAllocBuffer::allocate`). The second part seems excessive, given the TLAB is already fully zapped. >> >> There is also no way to disable this zapping, like you would in other places with the relevant Zap* flags. >> >> Fixing both these issues allows to improve fastdebug tests performance, e.g. in jcstress. >> >> It also allows to remove the related Zero kludge. >> >> Additional testing: >> - [ ] Linux AArch64 server fastdebug, `all` tests >> - [x] MacOS AArch64 Zero fastdebug, `bootcycle-images` pass > > src/hotspot/share/gc/shared/memAllocator.cpp line 323: > >> 321: Copy::zero_to_words(mem, allocation._allocated_tlab_size); >> 322: } else if (ZapTLAB) { >> 323: // ...and zap just allocated TLAB. > > Could you add a blank line after this, so that the two separate comments don't get bunched together? Rereading the patch, I also see that we have: // ..and clear it. ... // ...and zap just allocated TLAB. Maybe just unify this into: // ..and clear it. ... // ...and zap it. Alternatively: // ..and clear just allocated TLAB. ... // ...and zap just allocated TLAB. And while you are changing this.
What's up with the `..` vs `...`? (rhetorical question) I'd prefer if we just picked either of those. This comment also applies to the duplicated code in shenandoahHeap.cpp. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18500#discussion_r1541177284 From aph at openjdk.org Wed Mar 27 14:15:39 2024 From: aph at openjdk.org (Andrew Haley) Date: Wed, 27 Mar 2024 14:15:39 GMT Subject: RFR: JDK-8180450: secondary_super_cache does not scale well [v8] In-Reply-To: References: Message-ID: > This PR is a redesign of subtype checking. > > The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. > > So what's changed, so that the old design should be replaced? > > Firstly, the computers of today aren't the computers of twenty years ago. It's not merely a matter of speed: the systems are much more parallel, both in the sense of having more cores and each core can run many instructions in parallel. Because of this, the speed ratio between memory accesses and the rate at which we can execute instructions has become wider and wider. > > The most severe reported problem is to do with the "secondary supers cache". This is a 1-element per-class cache for interfaces (and arrays of interfaces). Unfortunately, if two threads repeatedly update this cache, the result is that a cache line ping-pongs between cores, causing a severe slowdown. > > Also, the linear search for an interface that is absent means that the entire list of interfaces has to be scanned. This plays badly with newer language features such as JEP 406, pattern matching for switch. > > However, the computers of today can help us. The very high instruction-per-cycle rate of a Great Big Out-Of-Order (GBOOO) processor allows us to execute many of the instructions of a hash table lookup in parallel, as long as we avoid dependencies between instructions.
> > The solution > ------------ > > We use a hashed lookup of secondary supers. This is a 64-way hash table, with linear probing for collisions. The table is compressed, in that null entries are removed, and the resulting hash table fits into the same secondary supers array as today's unsorted array of secondary supers. This means that existing code in HotSpot that simply does a linear scan of the secondary supers array does not need to be altered. > > We add a bitmap field to each Klass object. This bitmap contains an occupancy bit corresponding to each element of the hash table, with a 1 indicating element presence. As well as allowing the hash table to be decompressed, this bitmap is used as a simple kind of Bloom Filter. To determine whether a superclass is present, we simply have to check a single bit in the bitmap. If the bit is clear, we know that the superclass is not present. If the bit is set, we have to do a little arithmetic and then consult the hash table. > > It works like this: > > > mov sub_klass, [& sub_klass-...
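
The assembly listing in the quoted description is cut off in this archive, so as a rough stand-in, the following C++ sketch illustrates the general technique the text describes: a 64-slot hash table with linear probing, stored compressed (empty slots removed) alongside a 64-bit occupancy bitmap, where a clear bit answers "not present" immediately and a population count turns a slot number into an index in the compressed array. This is a simplified model under stated assumptions, not the PR's implementation: `KlassId`, `home_slot`, `SecondarySupers`, `build`, and `is_secondary_super` are hypothetical names, the hash function is an arbitrary placeholder, and the sketch only handles up to 64 entries.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a Klass pointer: any non-zero 64-bit id.
using KlassId = std::uint64_t;

constexpr unsigned kSlots = 64;  // one hash slot per bit of the bitmap

// Placeholder hash (an assumption, not the PR's hash): top 6 bits of a product.
inline unsigned home_slot(KlassId k) {
  return static_cast<unsigned>((k * 0x9E3779B97F4A7C15ULL) >> 58);
}

// Portable population count, to avoid depending on compiler builtins.
inline unsigned popcount64(std::uint64_t x) {
  unsigned n = 0;
  while (x != 0) {
    x &= x - 1;  // clear the lowest set bit
    ++n;
  }
  return n;
}

struct SecondarySupers {
  std::uint64_t bitmap = 0;     // bit i set <=> slot i of the full table is occupied
  std::vector<KlassId> packed;  // occupied slots only, in slot order
};

// Build the full 64-slot table with linear probing, then compress it.
SecondarySupers build(const std::vector<KlassId>& supers) {
  assert(supers.size() <= kSlots);  // the sketch does not model larger sets
  KlassId table[kSlots] = {};
  SecondarySupers result;
  for (KlassId k : supers) {
    assert(k != 0);  // 0 marks an empty slot in this sketch
    unsigned slot = home_slot(k);
    while (table[slot] != 0) {
      slot = (slot + 1) % kSlots;  // linear probing with wrap-around
    }
    table[slot] = k;
    result.bitmap |= std::uint64_t{1} << slot;
  }
  for (unsigned slot = 0; slot < kSlots; ++slot) {
    if (table[slot] != 0) {
      result.packed.push_back(table[slot]);
    }
  }
  return result;
}

bool is_secondary_super(const SecondarySupers& s, KlassId k) {
  unsigned slot = home_slot(k);
  for (unsigned probes = 0; probes < kSlots; ++probes) {
    const std::uint64_t bit = std::uint64_t{1} << slot;
    if ((s.bitmap & bit) == 0) {
      return false;  // empty slot: k cannot be in the table
    }
    // Number of occupied slots below this one == index in the packed array.
    const unsigned index = popcount64(s.bitmap & (bit - 1));
    if (s.packed[index] == k) {
      return true;
    }
    slot = (slot + 1) % kSlots;  // collision: try the next slot
  }
  return false;  // table completely full and k not found
}
```

Because `packed` holds exactly the same elements an unsorted list would, a caller that ignores the bitmap and scans `packed` linearly still gets correct answers, which matches the compatibility point made in the description.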
Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: JDK-8180450: secondary_super_cache does not scale well ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18309/files - new: https://git.openjdk.org/jdk/pull/18309/files/b465434b..bf37a658 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18309&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18309&range=06-07 Stats: 128 lines in 3 files changed: 105 ins; 16 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/18309.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18309/head:pull/18309 PR: https://git.openjdk.org/jdk/pull/18309 From ihse at openjdk.org Wed Mar 27 14:49:29 2024 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Wed, 27 Mar 2024 14:49:29 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4] In-Reply-To: <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> Message-ID: On Fri, 15 Mar 2024 13:58:05 GMT, Hamlin Li wrote: >> Hi, >> Can you help to review this patch? >> Thanks >> >> This is a continuation of work based on [1] by @XiaohongGong, most work was done in that pr. 
In this new pr, just rebased the code in [1], then added some minor changes (renaming of calling method, add libsleef as extra lib in CI cross-build on aarch64 in github workflow); I also tested the combination of following scenarios: >> * at build time >> * with/without sleef >> * with/without sve support >> * at runtime >> * with/without sleef >> * with/without sve support >> >> [1] https://github.com/openjdk/jdk/pull/16234 >> >> ## Regression Test >> * test/jdk/jdk/incubator/vector/ >> * test/hotspot/jtreg/compiler/vectorapi/ >> >> ## Performance Test >> Previously, @XiaohongGong has shared the data: https://github.com/openjdk/jdk/pull/16234#issuecomment-1767727028 > > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > fix jni includes At this point, I think I am not really qualified to continue the discussion. I will ask around inside Oracle to see if I can find someone who better understands the implications of the suggested libsleef integration (with or without source code incorporation). I can return when there is some common understanding on how to proceed, if that will lead to any build system changes. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-2022961160 From shade at openjdk.org Wed Mar 27 14:55:35 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 27 Mar 2024 14:55:35 GMT Subject: RFR: 8329134: Reconsider TLAB zapping [v2] In-Reply-To: References: Message-ID: On Wed, 27 Mar 2024 14:00:56 GMT, Stefan Karlsson wrote: >> src/hotspot/share/gc/shared/memAllocator.cpp line 323: >> >>> 321: Copy::zero_to_words(mem, allocation._allocated_tlab_size); >>> 322: } else if (ZapTLAB) { >>> 323: // ...and zap just allocated TLAB. >> >> Could you add a blank line after this, so that the two separate comments don't get bunched together? > > Rereading the patch, I also see that we have: > > // ..and clear it. > ... > // ...and zap just allocated TLAB.
> > Maybe just unify this into: > > // ..and clear it. > ... > // ...and zap it. > > Alternatively: > > // ..and clear just allocated TLAB. > ... > // ...and zap just allocated TLAB. > > > And while you are changing this. What's up with the `..` vs `...`? (rhetorical question) I'd prefer if we just picked either of those. > > This comment also applies to the duplicated code in shenandoahHeap.cpp. I meditated about this code a bit, and I think we have a cleaner third option, see new commit. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18500#discussion_r1541280635 From shade at openjdk.org Wed Mar 27 14:55:35 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 27 Mar 2024 14:55:35 GMT Subject: RFR: 8329134: Reconsider TLAB zapping [v2] In-Reply-To: References: Message-ID: > We zap the entire TLAB on initial allocation (`MemAllocator::mem_allocate_inside_tlab_slow`), and then also rezap the object contents when object is allocated from the TLAB (`ThreadLocalAllocBuffer::allocate`). The second part seems excessive, given the TLAB is already fully zapped. > > There is also no way to disable this zapping, like you would in other places with the relevant Zap* flags. > > Fixing both these issues allows to improve fastdebug tests performance, e.g. in jcstress. > > It also allows to remove the related Zero kludge.
> > Additional testing: > - [x] Linux AArch64 server fastdebug, `all` tests > - [x] MacOS AArch64 Zero fastdebug, `bootcycle-images` pass Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: Review comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18500/files - new: https://git.openjdk.org/jdk/pull/18500/files/8fe98c4c..72bf1e8a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18500&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18500&range=00-01 Stats: 10 lines in 4 files changed: 4 ins; 5 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18500.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18500/head:pull/18500 PR: https://git.openjdk.org/jdk/pull/18500 From mli at openjdk.org Wed Mar 27 15:52:35 2024 From: mli at openjdk.org (Hamlin Li) Date: Wed, 27 Mar 2024 15:52:35 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4] In-Reply-To: References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> Message-ID: On Wed, 27 Mar 2024 14:47:00 GMT, Magnus Ihse Bursie wrote: > At this point, I think I am not really qualified to continue the discussion. > > I will ask around inside Oracle to see if I can find someone who better understands the implications of the suggested libsleef integration (with or without source code incorporation). I can return when there is some common understanding on how to proceed, if that will lead to any build system changes. OK, Thanks for discussion and reviewing! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-2023113752 From mli at openjdk.org Wed Mar 27 16:03:28 2024 From: mli at openjdk.org (Hamlin Li) Date: Wed, 27 Mar 2024 16:03:28 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4] In-Reply-To: <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> Message-ID: On Fri, 15 Mar 2024 13:58:05 GMT, Hamlin Li wrote: >> Hi, >> Can you help to review this patch? >> Thanks >> >> This is a continuation of work based on [1] by @XiaohongGong, most work was done in that pr. In this new pr, just rebased the code in [1], then added some minor changes (renaming of calling method, add libsleef as extra lib in CI cross-build on aarch64 in github workflow); I also tested the combination of following scenarios: >> * at build time >> * with/without sleef >> * with/without sve support >> * at runtime >> * with/without sleef >> * with/without sve support >> >> [1] https://github.com/openjdk/jdk/pull/16234 >> >> ## Regression Test >> * test/jdk/jdk/incubator/vector/ >> * test/hotspot/jtreg/compiler/vectorapi/ >> >> ## Performance Test >> Previously, @XiaohongGong has shared the data: https://github.com/openjdk/jdk/pull/16234#issuecomment-1767727028 > > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > fix jni includes Hello @PaulSandoz, For various reasons, [previous pr](https://github.com/openjdk/jdk/pull/16234) and current pr are thought not good enough, so I did some exploration of [integrating sleef source into jdk](https://github.com/openjdk/jdk/pull/18294#issuecomment-2018839778), the solution itself is simple and maintenance is easy.
I saw in the previous discussion, you had some [comments about integrate sleef source into jdk](https://github.com/openjdk/jdk/pull/16234#issuecomment-1821885433). I'm not sure if you're available to take a look and continue the discussion and share your opinions? If positive, we can start from [here](https://github.com/openjdk/jdk/pull/18294#issuecomment-2020986186). Thanks! ------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-2023136169 From bulasevich at openjdk.org Wed Mar 27 16:15:28 2024 From: bulasevich at openjdk.org (Boris Ulasevich) Date: Wed, 27 Mar 2024 16:15:28 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission [v11] In-Reply-To: References: Message-ID: On Tue, 26 Mar 2024 19:02:42 GMT, Cesar Soares Lucas wrote: >> # Description >> >> Please review this PR with a patch to re-use the same C2_MacroAssembler object to emit all instructions in the same compilation unit. >> >> Overall, the change is pretty simple. However, due to the renaming of the variable to access C2_MacroAssembler, from `_masm.` to `masm->`, and also some method prototype changes, the patch became quite large. >> >> # Help Needed for Testing >> >> I don't have access to all platforms necessary to test this. I hope some other folks can help with testing on `S390`, `RISC-V` and `PPC`. > > Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 12 commits: > > - Merge remote-tracking branch 'origin/master' into reuse-macroasm > - Fix AArch64 build & improve comment about InstructionMark > - Catching up with changes in master > - Catching up with origin/master > - Catch up with origin/master > - Merge with origin/master > - Fix build, copyright dates, m4 files. > - Fix merge > - Catch up with master branch. > > Merge remote-tracking branch 'origin/master' into reuse-macroasm > - Some inst_mark fixes; Catch up with master. > - ...
and 2 more: https://git.openjdk.org/jdk/compare/89e0889a...b4d73c98 FYI. Something goes wrong with the change on ARM32. # # A fatal error has been detected by the Java Runtime Environment: # # Internal Error (/ws/workspace/jdk-dev/label/linux-arm/type/b11/jdk/src/hotspot/share/asm/codeBuffer.hpp:163), pid=10782, tid=10796 # assert(_mark != nullptr) failed: not an offset # # JRE version: OpenJDK Runtime Environment (23.0) (fastdebug build 23-commit8fc9097b-adhoc.re.jdk) # Java VM: OpenJDK Server VM (fastdebug 23-commit8fc9097b-adhoc.re.jdk, mixed mode, g1 gc, linux-arm) # Problematic frame: # V [libjvm.so+0x136ccc] emit_call_reloc(C2_MacroAssembler*, MachCallNode const*, MachOper*, RelocationHolder const&)+0x2ac # # Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport %p %s %c %d %P %E" (or dumping to /ws/workspace/jdk-dev-jtreg/label/linux-arm/suite/jdk-tier1/type/t11/core.10782) # # If you would like to submit a bug report, please visit: # https://bugreport.java.com/bugreport/crash.jsp # --------------- S U M M A R Y ------------ Command Line: Host: vm-ubuntu-16v4-aarch64-1, ARM, 4 cores, 7G, Ubuntu 16.04.7 LTS Time: Wed Mar 27 07:16:41 2024 UTC elapsed time: 0.097440 seconds (0d 0h 0m 0s) --------------- T H R E A D --------------- Current thread (0xb120cd10): JavaThread "C2 CompilerThread0" daemon [_thread_in_vm, id=10796, stack(0xb1090000,0xb1110000) (512K)] Stack: [0xb1090000,0xb1110000], sp=0xb110d200, free space=500k Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) V [libjvm.so+0x136ccc] emit_call_reloc(C2_MacroAssembler*, MachCallNode const*, MachOper*, RelocationHolder const&)+0x2ac (codeBuffer.hpp:163) V [libjvm.so+0x14ca28] CallRuntimeDirectNode::emit(C2_MacroAssembler*, PhaseRegAlloc*) const+0x80 V [libjvm.so+0x10ed850] PhaseOutput::scratch_emit_size(Node const*)+0x37c V [libjvm.so+0x10e5f64] PhaseOutput::shorten_branches(unsigned int*)+0x274 V [libjvm.so+0x10f6dcc] 
PhaseOutput::Output()+0x488 V [libjvm.so+0x699b54] Compile::Code_Gen()+0x424 V [libjvm.so+0x69a87c] Compile::Compile(ciEnv*, TypeFunc const* (*)(), unsigned char*, char const*, int, bool, bool, DirectiveSet*)+0xb3c V [libjvm.so+0x122e73c] OptoRuntime::generate_stub(ciEnv*, TypeFunc const* (*)(), unsigned char*, char const*, int, bool, bool)+0xb4 V [libjvm.so+0x122ec68] OptoRuntime::generate(ciEnv*)+0x50 V [libjvm.so+0x4994cc] C2Compiler::init_c2_runtime()+0x104 V [libjvm.so+0x4996dc] C2Compiler::initialize()+0x9c V [libjvm.so+0x6a371c] CompileBroker::init_compiler_runtime()+0x35c V [libjvm.so+0x6ab90c] CompileBroker::compiler_thread_loop()+0x178 V [libjvm.so+0xb30c94] JavaThread::thread_main_inner()+0x27c V [libjvm.so+0x13b3aa0] Thread::call_run()+0x130 V [libjvm.so+0x10c98bc] thread_native_entry(Thread*)+0x13c ------------- PR Comment: https://git.openjdk.org/jdk/pull/16484#issuecomment-2023165642 From stefank at openjdk.org Wed Mar 27 16:28:22 2024 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 27 Mar 2024 16:28:22 GMT Subject: RFR: 8329134: Reconsider TLAB zapping [v2] In-Reply-To: References: Message-ID: On Wed, 27 Mar 2024 14:55:35 GMT, Aleksey Shipilev wrote: >> We zap the entire TLAB on initial allocation (`MemAllocator::mem_allocate_inside_tlab_slow`), and then also rezap the object contents when object is allocated from the TLAB (`ThreadLocalAllocBuffer::allocate`). The second part seems excessive, given the TLAB is already fully zapped. >> >> There is also no way to disable this zapping, like you would in other places with the relevant Zap* flags. >> >> Fixing both these issues allows to improve fastdebug tests performance, e.g. in jcstress. >> >> It also allows to remove the related Zero kludge. 
>> >> Additional testing: >> - [x] Linux AArch64 server fastdebug, `all` tests >> - [x] MacOS AArch64 Zero fastdebug, `bootcycle-images` pass > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Review comments Thanks! Looks good. ------------- Marked as reviewed by stefank (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18500#pullrequestreview-1963856718 From lucy at openjdk.org Wed Mar 27 16:40:26 2024 From: lucy at openjdk.org (Lutz Schmidt) Date: Wed, 27 Mar 2024 16:40:26 GMT Subject: RFR: 8327652: S390x: Implements SLP support [v7] In-Reply-To: References: <1SGOMkL6TvnkQDt1WkH3FbPVrbCUOD_cA3e23QK5-jg=.b9b066f9-d50a-4710-a8a5-76c2d9b83236@github.com> Message-ID: On Wed, 27 Mar 2024 13:24:39 GMT, Martin Doerr wrote: > ... I think that classical benchmarks didn't show a significant performance impact, but I don't remember exactly, either. ... Yes, you need "long" strings (>= 32 characters) for the vector code to kick in. Once it kicks in, there is a performance improvement. Hard to say which applications might benefit. For too short strings I do not expect a visible performance penalty. It's just a shift and a branch. Let the maintainers decide. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18162#issuecomment-2023244461 From cslucas at openjdk.org Wed Mar 27 17:05:28 2024 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Wed, 27 Mar 2024 17:05:28 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission [v11] In-Reply-To: References: Message-ID: <8G-DywP4PPoHwjFOf44Ho6mc4iwPYtyfGPzSByRK6zY=.2a47a06d-8039-4fde-bd27-b3c938fcae87@github.com> On Wed, 27 Mar 2024 16:12:26 GMT, Boris Ulasevich wrote: >> Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 12 commits: >> >> - Merge remote-tracking branch 'origin/master' into reuse-macroasm >> - Fix AArch64 build & improve comment about InstructionMark >> - Catching up with changes in master >> - Catching up with origin/master >> - Catch up with origin/master >> - Merge with origin/master >> - Fix build, copyright dates, m4 files. >> - Fix merge >> - Catch up with master branch. >> >> Merge remote-tracking branch 'origin/master' into reuse-macroasm >> - Some inst_mark fixes; Catch up with master. >> - ... and 2 more: https://git.openjdk.org/jdk/compare/89e0889a...b4d73c98 > > FYI. Something goes wrong with the change on ARM32. > > # > # A fatal error has been detected by the Java Runtime Environment: > # > # Internal Error (/ws/workspace/jdk-dev/label/linux-arm/type/b11/jdk/src/hotspot/share/asm/codeBuffer.hpp:163), pid=10782, tid=10796 > # assert(_mark != nullptr) failed: not an offset > # > # JRE version: OpenJDK Runtime Environment (23.0) (fastdebug build 23-commit8fc9097b-adhoc.re.jdk) > # Java VM: OpenJDK Server VM (fastdebug 23-commit8fc9097b-adhoc.re.jdk, mixed mode, g1 gc, linux-arm) > # Problematic frame: > # V [libjvm.so+0x136ccc] emit_call_reloc(C2_MacroAssembler*, MachCallNode const*, MachOper*, RelocationHolder const&)+0x2ac > # > # Core dump will be written. 
Default location: Core dumps may be processed with "/usr/share/apport/apport %p %s %c %d %P %E" (or dumping to /ws/workspace/jdk-dev-jtreg/label/linux-arm/suite/jdk-tier1/type/t11/core.10782) > # > # If you would like to submit a bug report, please visit: > # https://bugreport.java.com/bugreport/crash.jsp > # > > --------------- S U M M A R Y ------------ > > Command Line: > > Host: vm-ubuntu-16v4-aarch64-1, ARM, 4 cores, 7G, Ubuntu 16.04.7 LTS > Time: Wed Mar 27 07:16:41 2024 UTC elapsed time: 0.097440 seconds (0d 0h 0m 0s) > > --------------- T H R E A D --------------- > > Current thread (0xb120cd10): JavaThread "C2 CompilerThread0" daemon [_thread_in_vm, id=10796, stack(0xb1090000,0xb1110000) (512K)] > > Stack: [0xb1090000,0xb1110000], sp=0xb110d200, free space=500k > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0x136ccc] emit_call_reloc(C2_MacroAssembler*, MachCallNode const*, MachOper*, RelocationHolder const&)+0x2ac (codeBuffer.hpp:163) > V [libjvm.so+0x14ca28] CallRuntimeDirectNode::emit(C2_MacroAssembler*, PhaseRegAlloc*) const+0x80 > V [libjvm.so+0x10ed850] PhaseOutput::scratch_emit_size(Node const*)+0x37c > V [libjvm.so+0x10e5f64] PhaseOutput::shorten_branches(unsigned int*)+0x274 > V [libjvm.so+0x10f6dcc] PhaseOutput::Output()+0x488 > V [libjvm.so+0x699b54] Compile::Code_Gen()+0x424 > V [libjvm.so+0x69a87c] Compile::Compile(ciEnv*, TypeFunc const* (*)(), unsigned char*, char const*, int, bool, bool, DirectiveSet*)+0xb3c > V [libjvm.so+0x122e73c] OptoRuntime::generate_stub(ciEnv*, TypeFunc const* (*)(), unsigned char*, char const*, int, bool, bool)+0xb4 > V [libjvm.so+0x122ec68] OptoRuntime::generate(ciEnv*)+0x50 > V [libjvm.so+0x4994cc] C2Compiler::init_c2_runtime()+0x104 > V [libjvm.so+0x4996dc] C2Compiler::initialize()+0x9c > V [libjvm.so+... Thanks for testing @bulasevich . I'm going to take a look and get back to you asap. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16484#issuecomment-2023322118 From cslucas at openjdk.org Wed Mar 27 17:06:27 2024 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Wed, 27 Mar 2024 17:06:27 GMT Subject: RFR: JDK-8316991: Reduce nullable allocation merges [v8] In-Reply-To: <3lNNenuz-e6q84ZJmTxQ425Z31mNXp5e5PqhhfXSUPs=.59818092-73c1-4498-a76c-d621a9420e46@github.com> References: <_DCkU0ejXoVIyAy1q_anbEoXkSWtMHuczrTGitIi1X8=.3c39f56e-d4bf-4673-a325-c6ac02620e79@github.com> <3lNNenuz-e6q84ZJmTxQ425Z31mNXp5e5PqhhfXSUPs=.59818092-73c1-4498-a76c-d621a9420e46@github.com> Message-ID: On Fri, 22 Mar 2024 00:53:34 GMT, Vladimir Kozlov wrote: >> Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 12 commits: >> >> - Catching up with master >> - Fix broken build. >> - Merge with origin/master >> - Update test/micro/org/openjdk/bench/vm/compiler/AllocationMerges.java >> >> Co-authored-by: Andrey Turbanov >> - Ammend previous fix & add repro tests. >> - Fix to prevent reducing already reduced Phi >> - Fix to prevent creating NULL ConNKlass constants. >> - Refrain from RAM of arrays and Phis controlled by Loop nodes. >> - Fix typo in test. >> - Fix build after merge. >> - ... and 2 more: https://git.openjdk.org/jdk/compare/7231fd78...620f9da8 > > Okay. I submitted our testing. Thank you so much for helping @vnkozlov ! Who else do you think can help reviewing this PR? ------------- PR Comment: https://git.openjdk.org/jdk/pull/15825#issuecomment-2023324654 From simonis at openjdk.org Wed Mar 27 17:32:32 2024 From: simonis at openjdk.org (Volker Simonis) Date: Wed, 27 Mar 2024 17:32:32 GMT Subject: RFR: 8329204: Diagnostic command for zeroing unused parts of the heap Message-ID: Diagnostic command for zeroing unused parts of the heap I propose to add a new diagnostic command `System.zero_unused_memory` which zeros out all unused parts of the heap. 
The name of the command is intentionally GC/heap agnostic because in the future it might be extended to also zero unused parts of the Metaspace and/or CodeCache. Currently `System.zero_unused_memory` triggers a full GC and afterwards zeros unused parts of the heap. Zeroing can help snapshotting technologies like [CRIU][1] or [Firecracker][2] to shrink the snapshot size of VMs/containers with running JVM processes because pages which only contain zero bytes can be easily removed from the image by making the image *sparse* (e.g. with [`fallocate -p`][3]). Notice that uncommitting unused heap parts in the JVM doesn't help in the context of virtualization (e.g. KVM/Firecracker) because from the host perspective they are still dirty and can't be easily removed from the snapshot image because they usually contain some non-zero data. More details can be found in my FOSDEM talk ["Zeroing and the semantic gap between host and guest"][4]. Furthermore, removing pages which only contain zero bytes (i.e. "empty pages") from a snapshot image not only decreases the image size but also speeds up the restore process because empty pages don't have to be read from the image file but will be populated by the kernel zero page first until they are used for the first time. This also decreases the initial memory footprint of a restored process. An additional argument for memory zeroing is security. By zeroing unused heap parts, we can make sure that secrets contained in unreferenced Java objects are deleted. This is currently impossible to achieve from Java because even if a Java program zeroes out arrays with sensitive data after usage, it can never guarantee that the corresponding object hasn't already been moved by the GC and an old, unreferenced copy of that data still exists somewhere in the heap. A prototype implementation for this proposal for Serial, Parallel, G1 and Shenandoah GC is available in the linked pull request.
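[Editorial sketch, not part of the original mail or the linked PR.] The argument above rests on pages that contain only zero bytes being cheap to drop from a snapshot image. As a rough illustration of that idea, the following counts how many whole pages of a buffer are entirely zero. The 4096-byte page size and the names `kPageSize`, `is_zero_page` and `count_zero_pages` are assumptions made for this example; none of them come from HotSpot, CRIU or Firecracker.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed page size for illustration; real code would query the OS.
constexpr std::size_t kPageSize = 4096;

// Returns true if every byte in [page, page + len) is zero.
bool is_zero_page(const std::uint8_t* page, std::size_t len) {
  for (std::size_t i = 0; i < len; ++i) {
    if (page[i] != 0) return false;
  }
  return true;
}

// Counts the whole pages in buf that consist only of zero bytes.
// A trailing partial page is ignored.
std::size_t count_zero_pages(const std::vector<std::uint8_t>& buf) {
  std::size_t zero_pages = 0;
  for (std::size_t off = 0; off + kPageSize <= buf.size(); off += kPageSize) {
    if (is_zero_page(buf.data() + off, kPageSize)) ++zero_pages;
  }
  return zero_pages;
}
```

A single stale non-zero byte is enough to keep a page out of the count, which is why unused but unzeroed heap memory still shows up in a snapshot.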
[1]: https://criu.org [2]: https://github.com/firecracker-microvm/firecracker/blob/main/docs/snapshotting/snapshot-support.md [3]: https://man7.org/linux/man-pages/man1/fallocate.1.html [4]: https://fosdem.org/2024/schedule/event/fosdem-2024-3454-zeroing-and-the-semantic-gap-between-host-and-guest/ ------------- Commit messages: - Make VM_ZeroUnusedMemory a VM_GC_Sync_Operation - Implement unused memory zeroing for ShenadoahGC and move the zeroing part into a VM operation - Implement unused memory zeroing for G1GC - Implement unused memory zeroing for ParallelGC - 8329204: Diagnostic command for zeroing unused parts of the heap Changes: https://git.openjdk.org/jdk/pull/18521/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18521&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8329204 Stats: 187 lines in 29 files changed: 187 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18521.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18521/head:pull/18521 PR: https://git.openjdk.org/jdk/pull/18521 From vlivanov at openjdk.org Wed Mar 27 17:44:28 2024 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Wed, 27 Mar 2024 17:44:28 GMT Subject: RFR: JDK-8316991: Reduce nullable allocation merges [v9] In-Reply-To: References: Message-ID: On Tue, 26 Mar 2024 19:01:55 GMT, Cesar Soares Lucas wrote: >> ### Description >> >> Many, if not most, allocation merges (Phis) are nullable because they join object allocations with "NULL", or objects returned from method calls, etc. Please review this Pull Request that improves Reduce Allocation Merge implementation so that it can reduce at least some of these allocation merges. >> >> Overall, the improvements are related to 1) making rematerialization of merges able to represent "NULL" objects, and 2) being able to reduce merges used by CmpP/N and CastPP. 
>> >> The approach to reducing CmpP/N and CastPP is pretty similar to that used in the `MemNode::split_through_phi` method: a clone of the node being split is added on each input of the Phi. I make use of `optimize_ptr_compare` and some type information to remove redundant CmpP and CastPP nodes. I added a bunch of ASCII diagrams illustrating what some of the more important methods are doing. >> >> ### Benchmarking >> >> **Note:** In some of these tests no reduction happens. I left them in to validate that no perf. regression happens in that case. >> **Note 2:** Margin of error was negligible. >> >> | Benchmark | No RAM (ms/op) | Yes RAM (ms/op) | >> |--------------------------------------|------------------|-------------------| >> | TestTrapAfterMerge | 19.515 | 13.386 | >> | TestArgEscape | 33.165 | 33.254 | >> | TestCallTwoSide | 70.547 | 69.427 | >> | TestCmpAfterMerge | 16.400 | 2.984 | >> | TestCmpMergeWithNull_Second | 27.204 | 27.293 | >> | TestCmpMergeWithNull | 8.248 | 4.920 | >> | TestCondAfterMergeWithAllocate | 12.890 | 5.252 | >> | TestCondAfterMergeWithNull | 6.265 | 5.078 | >> | TestCondLoadAfterMerge | 12.713 | 5.163 | >> | TestConsecutiveSimpleMerge | 30.863 | 4.068 | >> | TestDoubleIfElseMerge | 16.069 | 2.444 | >> | TestEscapeInCallAfterMerge | 23.111 | 22.924 | >> | TestGlobalEscape | 14.459 | 14.425 | >> | TestIfElseInLoop | 246.061 | 42.786 | >> | TestLoadAfterLoopAlias | 45.808 | 45.812 | >> ... > > Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 13 commits: > > - Merge remote-tracking branch 'origin/master' into ram-nullables > - Catching up with master > - Fix broken build. > - Merge with origin/master > - Update test/micro/org/openjdk/bench/vm/compiler/AllocationMerges.java > > Co-authored-by: Andrey Turbanov > - Ammend previous fix & add repro tests. > - Fix to prevent reducing already reduced Phi > - Fix to prevent creating NULL ConNKlass constants.
> - Refrain from RAM of arrays and Phis controlled by Loop nodes. > - Fix typo in test. > - ... and 3 more: https://git.openjdk.org/jdk/compare/89e0889a...3129378f Overall, looks good. And a nice set of test cases! Just minor stylistic comments. src/hotspot/share/code/debugInfo.hpp line 136: > 134: Handle _value; > 135: bool _visited; > 136: bool _was_scalar_replaced; // Whether this ObjectValue describes an object scalar replaced or just Name it `is_scalar_replaced` maybe? src/hotspot/share/opto/escape.cpp line 472: > 470: > 471: // Don't handle arrays. > 472: if (alloc->Opcode() != Op_Allocate) { Turn it into `alloc->Opcode() == Op_AllocateArray` or put `assert(alloc->Opcode() == Op_AllocateArray, "")` inside the branch. src/hotspot/share/opto/escape.cpp line 2976: > 2974: // > 2975: if (field->base_count() > 1 && candidates.size() == 0) { > 2976: bool further_validate = false; A better name maybe? You can also extract its computation into a helper method (e.g. `has_non_reducible_merge(FieldNode* field)`). src/hotspot/share/opto/memnode.cpp line 1552: > 1550: // by 'split_through_phi'. The first use of this method was in EA code as part > 1551: // of simplification of allocation merges. > 1552: // Some differences from original method: What "original method" do you refer to here? src/hotspot/share/opto/memnode.cpp line 1559: > 1557: intptr_t ignore = 0; > 1558: Node* base = AddPNode::Ideal_base_and_offset(address, phase, ignore); > 1559: base = (base->is_CastPP()) ? base->in(1) : base; With the proposed indentation it's hard to read. I'd go even further and turn it into an if: if (base->is_CastPP()) { base = base->in(1); } ------------- Marked as reviewed by vlivanov (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/15825#pullrequestreview-1961704899 PR Review Comment: https://git.openjdk.org/jdk/pull/15825#discussion_r1540097586 PR Review Comment: https://git.openjdk.org/jdk/pull/15825#discussion_r1540096986 PR Review Comment: https://git.openjdk.org/jdk/pull/15825#discussion_r1540093774 PR Review Comment: https://git.openjdk.org/jdk/pull/15825#discussion_r1541553698 PR Review Comment: https://git.openjdk.org/jdk/pull/15825#discussion_r1540092689 From kvn at openjdk.org Wed Mar 27 17:58:25 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 27 Mar 2024 17:58:25 GMT Subject: RFR: JDK-8316991: Reduce nullable allocation merges [v9] In-Reply-To: References: Message-ID: On Tue, 26 Mar 2024 20:51:33 GMT, Vladimir Ivanov wrote: >> Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 13 commits: >> >> - Merge remote-tracking branch 'origin/master' into ram-nullables >> - Catching up with master >> - Fix broken build. >> - Merge with origin/master >> - Update test/micro/org/openjdk/bench/vm/compiler/AllocationMerges.java >> >> Co-authored-by: Andrey Turbanov >> - Ammend previous fix & add repro tests. >> - Fix to prevent reducing already reduced Phi >> - Fix to prevent creating NULL ConNKlass constants. >> - Refrain from RAM of arrays and Phis controlled by Loop nodes. >> - Fix typo in test. >> - ... and 3 more: https://git.openjdk.org/jdk/compare/89e0889a...3129378f > > src/hotspot/share/opto/escape.cpp line 472: > >> 470: >> 471: // Don't handle arrays. >> 472: if (alloc->Opcode() != Op_Allocate) { > > Turn it into `alloc->Opcode() == Op_AllocateArray` or put `assert(alloc->Opcode() == Op_AllocateArray, "")` inside the branch. I suggest adding the assert inside the branch to catch the case when we add another Allocate subclass. Also, the `ptn->ideal_node()->is_Allocate()` check at line 468 is not needed because `as_Allocate()` already has such an assert check.
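[Editorial sketch, not part of the original mail.] The suggestion above, keeping the `!= Op_Allocate` test but asserting inside the branch that the only other case is the array one, can be illustrated in isolation. The enum and the function `is_reducible_allocation` below are hypothetical stand-ins written for this example. They are not the real C2 opcodes or the code in `escape.cpp`, and the standard `assert` is used in place of HotSpot's own `assert(cond, msg)` macro.

```cpp
#include <cassert>

// Hypothetical stand-ins for the two opcodes discussed in the review.
enum Opcode { Op_Allocate, Op_AllocateArray };

// Returns true for plain object allocations; arrays are not handled.
bool is_reducible_allocation(Opcode op) {
  if (op != Op_Allocate) {
    // Documents the only other case expected today. If a new Allocate
    // subclass is added later, a debug build trips here instead of
    // silently treating it like an array.
    assert(op == Op_AllocateArray && "unexpected Allocate subclass");
    return false;
  }
  return true;
}
```

The branch condition stays permissive, so release builds behave as before, while debug builds flag any opcode nobody considered when the check was written.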
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15825#discussion_r1541592986 From simonis at openjdk.org Wed Mar 27 18:10:34 2024 From: simonis at openjdk.org (Volker Simonis) Date: Wed, 27 Mar 2024 18:10:34 GMT Subject: RFR: 8329204: Diagnostic command for zeroing unused parts of the heap [v2] In-Reply-To: References: Message-ID: > Diagnostic command for zeroing unused parts of the heap > > I propose to add a new diagnostic command `System.zero_unused_memory` which zeros out all unused parts of the heap. The name of the command is intentionally GC/heap agnostic because in the future it might be extended to also zero unused parts of the Metaspace and/or CodeCache. > > Currently `System.zero_unused_memory` triggers a full GC and afterwards zeros unused parts of the heap. Zeroing can help snapshotting technologies like [CRIU][1] or [Firecracker][2] to shrink the snapshot size of VMs/containers with running JVM processes because pages which only contain zero bytes can be easily removed from the image by making the image *sparse* (e.g. with [`fallocate -p`][3]). > > Notice that uncommitting unused heap parts in the JVM doesn't help in the context of virtualization (e.g. KVM/Firecracker) because from the host perspective they are still dirty and can't be easily removed from the snapshot image because they usually contain some non-zero data. More details can be found in my FOSDEM talk ["Zeroing and the semantic gap between host and guest"][4]. > > Furthermore, removing pages which only contain zero bytes (i.e. "empty pages") from a snapshot image not only decreases the image size but also speeds up the restore process because empty pages don't have to be read from the image file but will be populated by the kernel zero page first until they are used for the first time. This also decreases the initial memory footprint of a restored process. > > An additional argument for memory zeroing is security. 
By zeroing unused heap parts, we can make sure that secrets contained in unreferenced Java objects are deleted. This is currently impossible to achieve from Java because even if a Java program zeroes out arrays with sensitive data after usage, it can never guarantee that the corresponding object hasn't already been moved by the GC and an old, unreferenced copy of that data still exists somewhere in the heap. > > A prototype implementation for this proposal for Serial, Parallel, G1 and Shenandoah GC is available in the linked pull request. > > [1]: https://criu.org > [2]: https://github.com/firecracker-microvm/firecracker/blob/main/docs/snapshotting/snapshot-support.md > [3]: https://man7.org/linux/man-pages/man1/fallocate.1.html > [4]: https://fosdem.org/2024/schedule/event/fosdem-2024-3454-zeroing-and-the-semantic-gap-between-host-and-guest/ Volker Simonis has updated the pull request incrementally with one additional commit since the last revision: Fix build error on MacOs (with clang, if we use 'override' for a virtual method we have to use it for all methods to avoid '-Winconsistent-missing-override') ------------- Changes: - all: https://git.openjdk.org/jdk/pull/18521/files - new: https://git.openjdk.org/jdk/pull/18521/files/b06aa327..4264e53d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18521&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18521&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/18521.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18521/head:pull/18521 PR: https://git.openjdk.org/jdk/pull/18521 From coleenp at openjdk.org Wed Mar 27 23:47:32 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 27 Mar 2024 23:47:32 GMT Subject: RFR: 8328507: Move StackWatermark code from safepoint cleanup [v2] In-Reply-To: References: Message-ID: On Tue, 19 Mar 2024 14:41:33 GMT, Erik Österlund wrote: >> There is some stack watermark code in safepoint
cleanup that flushes out concurrent stack processing, as a defensive measure to guard safepoint operations that assume the stacks have been fixed up. However, we wish to get rid of safepoint cleanup so this operation should move out from the safepoint. This proposal moves it into CollectedHeap::safepoint_synchronize_begin instead which runs before we start shutting down Java threads. This moves the code into more obvious GC places and reduces time spent inside of non-GC safepoints. > > Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: > > Remove incorrect assert This makes sense. ------------- Marked as reviewed by coleenp (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18378#pullrequestreview-1964934082 From sviswanathan at openjdk.org Thu Mar 28 00:07:32 2024 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Thu, 28 Mar 2024 00:07:32 GMT Subject: RFR: 8320794: Emulate rest of vblendvp[sd] on ECore [v2] In-Reply-To: References: <8ajDeYtrlyZUXnTl29xwLr1rwGIYzjj5wThm9yjrBVY=.c75c1992-c836-4969-aea4-e3cbf428dfad@github.com> Message-ID: On Tue, 19 Mar 2024 15:02:32 GMT, Volodymyr Paprotski wrote: >> Replace vpblendvp[sd] with macro assembler call and test in: >> - `C2_MacroAssembler::vector_cast_float_to_int_special_cases_avx` (insufficient registers for 1 of 2 blends) >> - `C2_MacroAssembler::vector_cast_double_to_int_special_cases_avx` >> - `C2_MacroAssembler::vector_count_leading_zeros_int_avx` >> >> Functional testing with existing and new tests: >> `make test TEST="test/hotspot/jtreg/compiler/vectorapi/reshape test/hotspot/jtreg/compiler/vectorization/runner/BasicIntOpTest.java"` >> >> Benchmarking with existing and new tests: >> >> make test TEST="micro:org.openjdk.bench.jdk.incubator.vector.VectorFPtoIntCastOperations.microFloat256ToInteger256" >> make test TEST="micro:org.openjdk.bench.jdk.incubator.vector.VectorFPtoIntCastOperations.microDouble256ToInteger256" >> make test
TEST="micro:org.openjdk.bench.vm.compiler.VectorBitCount.WithSuperword.intLeadingZeroCount" >> >> >> Performance before: >> >> Benchmark (SIZE) Mode Cnt Score Error Units >> VectorFPtoIntCastOperations.microDouble256ToInteger256 512 thrpt 5 17271.078 ± 184.140 ops/ms >> VectorFPtoIntCastOperations.microDouble256ToInteger256 1024 thrpt 5 9310.507 ± 88.136 ops/ms >> VectorFPtoIntCastOperations.microFloat256ToInteger256 512 thrpt 5 11137.594 ± 19.009 ops/ms >> VectorFPtoIntCastOperations.microFloat256ToInteger256 1024 thrpt 5 5425.001 ± 3.136 ops/ms >> VectorBitCount.WithSuperword.intLeadingZeroCount 1024 0 thrpt 4 0.994 ± 0.002 ops/us >> >> >> Performance after: >> >> Benchmark (SIZE) Mode Cnt Score Error Units >> VectorFPtoIntCastOperations.microDouble256ToInteger256 512 thrpt 5 19222.048 ± 87.622 ops/ms >> VectorFPtoIntCastOperations.microDouble256ToInteger256 1024 thrpt 5 9233.245 ± 123.493 ops/ms >> VectorFPtoIntCastOperations.microFloat256ToInteger256 512 thrpt 5 11672.806 ± 10.854 ops/ms >> VectorFPtoIntCastOperations.microFloat256ToInteger256 1024 thrpt 5 6009.735 ± 12.173 ops/ms >> VectorBitCount.WithSuperword.intLeadingZeroCount 1024 0 thrpt 4 1.039 ±
0.004 ops/us > > Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision: > > Fix double pasted test src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4871: > 4869: vpxor(xtmp3, xtmp2, xtmp4, vec_enc); > 4870: > 4871: vblendvps(dst, dst, xtmp1, xtmp3, vec_enc, true, xtmp4); The vblendvps at line 4861 could also be emulated: From: vpxor(xtmp4, xtmp4, xtmp4, vec_enc); vcmpps(xtmp3, src, src, Assembler::UNORD_Q, vec_enc); vblendvps(dst, dst, xtmp4, xtmp3, vec_enc); To: vpxor(xtmp4, xtmp4, xtmp4, vec_enc); vcmpps(xtmp3, src, src, Assembler::UNORD_Q, vec_enc); vblendvps(dst, dst, xtmp4, xtmp3, vec_enc, false, xtmp4); src/hotspot/cpu/x86/macroAssembler_x86.cpp line 3524: > 3522: bool blend_emulation = EnableX86ECoreOpts && UseAVX > 1; > 3523: bool scratch_available = scratch != xnoreg && scratch != src1 && scratch != src2 && scratch != dst; > 3524: bool dst_available = (dst != mask || compute_mask) && (dst != src1 || dst != src2); There are two paths here: Path 1: When compute_mask == true scratch_available = (scratch != xnoreg) && (scratch != src1) && (scratch != src2) && (scratch != dst); dst_available = (dst != mask) && (dst != src1 || dst != src2); Path 2: When compute_mask == false scratch_available = (scratch != xnoreg) && (scratch != dst); dst_available = (dst != mask) && (dst != src1 || dst != src2); ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18310#discussion_r1542167094 PR Review Comment: https://git.openjdk.org/jdk/pull/18310#discussion_r1542164547 From dlong at openjdk.org Thu Mar 28 05:59:39 2024 From: dlong at openjdk.org (Dean Long) Date: Thu, 28 Mar 2024 05:59:39 GMT Subject: RFR: 8324833: Signed integer overflows in ABS [v5] In-Reply-To: References: Message-ID: On Thu, 22 Feb 2024 08:53:24 GMT, Aleksey Shipilev wrote: >> See the details in the bug. I think current `ABS` implementation is beyond repair, and we should just switch to `uabs`. 
>> >> Additional testing: >> - [x] Linux x86_64 fastdebug, `all` with `-ftrapv` (now fully passes!) >> - [x] Linux x86_64 fastdebug, `all` > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Fix a place where we cast to jlong after uabs Unfortunately, I'm now convinced this is the wrong approach and introduces bugs. I would rather see illegal inputs caught (JDK-8328934) and prevented on a case-by-case basis (see JDK-8329163 for example). ------------- Changes requested by dlong (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17617#pullrequestreview-1965226153 From shade at openjdk.org Thu Mar 28 08:15:40 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 28 Mar 2024 08:15:40 GMT Subject: RFR: 8324833: Signed integer overflows in ABS [v5] In-Reply-To: References: Message-ID: <5uhBtRvmLYTH1fJ663YeHTth59xxDGUroGSEjS7vh_Q=.e8cae513-d1f4-4779-934a-178b65d8215f@github.com> On Thu, 22 Feb 2024 08:53:24 GMT, Aleksey Shipilev wrote: >> See the details in the bug. I think current `ABS` implementation is beyond repair, and we should just switch to `uabs`. >> >> Additional testing: >> - [x] Linux x86_64 fastdebug, `all` with `-ftrapv` (now fully passes!) >> - [x] Linux x86_64 fastdebug, `all` > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Fix a place where we cast to jlong after uabs All right, I am abandoning this PR. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17617#issuecomment-2024637160 From shade at openjdk.org Thu Mar 28 08:15:41 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 28 Mar 2024 08:15:41 GMT Subject: Withdrawn: 8324833: Signed integer overflows in ABS In-Reply-To: References: Message-ID: On Mon, 29 Jan 2024 15:59:49 GMT, Aleksey Shipilev wrote: > See the details in the bug. I think current `ABS` implementation is beyond repair, and we should just switch to `uabs`. 
> > Additional testing: > - [x] Linux x86_64 fastdebug, `all` with `-ftrapv` (now fully passes!) > - [x] Linux x86_64 fastdebug, `all` This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/17617 From shade at openjdk.org Thu Mar 28 10:55:34 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 28 Mar 2024 10:55:34 GMT Subject: RFR: 8329134: Reconsider TLAB zapping [v2] In-Reply-To: References: Message-ID: On Wed, 27 Mar 2024 14:55:35 GMT, Aleksey Shipilev wrote: >> We zap the entire TLAB on initial allocation (`MemAllocator::mem_allocate_inside_tlab_slow`), and then also rezap the object contents when object is allocated from the TLAB (`ThreadLocalAllocBuffer::allocate`). The second part seems excessive, given the TLAB is already fully zapped. >> >> There is also no way to disable this zapping, like you would in other places with the relevant Zap* flags. >> >> Fixing both these issues allows to improve fastdebug tests performance, e.g. in jcstress. >> >> It also allows to remove the related Zero kludge. >> >> Additional testing: >> - [x] Linux AArch64 server fastdebug, `all` tests >> - [x] MacOS AArch64 Zero fastdebug, `bootcycle-images` pass > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Review comments Thanks! I would wait for more reviews now. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18500#issuecomment-2024904926 From rkennke at openjdk.org Thu Mar 28 11:12:36 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 28 Mar 2024 11:12:36 GMT Subject: RFR: 8329134: Reconsider TLAB zapping [v2] In-Reply-To: References: Message-ID: On Wed, 27 Mar 2024 14:55:35 GMT, Aleksey Shipilev wrote: >> We zap the entire TLAB on initial allocation (`MemAllocator::mem_allocate_inside_tlab_slow`), and then also rezap the object contents when object is allocated from the TLAB (`ThreadLocalAllocBuffer::allocate`). 
The second part seems excessive, given the TLAB is already fully zapped. >> >> There is also no way to disable this zapping, like you would in other places with the relevant Zap* flags. >> >> Fixing both these issues allows to improve fastdebug tests performance, e.g. in jcstress. >> >> It also allows to remove the related Zero kludge. >> >> Additional testing: >> - [x] Linux AArch64 server fastdebug, `all` tests >> - [x] MacOS AArch64 Zero fastdebug, `bootcycle-images` pass > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Review comments Looks good to me. Thank you! ------------- Marked as reviewed by rkennke (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18500#pullrequestreview-1965810416 From eosterlund at openjdk.org Thu Mar 28 14:02:31 2024 From: eosterlund at openjdk.org (Erik Österlund) Date: Thu, 28 Mar 2024 14:02:31 GMT Subject: RFR: 8328507: Move StackWatermark code from safepoint cleanup [v2] In-Reply-To: References: Message-ID: On Wed, 27 Mar 2024 23:44:52 GMT, Coleen Phillimore wrote: >> Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove incorrect assert > > This makes sense. Thanks for the review, @coleenp and @xmas92! ------------- PR Comment: https://git.openjdk.org/jdk/pull/18378#issuecomment-2025256046 From eosterlund at openjdk.org Thu Mar 28 14:14:39 2024 From: eosterlund at openjdk.org (Erik Österlund) Date: Thu, 28 Mar 2024 14:14:39 GMT Subject: Integrated: 8328507: Move StackWatermark code from safepoint cleanup In-Reply-To: References: Message-ID: On Tue, 19 Mar 2024 13:30:39 GMT, Erik Österlund wrote: > There is some stack watermark code in safepoint cleanup that flushes out concurrent stack processing, as a defensive measure to guard safepoint operations that assume the stacks have been fixed up.
However, we wish to get rid of safepoint cleanup so this operation should move out from the safepoint. This proposal moves it into CollectedHeap::safepoint_synchronize_begin instead which runs before we start shutting down Java threads. This moves the code into more obvious GC places and reduces time spent inside of non-GC safepoints. This pull request has now been integrated. Changeset: aa595dbd Author: Erik Österlund URL: https://git.openjdk.org/jdk/commit/aa595dbda4e64d76eeaff941b2f1e1ef2414d7af Stats: 76 lines in 11 files changed: 23 ins; 45 del; 8 mod 8328507: Move StackWatermark code from safepoint cleanup Reviewed-by: aboldtch, coleenp ------------- PR: https://git.openjdk.org/jdk/pull/18378 From jkern at openjdk.org Thu Mar 28 16:56:55 2024 From: jkern at openjdk.org (Joachim Kern) Date: Thu, 28 Mar 2024 16:56:55 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc Message-ID: As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. Thus the old xlc toolchain was removed by [JDK-8327701](https://bugs.openjdk.org/browse/JDK-8327701). Now we also switch the HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc, removing the last xlc rudiment. This means merging the AIX specific content of utilities/globalDefinitions_xlc.hpp and utilities/compilerWarnings_xlc.hpp into the corresponding gcc files on the one side and removing the defined(TARGET_COMPILER_xlc) blocks in the code, because the defined(TARGET_COMPILER_gcc) blocks work out of the box for the new AIX compiler.
The rest of the changes are needed because of using utilities/compilerWarnings_xlc.hpp the compiler is much more nagging about ill formatted printf ------------- Commit messages: - JDK-8329257 Changes: https://git.openjdk.org/jdk/pull/18536/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18536&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8329257 Stats: 261 lines in 13 files changed: 21 ins; 208 del; 32 mod Patch: https://git.openjdk.org/jdk/pull/18536.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18536/head:pull/18536 PR: https://git.openjdk.org/jdk/pull/18536 From jkern at openjdk.org Thu Mar 28 16:56:55 2024 From: jkern at openjdk.org (Joachim Kern) Date: Thu, 28 Mar 2024 16:56:55 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc In-Reply-To: References: Message-ID: On Thu, 28 Mar 2024 16:50:20 GMT, Joachim Kern wrote: > As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. Thus the old xlc toolchain was removed by [JDK-8327701](https://bugs.openjdk.org/browse/JDK-8327701). > Now we also switch the HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc, removing the last xlc rudiment. > This means merging the AIX specific content of utilities/globalDefinitions_xlc.hpp and utilities/compilerWarnings_xlc.hpp into the corresponding gcc files on the one side and removing the defined(TARGET_COMPILER_xlc) blocks in the code, because the defined(TARGET_COMPILER_gcc) blocks work out of the box for the new AIX compiler.
> The rest of the changes are needed because of using utilities/compilerWarnings_xlc.hpp the compiler is much more nagging about ill formatted printf src/hotspot/share/utilities/globalDefinitions_gcc.hpp line 83: > 81: #error "xlc version not supported, macro __open_xl_version__ not found" > 82: #endif > 83: #endif // AIX This `#ifdef _AIX` might be obsolete, because configure will throw a warning if the compiler has a lower version, but it's only a warning. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1543307859 From jkern at openjdk.org Thu Mar 28 16:59:31 2024 From: jkern at openjdk.org (Joachim Kern) Date: Thu, 28 Mar 2024 16:59:31 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc In-Reply-To: References: Message-ID: On Thu, 28 Mar 2024 16:50:20 GMT, Joachim Kern wrote: > As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. Thus the old xlc toolchain was removed by [JDK-8327701](https://bugs.openjdk.org/browse/JDK-8327701). > Now we also switch the HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc, removing the last xlc rudiment. > This means merging the AIX specific content of utilities/globalDefinitions_xlc.hpp and utilities/compilerWarnings_xlc.hpp into the corresponding gcc files on the one side and removing the defined(TARGET_COMPILER_xlc) blocks in the code, because the defined(TARGET_COMPILER_gcc) blocks work out of the box for the new AIX compiler. > The rest of the changes are needed because of using utilities/compilerWarnings_xlc.hpp the compiler is much more nagging about ill formatted printf src/hotspot/share/utilities/globalDefinitions_gcc.hpp line 50: > 48: #undef malloc > 49: extern void *malloc(size_t) asm("vec_malloc"); > 50: #endif This `#if` is not needed if we are building on AIX 7.2 TL5 SP7 or higher.
This is the case in our build environment, but I think IBM is still building with SP5, which would run into build errors if I remove this `#if` now. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1543312492 From rehn at openjdk.org Thu Mar 28 17:15:35 2024 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 28 Mar 2024 17:15:35 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4] In-Reply-To: References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> Message-ID: On Mon, 25 Mar 2024 15:13:54 GMT, Andrew Haley wrote: >> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: >> >> fix jni includes > >> > > But that raises an interesting question. What happens if you try to load a library compiled with `-march=armv8-a+sve` on a non-SVE system? Is the ELF flagged to require SVE so it will fail to load? I'm hoping this is the case -- if so, everything will work as it should in this PR, but I honestly don't know. (After spending like 10 years working with building, I still discover things I need to learn...). >> > >> > >> > I think we can handle it, when a jdk built with sve support runs on a non-sve machine, the sve related code will not be executed with the protection of UseSVE vm flag which is detected at runtime startup. >> >> First of all, my question was of a more general nature. Is there such a mechanism in the dynamic linker that protects us from loading libraries that will fail if an ISA extension is used that is missing on the running system? Or do the linker just check that the ELF is for the right CPU? > > That shouldn't matter to us. If the machine we're running on doesn't support what we need, why should we even try to open it? 
> >> Secondly, I assume that libsleef.so proper needs to be compiled with SVE support as well. So if we were to skip the shim vectormath library and load libsleef directly from hotspot, what would happen then? >> >> Thirdly, the check with UseSVE happens _after_ you load the library. > > At best that's a waste of time during startup. > 2. depends on external sleef at run time only, like @theRealAph suggested. Cons: requires sleef installed in system at runtime > 3. depends on external sleef at build time only, like @magicus suggested. Cons: requires sleef installed in system at build time It seem like everyone wants sleef as a "backend". As we have hard time, at least now, integrating sleef, and doing both or one above, doesn't interfere with such integration. And it doesn't hurt anyone if Hamlin is willing to put in the effort. I'd say go head with something like that. ------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-2025721135 From mdoerr at openjdk.org Thu Mar 28 17:36:31 2024 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 28 Mar 2024 17:36:31 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc In-Reply-To: References: Message-ID: On Thu, 28 Mar 2024 16:53:39 GMT, Joachim Kern wrote: >> As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. Thus the old xlc toolchain was removed by [JDK-8327701](https://bugs.openjdk.org/browse/JDK-8327701). >> Now we also switch the HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc, removing the last xlc rudiment. 
>> This means merging the AIX specific content of utilities/globalDefinitions_xlc.hpp and utilities/compilerWarnings_xlc.hpp into the corresponding gcc files on the on side and removing the defined(TARGET_COMPILER_xlc) blocks in the code, because the defined(TARGET_COMPILER_gcc) blocks work out of the box for the new AIX compiler. >> The rest of the changes are needed because of using utilities/compilerWarnings_xlc.hpp the compiler is much more nagging about ill formatted printf > > src/hotspot/share/utilities/globalDefinitions_gcc.hpp line 83: > >> 81: #error "xlc version not supported, macro __open_xl_version__ not found" >> 82: #endif >> 83: #endif // AIX > > This `#ifdef _AIX` might be obsolete, because configure will throw a warning if the compiler has a lower version, but it's only a warning. I'd prefer having less AIX specific parts in this file. Can this be moved somewhere else? Or maybe combine it with the AIX code above? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1543371129 From psandoz at openjdk.org Thu Mar 28 18:43:36 2024 From: psandoz at openjdk.org (Paul Sandoz) Date: Thu, 28 Mar 2024 18:43:36 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4] In-Reply-To: <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> References: <1knXD7Wc8heH83BTJEguqmlTAa70oXDw_nWI0hBjAm0=.bcf88238-2b63-4d14-b3a3-94eabca69586@github.com> <77VcfKMeUXxKjoQ3zeEcBJfDQYhPRSt08OXMqg9rQDo=.de753f60-5568-4dec-9b86-2ba86acebfaa@github.com> Message-ID: On Fri, 15 Mar 2024 13:58:05 GMT, Hamlin Li wrote: >> Hi, >> Can you help to review this patch? >> Thanks >> >> This is a continuation of work based on [1] by @XiaohongGong, most work was done in that pr. 
In this new pr, just rebased the code in [1], then added some minor changes (renaming of calling method, add libsleef as extra lib in CI cross-build on aarch64 in github workflow); I aslo tested the combination of following scenarios: >> * at build time >> * with/without sleef >> * with/without sve support >> * at runtime >> * with/without sleef >> * with/without sve support >> >> [1] https://github.com/openjdk/jdk/pull/16234 >> >> ## Regression Test >> * test/jdk/jdk/incubator/vector/ >> * test/hotspot/jtreg/compiler/vectorapi/ >> >> ## Performance Test >> Previously, @XiaohongGong has shared the data: https://github.com/openjdk/jdk/pull/16234#issuecomment-1767727028 > > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > fix jni includes Hamlin, thank you for working on this. I think integrating a sub-set of SLEEF is valuable (not all of it makes sense e.g., DFT part). My recommendation would be to focus on a PR that integrates the required source, rather taking steps towards that. AFAICT from browsing prior comments "integrate the source" appears to be the generally preferred solution, but there is some understandable hesitancy about legal aspects. IIUC from what you say this is a technically feasible and maintainable solution. As said here: > We (Oracle Java Platform Group) can handle the required "paperwork https://github.com/openjdk/jdk/pull/16234#issuecomment-1823335443 ------------- PR Comment: https://git.openjdk.org/jdk/pull/18294#issuecomment-2025878452 From cslucas at openjdk.org Thu Mar 28 19:56:34 2024 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Thu, 28 Mar 2024 19:56:34 GMT Subject: RFR: JDK-8316991: Reduce nullable allocation merges [v9] In-Reply-To: References: Message-ID: On Wed, 27 Mar 2024 17:33:29 GMT, Vladimir Ivanov wrote: >> Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 13 commits:
>>
>> - Merge remote-tracking branch 'origin/master' into ram-nullables
>> - Catching up with master
>> - Fix broken build.
>> - Merge with origin/master
>> - Update test/micro/org/openjdk/bench/vm/compiler/AllocationMerges.java
>>
>>   Co-authored-by: Andrey Turbanov
>> - Ammend previous fix & add repro tests.
>> - Fix to prevent reducing already reduced Phi
>> - Fix to prevent creating NULL ConNKlass constants.
>> - Refrain from RAM of arrays and Phis controlled by Loop nodes.
>> - Fix typo in test.
>> - ... and 3 more: https://git.openjdk.org/jdk/compare/89e0889a...3129378f
>
> src/hotspot/share/opto/memnode.cpp line 1552:
>
>> 1550: // by 'split_through_phi'. The first use of this method was in EA code as part
>> 1551: // of simplification of allocation merges.
>> 1552: // Some differences from original method:
>
> What "original method" do you refer to here?

The current method, `can_split_through_phi_base`, is just a helper that contains a copy of the validations performed by `split_through_phi`. The original method is `split_through_phi`.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/15825#discussion_r1543564696

From cslucas at openjdk.org Thu Mar 28 19:59:35 2024
From: cslucas at openjdk.org (Cesar Soares Lucas)
Date: Thu, 28 Mar 2024 19:59:35 GMT
Subject: RFR: JDK-8316991: Reduce nullable allocation merges [v9]
In-Reply-To:
References:
Message-ID:

On Tue, 26 Mar 2024 20:52:09 GMT, Vladimir Ivanov wrote:

>> Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 13 commits:
>>
>> - Merge remote-tracking branch 'origin/master' into ram-nullables
>> - Catching up with master
>> - Fix broken build.
>> - Merge with origin/master
>> - Update test/micro/org/openjdk/bench/vm/compiler/AllocationMerges.java
>>
>>   Co-authored-by: Andrey Turbanov
>> - Ammend previous fix & add repro tests.
>> - Fix to prevent reducing already reduced Phi >> - Fix to prevent creating NULL ConNKlass constants. >> - Refrain from RAM of arrays and Phis controlled by Loop nodes. >> - Fix typo in test. >> - ... and 3 more: https://git.openjdk.org/jdk/compare/89e0889a...3129378f > > src/hotspot/share/code/debugInfo.hpp line 136: > >> 134: Handle _value; >> 135: bool _visited; >> 136: bool _was_scalar_replaced; // Whether this ObjectValue describes an object scalar replaced or just > > Name it `is_scalar_replaced` maybe? Good catch ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15825#discussion_r1543569022 From cslucas at openjdk.org Thu Mar 28 20:03:34 2024 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Thu, 28 Mar 2024 20:03:34 GMT Subject: RFR: JDK-8316991: Reduce nullable allocation merges [v9] In-Reply-To: References: Message-ID: On Wed, 27 Mar 2024 17:55:26 GMT, Vladimir Kozlov wrote: >> src/hotspot/share/opto/escape.cpp line 472: >> >>> 470: >>> 471: // Don't handle arrays. >>> 472: if (alloc->Opcode() != Op_Allocate) { >> >> Turn it into `alloc->Opcode() == Op_AllocateArray` or put `assert(alloc->Opcode() == Op_AllocateArray, "")` inside the branch. > > I suggest to add assert inside branch to catch the case when we add an other Allocate subclass. > > Also `ptn->ideal_node()->is_Allocate()` check at the line 468 is not needed because `as_Allocate()` has such assert check. Thanks. Fixed. Going up in next push. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15825#discussion_r1543572052 From cslucas at openjdk.org Thu Mar 28 20:03:37 2024 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Thu, 28 Mar 2024 20:03:37 GMT Subject: RFR: JDK-8316991: Reduce nullable allocation merges [v9] In-Reply-To: References: Message-ID: On Tue, 26 Mar 2024 20:49:48 GMT, Vladimir Ivanov wrote: >> Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 13 commits: >> >> - Merge remote-tracking branch 'origin/master' into ram-nullables >> - Catching up with master >> - Fix broken build. >> - Merge with origin/master >> - Update test/micro/org/openjdk/bench/vm/compiler/AllocationMerges.java >> >> Co-authored-by: Andrey Turbanov >> - Ammend previous fix & add repro tests. >> - Fix to prevent reducing already reduced Phi >> - Fix to prevent creating NULL ConNKlass constants. >> - Refrain from RAM of arrays and Phis controlled by Loop nodes. >> - Fix typo in test. >> - ... and 3 more: https://git.openjdk.org/jdk/compare/89e0889a...3129378f > > src/hotspot/share/opto/escape.cpp line 2976: > >> 2974: // >> 2975: if (field->base_count() > 1 && candidates.size() == 0) { >> 2976: bool further_validate = false; > > A better name maybe? You can also extract its computation into a helper method (e.g. `has_non_reducible_merge(FieldNode* field)`. Makes sense. Going up in next git push. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15825#discussion_r1543572814 From cslucas at openjdk.org Thu Mar 28 20:20:01 2024 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Thu, 28 Mar 2024 20:20:01 GMT Subject: RFR: JDK-8316991: Reduce nullable allocation merges [v10] In-Reply-To: References: Message-ID: > ### Description > > Many, if not most, allocation merges (Phis) are nullable because they join object allocations with "NULL", or objects returned from method calls, etc. Please review this Pull Request that improves Reduce Allocation Merge implementation so that it can reduce at least some of these allocation merges. > > Overall, the improvements are related to 1) making rematerialization of merges able to represent "NULL" objects, and 2) being able to reduce merges used by CmpP/N and CastPP. > > The approach to reducing CmpP/N and CastPP is pretty similar to that used in the `MemNode::split_through_phi` method: a clone of the node being split is added on each input of the Phi. 
I make use of `optimize_ptr_compare` and some type information to remove redundant CmpP and CastPP nodes. I added a bunch of ASCII diagrams illustrating what some of the more important methods are doing.
>
> ### Benchmarking
>
> **Note:** In some of these tests no reduction happens. I left them in to validate that no perf. regression happens in that case.
> **Note 2:** Margin of error was negligible.
>
> | Benchmark                            | No RAM (ms/op)   | Yes RAM (ms/op)   |
> |--------------------------------------|------------------|-------------------|
> | TestTrapAfterMerge                   | 19.515           | 13.386            |
> | TestArgEscape                        | 33.165           | 33.254            |
> | TestCallTwoSide                      | 70.547           | 69.427            |
> | TestCmpAfterMerge                    | 16.400           | 2.984             |
> | TestCmpMergeWithNull_Second          | 27.204           | 27.293            |
> | TestCmpMergeWithNull                 | 8.248            | 4.920             |
> | TestCondAfterMergeWithAllocate       | 12.890           | 5.252             |
> | TestCondAfterMergeWithNull           | 6.265            | 5.078             |
> | TestCondLoadAfterMerge               | 12.713           | 5.163             |
> | TestConsecutiveSimpleMerge           | 30.863           | 4.068             |
> | TestDoubleIfElseMerge                | 16.069           | 2.444             |
> | TestEscapeInCallAfterMerge           | 23.111           | 22.924            |
> | TestGlobalEscape                     | 14.459           | 14.425            |
> | TestIfElseInLoop                     | 246.061          | 42.786            |
> | TestLoadAfterLoopAlias               | 45.808           | 45.812            |
> | TestLoadAfterTrap                    | 28.370           | ...

Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision:

  Addressing Ivanov's PR feedback.
------------- Changes: - all: https://git.openjdk.org/jdk/pull/15825/files - new: https://git.openjdk.org/jdk/pull/15825/files/3129378f..455bae9d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15825&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15825&range=08-09 Stats: 42 lines in 6 files changed: 16 ins; 13 del; 13 mod Patch: https://git.openjdk.org/jdk/pull/15825.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15825/head:pull/15825 PR: https://git.openjdk.org/jdk/pull/15825 From vlivanov at openjdk.org Thu Mar 28 20:26:35 2024 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Thu, 28 Mar 2024 20:26:35 GMT Subject: RFR: JDK-8316991: Reduce nullable allocation merges [v10] In-Reply-To: References: Message-ID: On Thu, 28 Mar 2024 20:20:01 GMT, Cesar Soares Lucas wrote: >> ### Description >> >> Many, if not most, allocation merges (Phis) are nullable because they join object allocations with "NULL", or objects returned from method calls, etc. Please review this Pull Request that improves Reduce Allocation Merge implementation so that it can reduce at least some of these allocation merges. >> >> Overall, the improvements are related to 1) making rematerialization of merges able to represent "NULL" objects, and 2) being able to reduce merges used by CmpP/N and CastPP. >> >> The approach to reducing CmpP/N and CastPP is pretty similar to that used in the `MemNode::split_through_phi` method: a clone of the node being split is added on each input of the Phi. I make use of `optimize_ptr_compare` and some type information to remove redundant CmpP and CastPP nodes. I added a bunch of ASCII diagrams illustrating what some of the more important methods are doing. >> >> ### Benchmarking >> >> **Note:** In some of these tests no reduction happens. I left them in to validate that no perf. regression happens in that case. >> **Note 2:** Marging of error was negligible. 
>>
>> | Benchmark                            | No RAM (ms/op)   | Yes RAM (ms/op)   |
>> |--------------------------------------|------------------|-------------------|
>> | TestTrapAfterMerge                   | 19.515           | 13.386            |
>> | TestArgEscape                        | 33.165           | 33.254            |
>> | TestCallTwoSide                      | 70.547           | 69.427            |
>> | TestCmpAfterMerge                    | 16.400           | 2.984             |
>> | TestCmpMergeWithNull_Second          | 27.204           | 27.293            |
>> | TestCmpMergeWithNull                 | 8.248            | 4.920             |
>> | TestCondAfterMergeWithAllocate       | 12.890           | 5.252             |
>> | TestCondAfterMergeWithNull           | 6.265            | 5.078             |
>> | TestCondLoadAfterMerge               | 12.713           | 5.163             |
>> | TestConsecutiveSimpleMerge           | 30.863           | 4.068             |
>> | TestDoubleIfElseMerge                | 16.069           | 2.444             |
>> | TestEscapeInCallAfterMerge           | 23.111           | 22.924            |
>> | TestGlobalEscape                     | 14.459           | 14.425            |
>> | TestIfElseInLoop                     | 246.061          | 42.786            |
>> | TestLoadAfterLoopAlias               | 45.808           | 45.812            |
>> ...
>
> Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision:
>
>   Addressing Ivanov's PR feedback.

Marked as reviewed by vlivanov (Reviewer).

-------------

PR Review: https://git.openjdk.org/jdk/pull/15825#pullrequestreview-1967207503

From amenkov at openjdk.org Thu Mar 28 22:18:38 2024
From: amenkov at openjdk.org (Alex Menkov)
Date: Thu, 28 Mar 2024 22:18:38 GMT
Subject: RFR: JDK-8328137: PreserveAllAnnotations can cause failure of class retransformation
Message-ID:

The PreserveAllAnnotations option causes the class file parser to preserve RuntimeInvisibleAnnotations, so the VM treats them as RuntimeVisibleAnnotations.
For class retransformation, JvmtiClassFileReconstituter restores all annotations as RuntimeVisibleAnnotations attributes.
This can cause a problem if the class contains only RuntimeInvisibleAnnotations, because the corresponding RuntimeVisibleAnnotations attribute name is then not present in the class constant pool.
Correct solution would be to store additional information about RuntimeInvisibleAnnotations and restore them exactly as they were in the original class (this should be done for all annotations: RuntimeInvisibleAnnotations/RuntimeInvisibleTypeAnnotations for class, fields and records, RuntimeInvisibleAnnotations/RuntimeInvisibleTypeAnnotations/RuntimeInvisibleParameterAnnotations for methods; need to ensure the information is correctly updated during class redefinition & retransformation). I think it doesn't make sense to add all the complexity for almost no value (I doubt anyone uses PreserveAllAnnotations, the flag looks like experimental, we don't have any tests for it). The suggested fix adds workaround for this corner case - if "visible" attribute name is not in the CP, the annotations are restored with "invisible" attribute name. Testing: - tier1,tier2,hs-tier5-svc - all java/lang/instrument tests; - all RedefineClasses/RetransformClasses tests ------------- Commit messages: - fix Changes: https://git.openjdk.org/jdk/pull/18540/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18540&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8328137 Stats: 33 lines in 3 files changed: 22 ins; 0 del; 11 mod Patch: https://git.openjdk.org/jdk/pull/18540.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18540/head:pull/18540 PR: https://git.openjdk.org/jdk/pull/18540 From kvn at openjdk.org Thu Mar 28 23:16:33 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 28 Mar 2024 23:16:33 GMT Subject: RFR: JDK-8316991: Reduce nullable allocation merges [v10] In-Reply-To: References: Message-ID: On Thu, 28 Mar 2024 20:20:01 GMT, Cesar Soares Lucas wrote: >> ### Description >> >> Many, if not most, allocation merges (Phis) are nullable because they join object allocations with "NULL", or objects returned from method calls, etc. 
Please review this Pull Request that improves Reduce Allocation Merge implementation so that it can reduce at least some of these allocation merges. >> >> Overall, the improvements are related to 1) making rematerialization of merges able to represent "NULL" objects, and 2) being able to reduce merges used by CmpP/N and CastPP. >> >> The approach to reducing CmpP/N and CastPP is pretty similar to that used in the `MemNode::split_through_phi` method: a clone of the node being split is added on each input of the Phi. I make use of `optimize_ptr_compare` and some type information to remove redundant CmpP and CastPP nodes. I added a bunch of ASCII diagrams illustrating what some of the more important methods are doing. >> >> ### Benchmarking >> >> **Note:** In some of these tests no reduction happens. I left them in to validate that no perf. regression happens in that case. >> **Note 2:** Marging of error was negligible. >> >> | Benchmark | No RAM (ms/op) | Yes RAM (ms/op) | >> |--------------------------------------|------------------|-------------------| >> | TestTrapAfterMerge | 19.515 | 13.386 | >> | TestArgEscape | 33.165 | 33.254 | >> | TestCallTwoSide | 70.547 | 69.427 | >> | TestCmpAfterMerge | 16.400 | 2.984 | >> | TestCmpMergeWithNull_Second | 27.204 | 27.293 | >> | TestCmpMergeWithNull | 8.248 | 4.920 | >> | TestCondAfterMergeWithAllocate | 12.890 | 5.252 | >> | TestCondAfterMergeWithNull | 6.265 | 5.078 | >> | TestCondLoadAfterMerge | 12.713 | 5.163 | >> | TestConsecutiveSimpleMerge | 30.863 | 4.068 | >> | TestDoubleIfElseMerge | 16.069 | 2.444 | >> | TestEscapeInCallAfterMerge | 23.111 | 22.924 | >> | TestGlobalEscape | 14.459 | 14.425 | >> | TestIfElseInLoop | 246.061 | 42.786 | >> | TestLoadAfterLoopAlias | 45.808 | 45.812 | >> ... > > Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: > > Addressing Ivanov's PR feedback. Update looks good. ------------- Marked as reviewed by kvn (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/15825#pullrequestreview-1967522840 From jwaters at openjdk.org Fri Mar 29 05:27:31 2024 From: jwaters at openjdk.org (Julian Waters) Date: Fri, 29 Mar 2024 05:27:31 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc In-Reply-To: References: Message-ID: On Thu, 28 Mar 2024 16:50:20 GMT, Joachim Kern wrote: > As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. Thus the old xlc toolchain was removed by [JDK-8327701](https://bugs.openjdk.org/browse/JDK-8327701). > Now we also switch the HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc, removing the last xlc rudiment. > This means merging the AIX specific content of utilities/globalDefinitions_xlc.hpp and utilities/compilerWarnings_xlc.hpp into the corresponding gcc files on the on side and removing the defined(TARGET_COMPILER_xlc) blocks in the code, because the defined(TARGET_COMPILER_gcc) blocks work out of the box for the new AIX compiler. > The rest of the changes are needed because of using utilities/compilerWarnings_xlc.hpp the compiler is much more nagging about ill formatted printf That one singular build change looks good :) > The rest of the changes are needed because of using utilities/compilerWarnings_xlc.hpp the compiler is much more nagging about ill formatted printf Did you mean compilerWarnings_gcc.hpp? ------------- Marked as reviewed by jwaters (Committer). 
PR Review: https://git.openjdk.org/jdk/pull/18536#pullrequestreview-1967860264
PR Comment: https://git.openjdk.org/jdk/pull/18536#issuecomment-2026674128

From sspitsyn at openjdk.org Fri Mar 29 07:11:31 2024
From: sspitsyn at openjdk.org (Serguei Spitsyn)
Date: Fri, 29 Mar 2024 07:11:31 GMT
Subject: RFR: JDK-8328137: PreserveAllAnnotations can cause failure of class retransformation
In-Reply-To:
References:
Message-ID: <96J31hsh_reOYgnV_R5vmM1ID17_YgQBuBLIYkmXWTA=.5e357beb-2d49-4719-aa03-4dfa577e0054@github.com>

On Thu, 28 Mar 2024 22:12:49 GMT, Alex Menkov wrote:

> Correct solution would be to store additional information about RuntimeInvisibleAnnotations and restore them exactly as they were in the original class (this should be done for all annotations: RuntimeInvisibleAnnotations/RuntimeInvisibleTypeAnnotations for class, fields and records, RuntimeInvisibleAnnotations/RuntimeInvisibleTypeAnnotations/RuntimeInvisibleParameterAnnotations for methods; need to ensure the information is correctly updated during class redefinition & retransformation).
>
> I think it doesn't make sense to add all the complexity for almost no value (I doubt anyone uses PreserveAllAnnotations, the flag looks like experimental, we don't have any tests for it).

This solution looks okay in general but is still a little bit confusing.
It feels like saving just one bit would not add much complexity but would even simplify the implementation, as it would become straightforward.
At least, there would be no need for this additional function with the `fallback_attr_name` parameter.
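
For readers following the thread, here is a minimal sketch of the fallback idea being discussed: write the annotations back under the "visible" attribute name when that name has a constant pool entry, otherwise under the "invisible" one. This is only an illustration under stated assumptions. `ConstantPoolNames` and `pick_annotation_attr_name` are hypothetical names that do not come from the patch, and the constant pool is modelled as a plain set of strings rather than HotSpot's own data structures.

```cpp
#include <set>
#include <string>

// Hypothetical stand-in for a class constant pool: only the UTF8 names matter here.
using ConstantPoolNames = std::set<std::string>;

// Choose the attribute name under which annotations are written back.
// Prefers `preferred`; falls back to `fallback` when the preferred name has no
// constant pool entry. Returns an empty string if neither name is present.
std::string pick_annotation_attr_name(const ConstantPoolNames& cp,
                                      const std::string& preferred,
                                      const std::string& fallback) {
  if (cp.count(preferred) != 0) {
    return preferred;
  }
  if (cp.count(fallback) != 0) {
    return fallback;
  }
  return std::string();
}
```

The alternative raised in the comment above, remembering in one extra bit whether the annotations were originally invisible, would make the choice explicit instead of inferring it from the constant pool contents.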
------------- PR Comment: https://git.openjdk.org/jdk/pull/18540#issuecomment-2026785793 From stuefe at openjdk.org Fri Mar 29 08:14:35 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 29 Mar 2024 08:14:35 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc In-Reply-To: References: Message-ID: On Thu, 28 Mar 2024 16:50:20 GMT, Joachim Kern wrote: > As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. Thus the old xlc toolchain was removed by [JDK-8327701](https://bugs.openjdk.org/browse/JDK-8327701). > Now we also switch the HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc, removing the last xlc rudiment. > This means merging the AIX specific content of utilities/globalDefinitions_xlc.hpp and utilities/compilerWarnings_xlc.hpp into the corresponding gcc files on the on side and removing the defined(TARGET_COMPILER_xlc) blocks in the code, because the defined(TARGET_COMPILER_gcc) blocks work out of the box for the new AIX compiler. > The rest of the changes are needed because of using utilities/compilerWarnings_xlc.hpp the compiler is much more nagging about ill formatted printf Hi @JoKern65 , this is a nice simplification! There are many casts that should be unnecessary (casting size_t to fit into SIZE_FORMAT) or that should use `p2i` (all those casts to fit pointers into PTR_FORMAT or INTPTR_FORMAT). I did not mark all of them. We also have a new macro for printing memory ranges, RANGEFMT and RANGEFMTARGS. Maybe some call sites could that instead and make the code shorter and better to read? But I leave that up to you. Cheers, and happy Easter! src/hotspot/os/aix/loadlib_aix.cpp line 120: > 118: (lm->is_in_vm ? 
'*' : ' '),
> 119: (uintptr_t)lm->text, (uintptr_t)lm->text + lm->text_len,
> 120: (uintptr_t)lm->data, (uintptr_t)lm->data + lm->data_len,

Please don't cast, use `p2i()`.

src/hotspot/os/aix/os_aix.cpp line 314:

> 312: ErrnoPreserver ep;
> 313: log_trace(os, map)("disclaim failed: " RANGEFMT " errno=(%s)",
> 314: RANGEFMTARGS(p, (long)maxDisclaimSize),

Wait, why are these casts needed? maxDisclaimSize is size_t, RANGEFMT uses SIZE_FORMAT. That should work without cast.

src/hotspot/os/aix/os_aix.cpp line 651:

> 649: lt.print("Thread is alive (tid: " UINTX_FORMAT ", kernel thread id: " UINTX_FORMAT
> 650: ", stack [" PTR_FORMAT " - " PTR_FORMAT " (" SIZE_FORMAT "k using %luk pages)).",
> 651: os::current_thread_id(), (uintx) kernel_thread_id, (uintptr_t)low_address, (uintptr_t)high_address,

Use p2i, not cast

src/hotspot/os/aix/os_aix.cpp line 1212:

> 1210: st->print_cr("physical free : " SIZE_FORMAT, (unsigned long)mi.real_free);
> 1211: st->print_cr("swap total : " SIZE_FORMAT, (unsigned long)mi.pgsp_total);
> 1212: st->print_cr("swap free : " SIZE_FORMAT, (unsigned long)mi.pgsp_free);

A better way to do this would be to change AIX::meminfo to use size_t. We should have done this when introducing that API.

src/hotspot/os/aix/os_aix.cpp line 1399:

> 1397: os->print("[" PTR_FORMAT " - " PTR_FORMAT "] (" UINTX_FORMAT
> 1398: " bytes, %ld %s pages), %s",
> 1399: (uintptr_t)addr, (uintptr_t)addr + size - 1, size, size / pagesize, describe_pagesize(pagesize),

p2i

src/hotspot/share/utilities/globalDefinitions_gcc.hpp line 62:

> 60: #include
> 61:
> 62: #if defined(LINUX) || defined(_ALLBSD_SOURCE) || defined(_AIX)

What else is left? Could we just remove this line altogether now?
src/hotspot/share/utilities/globalDefinitions_gcc.hpp line 83:

> 81: #error "xlc version not supported, macro __open_xl_version__ not found"
> 82: #endif
> 83: #endif // AIX

Can probably be shortened like this:

Suggestion:

#ifdef _AIX
#if !defined(__open_xl_version__) || (__open_xl_version__ < 17)
#error "this xlc version is not supported"
#endif
#endif // AIX

src/hotspot/share/utilities/globalDefinitions_gcc.hpp line 103:

> 101: #endif
> 102:
> 103: #if !defined(LINUX) && !defined(_ALLBSD_SOURCE) && !defined(_AIX)

I believe this whole section can be removed now. At least I have no idea who this is for. What gcc versions does OpenJDK still support, then, besides these platforms? Also, any gcc platform not on linux or bsd would have hit the #error below at line 132.

src/hotspot/share/utilities/globalDefinitions_gcc.hpp line 128:

> 126: #if defined(__APPLE__)
> 127: inline int g_isnan(double f) { return isnan(f); }
> 128: #elif defined(LINUX) || defined(_ALLBSD_SOURCE) || defined(_AIX)

I think this can be just `#else`.

src/hotspot/share/utilities/globalDefinitions_gcc.hpp line 131:

> 129: inline int g_isnan(float f) { return isnan(f); }
> 130: inline int g_isnan(double f) { return isnan(f); }
> 131: #else

And this can be removed.

-------------

Changes requested by stuefe (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/18536#pullrequestreview-1967997963 PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1544171741 PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1544174303 PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1544176004 PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1544177911 PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1544181163 PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1544187429 PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1544207978 PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1544190654 PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1544191835 PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1544193002 From stuefe at openjdk.org Fri Mar 29 08:14:37 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 29 Mar 2024 08:14:37 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc In-Reply-To: References: Message-ID: On Thu, 28 Mar 2024 16:57:00 GMT, Joachim Kern wrote: >> As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. Thus the old xlc toolchain was removed by [JDK-8327701](https://bugs.openjdk.org/browse/JDK-8327701). >> Now we also switch the HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc, removing the last xlc rudiment. >> This means merging the AIX specific content of utilities/globalDefinitions_xlc.hpp and utilities/compilerWarnings_xlc.hpp into the corresponding gcc files on the on side and removing the defined(TARGET_COMPILER_xlc) blocks in the code, because the defined(TARGET_COMPILER_gcc) blocks work out of the box for the new AIX compiler. 
>> The rest of the changes are needed because of using utilities/compilerWarnings_xlc.hpp the compiler is much more nagging about ill formatted printf > > src/hotspot/share/utilities/globalDefinitions_gcc.hpp line 50: > >> 48: #undef malloc >> 49: extern void *malloc(size_t) asm("vec_malloc"); >> 50: #endif > > This `#if` is not needed if we are building on AIX 7.2 TL5 SP7 or higher. This is the case in our build environment, but I think IBM is still building with SP5, which would run into build errors if I remove this `#if` now. While looking at this, I noticed that my question in https://github.com/openjdk/jdk/pull/14146#discussion_r1207078176 and followups had never been answered. Do you know the answers now? Quoting myself: > So, we do this only for malloc? Not for calloc, posix_memalign, realloc etc? What about free? > Removing that define and hard-coding it here assumes ... pointers it returns work with the unchanged free() and realloc() the system provides, and will always do so. > I am basically worried that undefining malloc, even if it seems harmless now, exposes us to difficult-to-investigate problems down the road, since it depends on how the libc devs will reform those macros in the future. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1544199335 From stuefe at openjdk.org Fri Mar 29 08:14:36 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 29 Mar 2024 08:14:36 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc In-Reply-To: References: Message-ID: <8yq0NeIit-6Q3aB1IF3MqNZnT4B4hWd7LnKYlsX6NPk=.38fd4342-e553-4a12-8e69-ca95b3cccb09@github.com> On Fri, 29 Mar 2024 07:18:47 GMT, Thomas Stuefe wrote: >> As of [JDK-8325880](https://bugs.openjdk.org/browse/JDK-8325880), building the JDK requires version 17 of IBM Open XL C/C++ (xlc). This is in effect clang by another name, and it uses the clang toolchain in the JDK build. 
Thus the old xlc toolchain was removed by [JDK-8327701](https://bugs.openjdk.org/browse/JDK-8327701). >> Now we also switch the HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc, removing the last xlc rudiment. >> This means merging the AIX specific content of utilities/globalDefinitions_xlc.hpp and utilities/compilerWarnings_xlc.hpp into the corresponding gcc files on the one side and removing the defined(TARGET_COMPILER_xlc) blocks in the code, because the defined(TARGET_COMPILER_gcc) blocks work out of the box for the new AIX compiler. >> The rest of the changes are needed because of using utilities/compilerWarnings_xlc.hpp the compiler is much more nagging about ill formatted printf > > src/hotspot/os/aix/loadlib_aix.cpp line 120: > >> 118: (lm->is_in_vm ? '*' : ' '), >> 119: (uintptr_t)lm->text, (uintptr_t)lm->text + lm->text_len, >> 120: (uintptr_t)lm->data, (uintptr_t)lm->data + lm->data_len, > > Please don't cast, use `p2i()`. Check copyrights in this file and all others. Adapt SAP and Oracle copyrights. 
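As an illustration of the "don't cast, use `p2i()`" request, here is a minimal, self-contained sketch of the pattern: a single helper converts a pointer to the integer type the pointer-format macro expects, so call sites carry no ad-hoc casts. The names `PTR_FORMAT_SKETCH`, `p2i_sketch` and `format_range` are hypothetical stand-ins invented for this example; they are not HotSpot's actual definitions, whose exact types and format strings may differ.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical stand-in for a pointer format macro (not HotSpot's PTR_FORMAT).
#define PTR_FORMAT_SKETCH "0x%016" PRIxPTR

// Hypothetical stand-in for a p2i()-style helper: the one place where a
// pointer is turned into the integer type that PTR_FORMAT_SKETCH expects.
inline uintptr_t p2i_sketch(const volatile void* p) {
  return reinterpret_cast<uintptr_t>(p);
}

// Formats an address range using the helper instead of per-argument casts.
std::string format_range(const void* low, const void* high) {
  char buf[64];
  int n = std::snprintf(buf, sizeof(buf),
                        "[" PTR_FORMAT_SKETCH " - " PTR_FORMAT_SKETCH "]",
                        p2i_sketch(low), p2i_sketch(high));
  // The formatted text must fit into the buffer.
  assert(n > 0 && static_cast<std::size_t>(n) < sizeof(buf));
  return std::string(buf, static_cast<std::size_t>(n));
}
```

The point of the design is that the format macro and the conversion helper are defined together, so a platform that changes one changes the other, while a hand-written cast at every call site can silently drift from the format string.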
> src/hotspot/os/aix/os_aix.cpp line 651: > >> 649: lt.print("Thread is alive (tid: " UINTX_FORMAT ", kernel thread id: " UINTX_FORMAT >> 650: ", stack [" PTR_FORMAT " - " PTR_FORMAT " (" SIZE_FORMAT "k using %luk pages)).", >> 651: os::current_thread_id(), (uintx) kernel_thread_id, (uintptr_t)low_address, (uintptr_t)high_address, > > Use p2i, not cast Here, and in other places too where you cast a pointer to fit into PTR_FORMAT or INTPTR_FORMAT ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1544172412 PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1544181011 From stuefe at openjdk.org Fri Mar 29 08:14:37 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 29 Mar 2024 08:14:37 GMT Subject: RFR: JDK-8329257: AIX: Switch HOTSPOT_TOOLCHAIN_TYPE from xlc to gcc In-Reply-To: References: Message-ID: On Fri, 29 Mar 2024 07:56:10 GMT, Thomas Stuefe wrote: >> src/hotspot/share/utilities/globalDefinitions_gcc.hpp line 50: >> >>> 48: #undef malloc >>> 49: extern void *malloc(size_t) asm("vec_malloc"); >>> 50: #endif >> >> This `#if` is not needed if we are building on AIX 7.2 TL5 SP7 or higher. This is the case in our build environment, but I think IBM is still building with SP5, which would run into build errors if I remove this `#if` now. > > While looking at this, I noticed that my question in https://github.com/openjdk/jdk/pull/14146#discussion_r1207078176 and followups had never been answered. Do you know the answers now? > > Quoting myself: > >> So, we do this only for malloc? Not for calloc, posix_memalign, realloc etc? What about free? >> Removing that define and hard-coding it here assumes ... pointers it returns work with the unchanged free() and realloc() the system provides, and will always do so. 
>> I am basically worried that undefining malloc, even if it seems harmless now, exposes us to difficult-to-investigate problems down the road, since it depends on how the libc devs will reform those macros in the future. Other than that, and kind of depending on your answer: How important is it that we catch every use of the original malloc? Can we safely mix the original malloc with vec_malloc if logging is not involved? I am asking because that determines whether this hunk needs to appear right behind `#include ` or whether we can move it into the middle of the file together with the other AIX stuff. If we move it into the middle of the file, we may miss any uses of malloc that happen in system headers (it would be unusual for that to happen, but with IBM one never knows). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18536#discussion_r1544201412 From cslucas at openjdk.org Fri Mar 29 17:33:37 2024 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Fri, 29 Mar 2024 17:33:37 GMT Subject: RFR: JDK-8316991: Reduce nullable allocation merges [v10] In-Reply-To: References: Message-ID: On Thu, 28 Mar 2024 20:20:01 GMT, Cesar Soares Lucas wrote: >> ### Description >> >> Many, if not most, allocation merges (Phis) are nullable because they join object allocations with "NULL", or objects returned from method calls, etc. Please review this Pull Request that improves Reduce Allocation Merge implementation so that it can reduce at least some of these allocation merges. >> >> Overall, the improvements are related to 1) making rematerialization of merges able to represent "NULL" objects, and 2) being able to reduce merges used by CmpP/N and CastPP. >> >> The approach to reducing CmpP/N and CastPP is pretty similar to that used in the `MemNode::split_through_phi` method: a clone of the node being split is added on each input of the Phi. 
I make use of `optimize_ptr_compare` and some type information to remove redundant CmpP and CastPP nodes. I added a bunch of ASCII diagrams illustrating what some of the more important methods are doing.
>>
>> ### Benchmarking
>>
>> **Note:** In some of these tests no reduction happens. I left them in to validate that no perf. regression happens in that case.
>> **Note 2:** Margin of error was negligible.
>>
>> | Benchmark | No RAM (ms/op) | Yes RAM (ms/op) |
>> |--------------------------------|---------|---------|
>> | TestTrapAfterMerge | 19.515 | 13.386 |
>> | TestArgEscape | 33.165 | 33.254 |
>> | TestCallTwoSide | 70.547 | 69.427 |
>> | TestCmpAfterMerge | 16.400 | 2.984 |
>> | TestCmpMergeWithNull_Second | 27.204 | 27.293 |
>> | TestCmpMergeWithNull | 8.248 | 4.920 |
>> | TestCondAfterMergeWithAllocate | 12.890 | 5.252 |
>> | TestCondAfterMergeWithNull | 6.265 | 5.078 |
>> | TestCondLoadAfterMerge | 12.713 | 5.163 |
>> | TestConsecutiveSimpleMerge | 30.863 | 4.068 |
>> | TestDoubleIfElseMerge | 16.069 | 2.444 |
>> | TestEscapeInCallAfterMerge | 23.111 | 22.924 |
>> | TestGlobalEscape | 14.459 | 14.425 |
>> | TestIfElseInLoop | 246.061 | 42.786 |
>> | TestLoadAfterLoopAlias | 45.808 | 45.812 |
>> ...
>
> Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision:
>
> Addressing Ivanov's PR feedback.

Thank you all for reviewing! I'll wait until Monday to "/integrate".
------------- PR Comment: https://git.openjdk.org/jdk/pull/15825#issuecomment-2027526787 From amenkov at openjdk.org Fri Mar 29 19:20:31 2024 From: amenkov at openjdk.org (Alex Menkov) Date: Fri, 29 Mar 2024 19:20:31 GMT Subject: RFR: JDK-8328137: PreserveAllAnnotations can cause failure of class retransformation In-Reply-To: <96J31hsh_reOYgnV_R5vmM1ID17_YgQBuBLIYkmXWTA=.5e357beb-2d49-4719-aa03-4dfa577e0054@github.com> References: <96J31hsh_reOYgnV_R5vmM1ID17_YgQBuBLIYkmXWTA=.5e357beb-2d49-4719-aa03-4dfa577e0054@github.com> Message-ID: On Fri, 29 Mar 2024 07:08:58 GMT, Serguei Spitsyn wrote: > > Correct solution would be to store additional information about RuntimeInvisibleAnnotations and restore them exactly as they were in the original class (this should be done for all annotations: RuntimeInvisibleAnnotations/RuntimeInvisibleTypeAnnotations for class, fields and records, RuntimeInvisibleAnnotations/RuntimeInvisibleTypeAnnotations/RuntimeInvisibleParameterAnnotations for methods; need to ensure the information is correctly updated during class redefinition & retransformation). > > I think it doesn't make sense to add all the complexity for almost no value (I doubt anyone uses PreserveAllAnnotations, the flag looks like experimental, we don't have any tests for it). > > This solution looks okay in general but still is a little bit confusing. It feels like saving just one bit would not add much complexity but would even simplify the implementation as it will become straight-forward. At least, there would be no need in this additional function with the `fallback_attr_name` parameter. It's a bit more complicated than storing 1 bit. Let's consider ClassFileParser::parse_classfile_record_attribute as an example. There are 2 types of annotations for records: RuntimeVisibleAnnotations and RuntimeVisibleTypeAnnotations. 
For each type, if PreserveAllAnnotations is set, ClassFileParser additionally handles invisible annotations (RuntimeInvisibleAnnotations/RuntimeInvisibleTypeAnnotations). Then it "assembles" annotations (concatenates visible and invisible annotations into a single array). To restore the original annotations we need to keep the number of visible annotations in the annotation array (so jvmtiClassReconstituter writes part of the annotation array as visible, and the rest as invisible). And we need to store this number for each annotation type (Annotations and TypeAnnotations for records, Annotations and TypeAnnotations for class, Annotations and TypeAnnotations for fields, Annotations/TypeAnnotations/ParameterAnnotations for methods).

I considered another way to implement the workaround - add method `bool is_symbol_in_cp(const char* symbol)`. But then each writing would become very verbose and more confusing (needs some explanation in each place):

    if (is_symbol_in_cp("RuntimeVisibleTypeAnnotations")) {
      write_annotations_attribute("RuntimeVisibleTypeAnnotations", component->type_annotations());
    } else {
      write_annotations_attribute("RuntimeInvisibleTypeAnnotations", component->type_annotations());
    }

------------- PR Comment: https://git.openjdk.org/jdk/pull/18540#issuecomment-2027647051

From sgibbons at openjdk.org Fri Mar 29 22:39:11 2024 From: sgibbons at openjdk.org (Scott Gibbons) Date: Fri, 29 Mar 2024 22:39:11 GMT Subject: RFR: 8329331: Intrinsify Unsafe::setMemory Message-ID: <5bNiITzJzFEdC6ARozUJBF2NCQaCLdHe_QwKIkcgwfU=.b87cab09-81b8-43f3-bf7a-e2b641881f9c@github.com>

This code makes an intrinsic stub for `Unsafe::setMemory`. See [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around this change. Making this an intrinsic improves the performance of `Unsafe::setMemory` by up to 4x for all buffer sizes. Tested with tier-1 (and full CI). I've added a table of the before and after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`).
[setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt)

-------------

Commit messages:
 - Clean up code for PR
 - Fixed bug - incorrect interface to *_fill_entry
 - Address review comment
 - Restore intrinsic
 - Add benchmark
 - Test removing intrinsic
 - Removed setMemory1; debugged intrinsic code
 - Added actual intrinsic
 - Add Unsafe.setMemory as intrinsic

Changes: https://git.openjdk.org/jdk/pull/18555/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18555&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8329331
Stats: 362 lines in 21 files changed: 311 ins; 49 del; 2 mod
Patch: https://git.openjdk.org/jdk/pull/18555.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/18555/head:pull/18555
PR: https://git.openjdk.org/jdk/pull/18555

From sspitsyn at openjdk.org Fri Mar 29 23:11:31 2024 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 29 Mar 2024 23:11:31 GMT Subject: RFR: JDK-8328137: PreserveAllAnnotations can cause failure of class retransformation In-Reply-To: References: Message-ID:

On Thu, 28 Mar 2024 22:12:49 GMT, Alex Menkov wrote:

> PreserveAllAnnotations option causes class file parser to preserve RuntimeInvisibleAnnotations so VM considers them as RuntimeVisibleAnnotations.
> For class retransformation JvmtiClassFileReconstituter restores all annotations as RuntimeVisibleAnnotations attributes.
> This can cause a problem if the class contains only RuntimeInvisibleAnnotations, so corresponding RuntimeVisibleAnnotations attribute name is not present in the class constant pool.
> > Correct solution would be to store additional information about RuntimeInvisibleAnnotations and restore them exactly as they were in the original class (this should be done for all annotations: RuntimeInvisibleAnnotations/RuntimeInvisibleTypeAnnotations for class, fields and records, RuntimeInvisibleAnnotations/RuntimeInvisibleTypeAnnotations/RuntimeInvisibleParameterAnnotations for methods; need to ensure the information is correctly updated during class redefinition & retransformation). > > I think it doesn't make sense to add all the complexity for almost no value (I doubt anyone uses PreserveAllAnnotations, the flag looks like experimental, we don't have any tests for it). > > The suggested fix adds a workaround for this corner case - if "visible" attribute name is not in the CP, the annotations are restored with "invisible" attribute name. > > Testing: > - tier1,tier2,hs-tier5-svc > - all java/lang/instrument tests; > - all RedefineClasses/RetransformClasses tests Thank you for explaining, Alex. I see the complexity of the "simple" solution. :) My guess is that all workarounds are going to be ugly, so I'd suggest using the simplest one (your initial version). ------------- PR Comment: https://git.openjdk.org/jdk/pull/18540#issuecomment-2027811158 From kvn at openjdk.org Sat Mar 30 01:48:47 2024 From: kvn at openjdk.org (Vladimir Kozlov) Date: Sat, 30 Mar 2024 01:48:47 GMT Subject: RFR: 8329332: Remove CompiledMethod and CodeBlobLayout classes Message-ID: Revert [JDK-8152664](https://bugs.openjdk.org/browse/JDK-8152664) RFE [changes](https://github.com/openjdk/jdk/commit/b853eb7f5ca24eeeda18acbb14287f706499c365) which was used for AOT [JEP 295](https://openjdk.org/jeps/295) implementation in JDK 9. The code was left in HotSpot on the assumption that it would help in the future. But during work on Leyden we decided to not use it. In Leyden cached compiled code will be restored in CodeCache as normal nmethods: no need to change VM's runtime and GC code to process them. 
I may work on optimizing `CodeBlob` and `nmethod` fields layout to reduce header size in separate changes. In these changes I did simple fields reordering to keep small (1 byte) fields together. I do not see (and did not expect) any performance difference with these changes. Tested tier1-5, xcomp, stress. Running performance testing. I need help with testing on platforms which Oracle does not support. ------------- Commit messages: - remove trailing whitespace - 8329332: Remove CompiledMethod and CodeBlobLayout classes Changes: https://git.openjdk.org/jdk/pull/18554/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18554&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8329332 Stats: 3885 lines in 118 files changed: 1281 ins; 1723 del; 881 mod Patch: https://git.openjdk.org/jdk/pull/18554.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18554/head:pull/18554 PR: https://git.openjdk.org/jdk/pull/18554